author | Case Duckworth | 2022-08-02 09:25:42 -0500 |
---|---|---|
committer | Case Duckworth | 2022-08-02 09:26:59 -0500 |
commit | 0d81f5100640c7f961fe6d6e79a6b0d801b3289b (patch) | |
tree | d5edcc746b7612e8ebebf2ea7407c6eabd53b9f7 /README.html | |
Diffstat (limited to 'README.html')
-rw-r--r-- | README.html | 203 |
1 files changed, 203 insertions, 0 deletions
diff --git a/README.html b/README.html new file mode 100644 index 0000000..94c9c2d --- /dev/null +++ b/README.html | |||
@@ -0,0 +1,203 @@ | |||
1 | <!DOCTYPE html> | ||
2 | <title>doc.awk</title> | ||
3 | <link type="text/css" rel="stylesheet" href="style.css" /> | ||
4 | <body> | ||
5 | <h1 id="doc-awk"><a href="#doc-awk" class="header">DOC AWK <svg width="16" height="16" xmlns="http://www.w3.org/2000/svg"><g transform="rotate(-30, 8, 8)" stroke="#000000" opacity="0.25"><rect fill="none" height="6" width="8" x="2" y="6" rx="1.5"/><rect fill="none" height="6" width="8" x="6" y="4" rx="1.5"/></g></svg></a></h1> | ||
6 | <p>A quick-and-dirty literate-programming-style documentation generator | ||
7 | inspired by <a class="normal" href="https://ashkenas.com/docco/" title="">docco</a>.</p> | ||
8 | <p>by Case Duckworth <a class="normal" href="mailto:acdw@acdw.net">acdw@acdw.net</a></p> | ||
9 | <p>Source available under the <a class="normal" href="https://acdw.casa/gcl" title="">Good Choices License</a>.</p> | ||
10 | <p>There's a lot of quick-and-dirty "literate programming tools" out there, many | ||
11 | of which were inspired by, and also borrowed from, docco. I was particularly | ||
12 | interested in <a class="normal" href="https://rtomayko.github.io/shocco/" title="">shocco</a>, written in POSIX shell (of which I am a fan).</p> | ||
13 | <p>Notably missing, however, was a converter of some kind written in AWK. Thus, | ||
14 | DOC AWK was born.</p> | ||
15 | <p>This page is the result of DOC AWK working on itself. Not bad for &lt; 250 lines | ||
16 | including commentary! You can pick up the raw source code of doc.awk <a class="normal" href="https://git.acdw.net/doc.awk" title="">in its | ||
17 | git repository</a> to use it yourself.</p> | ||
18 | <h2 id="code"><a href="#code" class="header">Code <svg width="16" height="16" xmlns="http://www.w3.org/2000/svg"><g transform="rotate(-30, 8, 8)" stroke="#000000" opacity="0.25"><rect fill="none" height="6" width="8" x="2" y="6" rx="1.5"/><rect fill="none" height="6" width="8" x="6" y="4" rx="1.5"/></g></svg></a></h2> | ||
19 | <pre><code>BEGIN { | ||
20 | </code></pre> | ||
21 | <p>All the best awk scripts start with a <code>BEGIN</code> block. In this one, we | ||
22 | set a few variables from the environment, with defaults. I use the | ||
23 | convenience function <code>getenv</code>, further down this script, to make it | ||
24 | easier.</p> | ||
25 | <p>First, the comment regex. This regex detects a comment <em>line</em>, not an | ||
26 | inline comment. By default, it's set up for awk, shell, and other | ||
27 | languages that use <code>#</code> as a comment delimiter, but you can make it | ||
28 | whatever you want.</p> | ||
29 | <pre><code> COMMENT = getenv("DOCAWK_COMMENT", COMMENT, "^[ \t]*#+[ \t]*") | ||
30 | </code></pre> | ||
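<p>To see what "a comment <em>line</em>, not an inline comment" means in practice, here is a minimal sketch that labels input lines with a given regex. The <code>classify</code> helper and the <code>//</code> override are illustrative assumptions, not part of doc.awk.</p>

```shell
# Hypothetical helper: label each input line the way a COMMENT regex would.
# $1 is the comment-line regex; awk's -v handling turns "\t" into a tab.
classify() {
    awk -v COMMENT="$1" '{
        kind = ($0 ~ COMMENT) ? "text" : "code"
        print kind ": " $0
    }'
}

# The default regex: only whole-line "#" comments count as text.
printf '%s\n' '# A comment line' 'x = 1  # inline comment' '  ## indented' |
    classify '^[ \t]*#+[ \t]*'

# A possible override for "//"-style languages (an assumption, not a shipped default).
printf '%s\n' '// A comment line' 'int x = 1;' |
    classify '^[ \t]*//+[ \t]*'
```

<p>With the default regex, the line carrying an inline comment is still labelled as code, because the regex is anchored to the start of the line.</p>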
31 | <p>You can set <code>DOCAWK_TEXTPROC</code> to any text processor you want, but the | ||
32 | default is the vendored <code>mdown.awk</code> script in this repo. It's from | ||
33 | <a class="normal" href="https://github.com/wernsey/d.awk" title="">d.awk</a>.</p> | ||
34 | <pre><code> TEXTPROC = getenv("DOCAWK_TEXTPROC", TEXTPROC, "./mdown.awk") | ||
35 | </code></pre> | ||
36 | <p>You can also set the processor for code sections of the source file; | ||
37 | the included <code>htmlsafe.awk</code> simply escapes &lt;, &amp;, and &gt;.</p> | ||
38 | <pre><code> CODEPROC = getenv("DOCAWK_CODEPROC", CODEPROC, "./htmlsafe.awk") | ||
39 | </code></pre> | ||
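<p>A code processor only has to read code on standard input and write safe output. The sketch below is a hypothetical stand-in that escapes the same three characters; it is not the contents of the shipped <code>htmlsafe.awk</code>, and the function name is made up for illustration.</p>

```shell
# Hypothetical stand-in for a code processor; not the contents of htmlsafe.awk.
htmlsafe() {
    awk '{
        gsub(/&/, "\\&amp;")   # first, so the entities added below stay intact
        gsub(/</, "\\&lt;")
        gsub(/>/, "\\&gt;")
        print
    }'
}

printf '%s\n' 'if (a < b && b > c) print "ok"' | htmlsafe
```

<p>The doubled backslash matters: in a <code>gsub</code> replacement a bare ampersand stands for the matched text, so it has to be escaped to come out literally.</p>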
40 | <p>Usually, a file header and footer are enough for most documents. The | ||
41 | defaults here are the included header.html and footer.html, since the | ||
42 | default output type is html.</p> | ||
43 | <p>Each of these documents is actually a <em>template</em>: a key written as | ||
44 | <code>@@VARIABLE@@</code> expands to the value of that variable. This is mostly | ||
45 | for title expansion.</p> | ||
46 | <pre><code> HEADER = getenv("DOCAWK_HEADER", HEADER, "./header.html") | ||
47 | FOOTER = getenv("DOCAWK_FOOTER", FOOTER, "./footer.html") | ||
48 | } | ||
49 | </code></pre> | ||
50 | <p>Because <code>FILENAME</code> is unset during <code>BEGIN</code>, a template that expands to | ||
51 | the filename can't be processed there. Thus, I need a state variable to track | ||
52 | whether we've started or not (so that I don't print a header with every new | ||
53 | file).</p> | ||
54 | <pre><code>! begun { | ||
55 | </code></pre> | ||
56 | <p>The template array is initialized with the document's title.</p> | ||
57 | <pre><code> TV["TITLE"] = get_title() | ||
58 | </code></pre> | ||
59 | <p>Print the header here, since if multiple files are passed to DOC AWK | ||
60 | they'll all be concatenated anyway.</p> | ||
61 | <pre><code> file_print(HEADER) | ||
62 | } | ||
63 | </code></pre> | ||
64 | <p><code>doc.awk</code> is multi-file aware. It also removes the shebang line from the | ||
65 | script if it exists, because you probably don't want that in the output.</p> | ||
66 | <p>It wouldn't be a <em>bad</em> idea to make a heuristic for determining the type of | ||
67 | source file we're converting here.</p> | ||
68 | <pre><code>FNR == 1 { | ||
69 | begun = 1 | ||
70 | if ($0 ~ COMMENT) { | ||
71 | lt = "text" | ||
72 | } else { | ||
73 | lt = "code" | ||
74 | } | ||
75 | if ($0 !~ /^#!/) { | ||
76 | bufadd(lt) | ||
77 | } | ||
78 | next | ||
79 | } | ||
80 | </code></pre> | ||
81 | <p>The main logic is quite simple: if a given line is a comment as defined by | ||
82 | <code>DOCAWK_COMMENT</code>, it's in a text block and should be treated as such; | ||
83 | otherwise, it's in a code block. Accumulate each part in a dedicated buffer, | ||
84 | and on a switch-over between code and text, print the buffer and reset.</p> | ||
85 | <pre><code>$0 !~ COMMENT { | ||
86 | lt = "code" | ||
87 | bufprint("text") | ||
88 | } | ||
89 | |||
90 | $0 ~ COMMENT { | ||
91 | lt = "text" | ||
92 | bufprint("code") | ||
93 | sub(COMMENT, "", $0) | ||
94 | } | ||
95 | |||
96 | { | ||
97 | bufadd(lt) | ||
98 | } | ||
99 | </code></pre> | ||
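<p>The accumulate-and-flush logic above can be tried in isolation. This is a simplified sketch, assuming the default comment regex: it prints a <code>[text]</code> or <code>[code]</code> marker instead of piping each buffer through a processor, and the names <code>split_blocks</code> and <code>emit</code> are hypothetical.</p>

```shell
# Hypothetical helper: split standard input into alternating text/code blocks.
split_blocks() {
    awk -v COMMENT='^[ \t]*#+[ \t]*' '
    # Print a non-empty buffer under a marker line, then reset it.
    function emit(type) {
        if (buf[type] != "") {
            printf "[%s]\n%s", type, buf[type]
            buf[type] = ""
        }
    }
    $0 !~ COMMENT { lt = "code"; emit("text") }
    $0 ~ COMMENT  { lt = "text"; emit("code"); sub(COMMENT, "") }
    { buf[lt] = buf[lt] $0 "\n" }
    END { emit("code"); emit("text") }
    '
}

printf '%s\n' '# Greet the user.' 'print "hi"' '# Then quit.' 'exit' | split_blocks
```

<p>Each switch between comment and non-comment lines flushes the other buffer, so the output alternates between the two markers in source order.</p>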
100 | <p>Of course, at the end there might be something in either buffer, so print that | ||
101 | out too. I've decided to put text last for the possibility of ending commentary.</p> | ||
102 | <pre><code>END { | ||
103 | bufprint("code") | ||
104 | bufprint("text") | ||
105 | file_print(FOOTER) | ||
106 | } | ||
107 | </code></pre> | ||
108 | <h2 id="functions"><a href="#functions" class="header">Functions <svg width="16" height="16" xmlns="http://www.w3.org/2000/svg"><g transform="rotate(-30, 8, 8)" stroke="#000000" opacity="0.25"><rect fill="none" height="6" width="8" x="2" y="6" rx="1.5"/><rect fill="none" height="6" width="8" x="6" y="4" rx="1.5"/></g></svg></a></h2> | ||
109 | <p><em>bufadd</em>: Add a STR to buffer TYPE. STR defaults to $0, the input record.</p> | ||
110 | <pre><code>function bufadd(type, str) | ||
111 | { | ||
112 | buf[type] = buf[type] (str ? str : $0) "\n" | ||
113 | } | ||
114 | </code></pre> | ||
115 | <p><em>bufprint</em>: Print a buffer of TYPE. Automatically wrap the code blocks in a | ||
116 | little HTML code block. I could maybe have a DOCAWK_CODE_PRE/POST and maybe | ||
117 | even one for text too, to make it more extensible (to other markup languages, | ||
118 | for example).</p> | ||
119 | <pre><code>function bufprint(type) | ||
120 | { | ||
121 | buf[type] = trim(buf[type]) | ||
122 | if (buf[type]) { | ||
123 | if (type == "code") { | ||
124 | printf "&lt;pre&gt;&lt;code&gt;" | ||
125 | printf("%s", buf[type]) | CODEPROC | ||
126 | close(CODEPROC) | ||
127 | print "&lt;/code&gt;&lt;/pre&gt;" | ||
128 | } else if (type == "text") { | ||
129 | print(buf[type]) | TEXTPROC | ||
130 | close(TEXTPROC) | ||
131 | } | ||
132 | buf[type] = "" | ||
133 | } | ||
134 | } | ||
135 | </code></pre> | ||
136 | <p><em>file_print</em>: Print FILE line-by-line. The <code>&gt; 0</code> check here ensures that the | ||
137 | loop stops at end of file (0) and bails on error (-1).</p> | ||
138 | <pre><code>function file_print(file) | ||
139 | { | ||
140 | if (file) { | ||
141 | while ((getline l &lt; file) &gt; 0) { | ||
142 | print template_expand(l) | ||
143 | } | ||
144 | close(file) | ||
145 | } | ||
146 | } | ||
147 | </code></pre> | ||
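<p>The return values of <code>getline</code> are easy to check with a small experiment. The sketch below uses a hypothetical <code>count_lines</code> helper and a temporary file; neither is part of doc.awk.</p>

```shell
# Hypothetical helper: count the lines awk can read from the file named in $1.
count_lines() {
    awk -v file="$1" 'BEGIN {
        n = 0
        # getline returns 1 per line read, 0 at end of file, -1 on error.
        while ((getline line < file) > 0) {
            n++
        }
        close(file)
        print n
    }'
}

tmp=$(mktemp)
printf '%s\n' 'one' 'two' > "$tmp"
count_lines "$tmp"            # two lines are read
count_lines "$tmp.missing"    # unreadable file: the loop body never runs
rm -f "$tmp"
```

<p>Without the comparison, a return value of -1 would count as true and the loop would never end on an unreadable file.</p>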
148 | <p><em>get_title</em>: get the title of the current script, for the expanded document. | ||
149 | If <code>DOCAWK_TITLE</code> or <code>TITLE</code> is set, use that; otherwise fall back to the | ||
150 | basename of the current input file.</p> | ||
151 | <pre><code>function get_title() | ||
152 | { | ||
153 | title = getenv("DOCAWK_TITLE", TITLE) | ||
154 | if (! title) { | ||
155 | title = FILENAME | ||
156 | sub(/.*\//, "", title) | ||
157 | } | ||
158 | return title | ||
159 | } | ||
160 | </code></pre> | ||
161 | <p><em>getenv</em>: a convenience function for pulling values out of the environment. | ||
162 | If the environment variable ENV isn't set, test whether VAR is set (e.g., with | ||
163 | <code>doc.awk -v var=foo</code>) and return it if so. Otherwise, return the default value | ||
164 | DEF.</p> | ||
165 | <pre><code>function getenv(env, var, def) | ||
166 | { | ||
167 | if (ENVIRON[env]) { | ||
168 | return ENVIRON[env] | ||
169 | } else if (var) { | ||
170 | return var | ||
171 | } else { | ||
172 | return def | ||
173 | } | ||
174 | } | ||
175 | </code></pre> | ||
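<p>The lookup order (environment, then <code>-v</code> variable, then default) can be checked with a short driver. The function body mirrors <code>getenv</code> above; the <code>getenv_demo</code> variable, the <code>BEGIN</code> line, and the "untitled" default are assumptions made for the example.</p>

```shell
# The function body mirrors getenv above; the driver around it is illustrative.
getenv_demo='
function getenv(env, var, def) {
    if (ENVIRON[env]) {
        return ENVIRON[env]
    } else if (var) {
        return var
    } else {
        return def
    }
}
BEGIN { print getenv("DOCAWK_TITLE", TITLE, "untitled") }
'

unset DOCAWK_TITLE
awk "$getenv_demo"                                              # falls back to the default
awk -v TITLE='from -v' "$getenv_demo"                           # uses the -v variable
DOCAWK_TITLE='from env' awk -v TITLE='from -v' "$getenv_demo"   # the environment wins
```

<p>Because the checks are plain truth tests, an empty value counts as unset at each step.</p>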
176 | <p><em>template_expand</em>: expand templates of the form <code>@@template@@</code> in the text. | ||
177 | Currently it only does variables, and works by line.</p> | ||
178 | <p>Due to the way awk works, template variables need to live in their own special | ||
179 | array, <code>TV</code>. I'd love it if awk had some kind of <code>eval</code> functionality, but at | ||
180 | least POSIX awk doesn't.</p> | ||
181 | <pre><code>function template_expand(text) | ||
182 | { | ||
183 | if (match(text, /@@[^@]*@@/)) { | ||
184 | var = substr(text, RSTART + 2, RLENGTH - 4) | ||
185 | new = substr(text, 1, RSTART - 1) | ||
186 | new = new TV[var] | ||
187 | new = new substr(text, RSTART + RLENGTH) | ||
188 | } else { | ||
189 | new = text | ||
190 | } | ||
191 | return new | ||
192 | } | ||
193 | </code></pre> | ||
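<p>Here is a small, self-contained sketch of the expansion at work. The <code>expand_title</code> wrapper is hypothetical, and unlike the listing above it declares <code>var</code> and <code>new</code> as extra parameters so that they stay local to the function.</p>

```shell
# Hypothetical wrapper: expand @@TITLE@@ on standard input, using $1 as the title.
expand_title() {
    awk -v title="$1" '
    # var and new are declared as extra parameters to keep them local.
    function template_expand(text,    var, new) {
        if (match(text, /@@[^@]*@@/)) {
            var = substr(text, RSTART + 2, RLENGTH - 4)
            new = substr(text, 1, RSTART - 1) TV[var] substr(text, RSTART + RLENGTH)
        } else {
            new = text
        }
        return new
    }
    BEGIN { TV["TITLE"] = title }
    { print template_expand($0) }
    '
}

printf '%s\n' '<title>@@TITLE@@</title>' | expand_title 'doc.awk'
printf '%s\n' '@@TITLE@@ by @@AUTHOR@@' | expand_title 'doc.awk'
```

<p>The second call shows the by-line limitation: <code>match</code> finds only the first key, so a second key on the same line is left untouched.</p>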
194 | <p><em>trim</em>: remove whitespace from either end of a string.</p> | ||
195 | <pre><code>function trim(str) | ||
196 | { | ||
197 | sub(/^[ \n]*/, "", str) | ||
198 | sub(/[ \n]*$/, "", str) | ||
199 | return str | ||
200 | } | ||
201 | </code></pre> | ||
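<p>A quick check of what <code>trim</code> does and does not strip. The function body is the one above; the <code>trim_demo</code> variable and the <code>INPUT</code> variable are assumptions made for the example.</p>

```shell
# The function body is trim from above; the BEGIN driver is illustrative.
trim_demo='
function trim(str) {
    sub(/^[ \n]*/, "", str)
    sub(/[ \n]*$/, "", str)
    return str
}
BEGIN { printf "[%s]\n", trim(INPUT) }
'

# awk turns "\n" and "\t" in a -v value into a newline and a tab.
awk -v INPUT='\n  hello world \n\n' "$trim_demo"   # spaces and newlines are stripped
awk -v INPUT='\thello' "$trim_demo"                # a leading tab is kept
```

<p>The bracket expressions list only space and newline, so tabs at either end survive.</p>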
202 | </body> | ||
203 | </html> | ||