author: Case Duckworth, 2022-08-02 09:25:42 -0500
committer: Case Duckworth, 2022-08-02 09:26:59 -0500
commit: 0d81f5100640c7f961fe6d6e79a6b0d801b3289b
tree: d5edcc746b7612e8ebebf2ea7407c6eabd53b9f7 /README.html
Initial commit HEAD main
Diffstat (limited to 'README.html')
-rw-r--r-- README.html 203
1 files changed, 203 insertions, 0 deletions
diff --git a/README.html b/README.html
new file mode 100644
index 0000000..94c9c2d
--- /dev/null
+++ b/README.html
@@ -0,0 +1,203 @@
1<!DOCTYPE html>
2<title>doc.awk</title>
3<link type="text/css" rel="stylesheet" href="style.css" />
4<body>
5<h1 id="doc-awk"><a href="#doc-awk" class="header">DOC AWK&nbsp;<svg width="16" height="16" xmlns="http://www.w3.org/2000/svg"><g transform="rotate(-30, 8, 8)" stroke="#000000" opacity="0.25"><rect fill="none" height="6" width="8" x="2" y="6" rx="1.5"/><rect fill="none" height="6" width="8" x="6" y="4" rx="1.5"/></g></svg></a></h1>
6<p>A quick-and-dirty literate-programming-style documentation generator
7inspired by <a class="normal" href="https://ashkenas.com/docco/" title="">docco</a>.</p>
8<p>by Case Duckworth <a class="normal" href="&#x6D;a&#105;&#108;&#x74;&#x6F;&#58;a&#x63;&#x64;&#x77;&#x40;&#97;&#x63;&#x64;&#x77;&#x2E;&#x6E;&#x65;&#x74;">&#97;&#x63;&#x64;w&#x40;&#97;&#x63;&#100;&#119;&#x2E;&#110;&#101;t</a></p>
9<p>Source available under the <a class="normal" href="https://acdw.casa/gcl" title="">Good Choices License</a>.</p>
10<p>There are a lot of quick-and-dirty "literate programming tools" out there, many
11of which were inspired by, and also borrowed from, docco. I was particularly
12interested in <a class="normal" href="https://rtomayko.github.io/shocco/" title="">shocco</a>, written in POSIX shell (of which I am a fan).</p>
13<p>Notably missing, however, was a converter of some kind written in AWK. Thus,
14DOC AWK was born.</p>
15<p>This page is the result of DOC AWK working on itself. Not bad for &lt; 250 lines
16including commentary! You can pick up the raw source code of doc.awk <a class="normal" href="https://git.acdw.net/doc.awk" title="">in its
17git repository</a> to use it yourself.</p>
18<h2 id="code"><a href="#code" class="header">Code&nbsp;<svg width="16" height="16" xmlns="http://www.w3.org/2000/svg"><g transform="rotate(-30, 8, 8)" stroke="#000000" opacity="0.25"><rect fill="none" height="6" width="8" x="2" y="6" rx="1.5"/><rect fill="none" height="6" width="8" x="6" y="4" rx="1.5"/></g></svg></a></h2>
19<pre><code>BEGIN {
20</code></pre>
21<p>All the best awk scripts start with a <code>BEGIN</code> block. In this one, we
22set a few variables from the environment, with defaults. I use the
23convenience function <code>getenv</code>, further down this script, to make it
24easier.</p>
25<p>First, the comment regex. This regex detects a comment <em>line</em>, not an
26inline comment. By default, it's set up for awk, shell, and other
27languages that use <code>#</code> as a comment delimiter, but you can make it
28whatever you want.</p>
29<pre><code> COMMENT = getenv("DOCAWK_COMMENT", COMMENT, "^[ \t]*#+[ \t]*")
30</code></pre>
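<p>As a rough illustration (not part of doc.awk itself), the default regex can be tried on a few sample lines. The <code>classify_lines</code> helper and its labels are made up for this sketch, which assumes a POSIX <code>awk</code> on the path.</p>

```shell
# Hypothetical helper: label each input line as doc.awk's default
# COMMENT regex would see it ("text" for comment lines, "code" otherwise).
classify_lines() {
    awk -v COMMENT='^[ \t]*#+[ \t]*' '
        $0 ~ COMMENT { print "text: " $0; next }
                     { print "code: " $0 }
    '
}

printf '%s\n' '# a comment line' 'x = 1  # inline comment' '    ## indented' |
    classify_lines
```

<p>Only whole-line comments count as text, so the line with the inline comment is still labeled as code.</p>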
31<p>You can set <code>DOCAWK_TEXTPROC</code> to any text processor you want, but the
32default is the vendored <code>mdown.awk</code> script in this repo. It's from
33<a class="normal" href="https://github.com/wernsey/d.awk" title="">d.awk</a>.</p>
34<pre><code> TEXTPROC = getenv("DOCAWK_TEXTPROC", TEXTPROC, "./mdown.awk")
35</code></pre>
36<p>You can also set the processor for code sections of the source file;
37the included <code>htmlsafe.awk</code> simply escapes &lt;, &amp;, and &gt;.</p>
38<pre><code> CODEPROC = getenv("DOCAWK_CODEPROC", CODEPROC, "./htmlsafe.awk")
39</code></pre>
40<p>Usually, a file header and footer are enough for most documents. The
41defaults here are the included header.html and footer.html, since the
42default output type is html.</p>
43<p>Each of these documents is actually a <em>template</em>: a key written as
44<code>@@VARIABLE@@</code> expands to the value of the matching variable. This is mostly
45for title expansion.</p>
46<pre><code> HEADER = getenv("DOCAWK_HEADER", HEADER, "./header.html")
47 FOOTER = getenv("DOCAWK_FOOTER", FOOTER, "./footer.html")
48}
49</code></pre>
50<p>Because <code>FILENAME</code> is unset during <code>BEGIN</code>, template expansion that attempts
51to view the filename doesn't work. Thus, I need a state variable to track
52whether we've started or not (so that I don't print a header with every new
53file).</p>
54<pre><code>! begun {
55</code></pre>
56<p>The template array is initialized with the document's title.</p>
57<pre><code> TV["TITLE"] = get_title()
58</code></pre>
59<p>Print the header here, since if multiple files are passed to DOC AWK
60they'll all be concatenated anyway.</p>
61<pre><code> file_print(HEADER)
62}
63</code></pre>
64<p><code>doc.awk</code> is multi-file aware. It also removes the shebang line from the
65script if it exists, because you probably don't want that in the output.</p>
66<p>It wouldn't be a <em>bad</em> idea to make a heuristic for determining the type of
67source file we're converting here.</p>
68<pre><code>FNR == 1 {
69 begun = 1
70 if ($0 ~ COMMENT) {
71 lt = "text"
72 } else {
73 lt = "code"
74 }
75 if ($0 !~ /^#!/) {
76 bufadd(lt)
77 }
78 next
79}
80</code></pre>
81<p>The main logic is quite simple: if a given line is a comment as defined by
82<code>DOCAWK_COMMENT</code>, it's in a text block and should be treated as such;
83otherwise, it's in a code block. Accumulate each part in a dedicated buffer,
84and on a switch-over between code and text, print the buffer and reset.</p>
85<pre><code>$0 !~ COMMENT {
86 lt = "code"
87 bufprint("text")
88}
89
90$0 ~ COMMENT {
91 lt = "text"
92 bufprint("code")
93 sub(COMMENT, "", $0)
94}
95
96{
97 bufadd(lt)
98}
99</code></pre>
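<p>The switch-over logic can be sketched in isolation. This is a simplified stand-in, not doc.awk itself: the <code>split_blocks</code> and <code>flush</code> names are hypothetical, bracketed labels replace the <code>TEXTPROC</code> and <code>CODEPROC</code> pipes, and trimming is left out.</p>

```shell
# Hypothetical helper: split input into alternating text and code blocks,
# printing a bracketed label before each block instead of piping it to a
# text or code processor.
split_blocks() {
    awk -v COMMENT='^[ \t]*#+[ \t]*' '
        # Print the buffer of TYPE, if it holds anything, then reset it.
        function flush(type) {
            if (buf[type] != "") {
                printf "[%s]\n%s", type, buf[type]
                buf[type] = ""
            }
        }
        $0 !~ COMMENT { lt = "code"; flush("text") }
        $0 ~ COMMENT  { lt = "text"; flush("code"); sub(COMMENT, "", $0) }
        { buf[lt] = buf[lt] $0 "\n" }
        END { flush("code"); flush("text") }
    '
}

printf '%s\n' '# Say hello.' 'print "hello"' '# Then stop.' 'exit' |
    split_blocks
```

<p>Each block is printed only when the other kind of line shows up (or at the end of input), which is why the comment delimiter can be stripped before the line is buffered.</p>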
100<p>Of course, at the end there might be something in either buffer, so print that
101out too. I've decided to put text last for the possibility of ending commentary.</p>
102<pre><code>END {
103 bufprint("code")
104 bufprint("text")
105 file_print(FOOTER)
106}
107</code></pre>
108<h2 id="functions"><a href="#functions" class="header">Functions&nbsp;<svg width="16" height="16" xmlns="http://www.w3.org/2000/svg"><g transform="rotate(-30, 8, 8)" stroke="#000000" opacity="0.25"><rect fill="none" height="6" width="8" x="2" y="6" rx="1.5"/><rect fill="none" height="6" width="8" x="6" y="4" rx="1.5"/></g></svg></a></h2>
109<p><em>bufadd</em>: Add a STR to buffer TYPE. STR defaults to $0, the input record.</p>
110<pre><code>function bufadd(type, str)
111{
112 buf[type] = buf[type] (str ? str : $0) "\n"
113}
114</code></pre>
115<p><em>bufprint</em>: Print a buffer of TYPE. Automatically wrap the code blocks in a
116little HTML code block. I could maybe have a DOCAWK_CODE_PRE/POST and maybe
117even one for text too, to make it more extensible (to other markup languages,
118for example).</p>
119<pre><code>function bufprint(type)
120{
121 buf[type] = trim(buf[type])
122 if (buf[type]) {
123 if (type == "code") {
124 printf "&lt;pre&gt;&lt;code&gt;"
125 printf("%s", buf[type]) | CODEPROC
126 close(CODEPROC)
127 print "&lt;/code&gt;&lt;/pre&gt;"
128 } else if (type == "text") {
129 print(buf[type]) | TEXTPROC
130 close(TEXTPROC)
131 }
132 buf[type] = ""
133 }
134}
135</code></pre>
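136<p><em>file_print</em>: Print FILE line-by-line. <code>getline</code> returns 0 at end of file and -1 on error, so the <code>&gt; 0</code> check
137ends the loop in both cases instead of spinning forever on an unreadable file.</p>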
138<pre><code>function file_print(file)
139{
140 if (file) {
141 while ((getline l &lt; file) &gt; 0) {
142 print template_expand(l)
143 }
144 close(file)
145 }
146}
147</code></pre>
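<p>A small sketch of the same loop shape, run against a real file and a missing one. The <code>print_file</code> helper, the <code>&gt; </code> prefix, and the temporary file are assumptions made for this example; template expansion is left out.</p>

```shell
# Hypothetical helper: print FILE with a "> " prefix using the same
# getline loop shape as file_print.
print_file() {
    awk -v file="$1" '
        BEGIN {
            # getline returns 1 per line read, 0 at end of file, -1 on error.
            while ((getline line < file) > 0) {
                print "> " line
            }
            close(file)
        }
    '
}

tmp=$(mktemp)
printf 'first\nsecond\n' > "$tmp"
print_file "$tmp"
print_file "$tmp.missing"   # unreadable file: the loop body never runs
rm -f "$tmp"
```

<p>Without the comparison, the -1 returned for the missing file would count as true and the loop would never end.</p>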
148<p><em>get_title</em>: get the title of the current script, for the expanded document.
149If <code>DOCAWK_TITLE</code> or <code>TITLE</code> is set, use it; otherwise derive the title from
150the input file's basename.</p>
151<pre><code>function get_title()
152{
153 title = getenv("DOCAWK_TITLE", TITLE)
154 if (! title) {
155 title = FILENAME
156 sub(/.*\//, "", title)
157 }
158 return title
159}
160</code></pre>
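<p>The basename step can be tried on its own. In this sketch the hypothetical <code>title_for</code> helper takes the path as an argument, standing in for <code>FILENAME</code>.</p>

```shell
# Hypothetical helper: apply get_title's basename step to a path given as
# an argument (doc.awk itself uses FILENAME).
title_for() {
    awk -v path="$1" 'BEGIN {
        title = path
        sub(/.*\//, "", title)   # drop everything up to the last slash
        print title
    }'
}

title_for src/tools/doc.awk
title_for doc.awk
```

<p>Because <code>.*</code> is greedy, the match runs to the last slash; a path without any slash is left unchanged.</p>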
161<p><em>getenv</em>: a convenience function for pulling values out of the environment.
162If the environment variable ENV isn't set or is empty, test whether VAR is set (e.g., via <code>doc.awk
163-v var=foo</code>) and return it if so. Otherwise, return the default value
164DEF.</p>
165<pre><code>function getenv(env, var, def)
166{
167 if (ENVIRON[env]) {
168 return ENVIRON[env]
169 } else if (var) {
170 return var
171 } else {
172 return def
173 }
174}
175</code></pre>
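<p>The lookup order can be checked with a throwaway setting. Everything named here is hypothetical and exists only for the sketch: the <code>resolve_setting</code> helper, the <code>DOCAWK_DEMO</code> environment variable, and the <code>FALLBACK</code> awk variable.</p>

```shell
# Hypothetical helper: resolve one setting the way getenv does.
#   $1 = value placed in the (made-up) environment variable DOCAWK_DEMO
#   $2 = value passed with -v as the (made-up) awk variable FALLBACK
resolve_setting() {
    DOCAWK_DEMO="$1" awk -v FALLBACK="$2" '
        function getenv(env, var, def) {
            if (ENVIRON[env]) {
                return ENVIRON[env]
            } else if (var) {
                return var
            } else {
                return def
            }
        }
        BEGIN { print getenv("DOCAWK_DEMO", FALLBACK, "built-in default") }
    '
}

resolve_setting 'from env' 'from -v'   # environment wins
resolve_setting ''         'from -v'   # falls back to the -v variable
resolve_setting ''         ''          # falls back to the default
```

<p>Since the checks are plain truth tests, an empty value is treated the same as an unset one at each step.</p>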
176<p><em>template_expand</em>: expand templates of the form <code>@@template@@</code> in the text.
177Currently it only expands variables, works line by line, and replaces only the first template on each line.</p>
178<p>Due to the way awk works, template variables need to live in their own special
179array, <code>TV</code>. I'd love it if awk had some kind of <code>eval</code> functionality, but at
180least POSIX awk doesn't.</p>
181<pre><code>function template_expand(text)
182{
183 if (match(text, /@@[^@]*@@/)) {
184 var = substr(text, RSTART + 2, RLENGTH - 4)
185 new = substr(text, 1, RSTART - 1)
186 new = new TV[var]
187 new = new substr(text, RSTART + RLENGTH)
188 } else {
189 new = text
190 }
191 return new
192}
193</code></pre>
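<p>To see the expansion on a sample line, the function can be wrapped in a small filter. The <code>expand_line</code> helper and the sample input are assumptions for this sketch; only <code>TITLE</code> is defined in <code>TV</code>, as in the script above.</p>

```shell
# Hypothetical helper: run each input line through template_expand, with
# TV["TITLE"] set from the first argument.
expand_line() {
    awk -v title="$1" '
        function template_expand(text) {
            if (match(text, /@@[^@]*@@/)) {
                var = substr(text, RSTART + 2, RLENGTH - 4)
                new = substr(text, 1, RSTART - 1)
                new = new TV[var]
                new = new substr(text, RSTART + RLENGTH)
            } else {
                new = text
            }
            return new
        }
        BEGIN { TV["TITLE"] = title }
        { print template_expand($0) }
    '
}

printf '%s\n' '<title>@@TITLE@@</title>' | expand_line 'doc.awk'
printf '%s\n' '@@TITLE@@ by @@AUTHOR@@' | expand_line 'doc.awk'
```

<p>The second line shows the single-match behavior: <code>match</code> finds the leftmost template only, so <code>@@AUTHOR@@</code> passes through untouched.</p>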
194<p><em>trim</em>: remove spaces and newlines from both ends of a string.</p>
195<pre><code>function trim(str)
196{
197 sub(/^[ \n]*/, "", str)
198 sub(/[ \n]*$/, "", str)
199 return str
200}
201</code></pre>
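<p>A quick check of what <code>trim</code> strips, using two made-up sample strings. The <code>trim_demo</code> wrapper is hypothetical; the brackets in the output only mark where each result starts and ends.</p>

```shell
# Hypothetical wrapper: call trim on two sample strings and show the
# results between brackets.
trim_demo() {
    awk '
        function trim(str) {
            sub(/^[ \n]*/, "", str)
            sub(/[ \n]*$/, "", str)
            return str
        }
        BEGIN {
            printf "[%s]\n", trim("\n  x = 1  \n\n")   # spaces and newlines go
            printf "[%s]\n", trim("\tx = 1")           # a leading tab stays
        }
    '
}

trim_demo
```

<p>The character class holds only a space and a newline, so tab indentation at the start of a buffer survives.</p>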
202</body>
203</html>