From 0d81f5100640c7f961fe6d6e79a6b0d801b3289b Mon Sep 17 00:00:00 2001 From: Case Duckworth Date: Tue, 2 Aug 2022 09:25:42 -0500 Subject: Initial commit --- README.html | 203 ++++++++++++++ doc.awk | 213 ++++++++++++++ footer.html | 2 + header.html | 4 + htmlsafe.awk | 8 + mdown.awk | 903 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ style.css | 31 ++ 7 files changed, 1364 insertions(+) create mode 100644 README.html create mode 100755 doc.awk create mode 100644 footer.html create mode 100644 header.html create mode 100755 htmlsafe.awk create mode 100755 mdown.awk create mode 100644 style.css diff --git a/README.html b/README.html new file mode 100644 index 0000000..94c9c2d --- /dev/null +++ b/README.html @@ -0,0 +1,203 @@ + +
A quick-and-dirty literate-programming-style documentation generator +inspired by docco.
+by Case Duckworth acdw@acdw.net
+Source available under the Good Choices License.
+There's a lot of quick-and-dirty "literate programming tools" out there, many +of which were inspired by, and also borrowed from, docco. I was particularly +interested in shocco, written in POSIX shell (of which I am a fan).
+Notably missing, however, was a converter of some kind written in AWK. Thus, +DOC AWK was born.
+This page is the result of DOC AWK working on itself. Not bad for < 250 lines +including commentary! You can pick up the raw source code of doc.awk in its +git repository to use it yourself.
+BEGIN {
+
+All the best awk scripts start with a BEGIN
block. In this one, we
+set a few variables from the environment, with defaults. I use the
+convenience function getenv
, further down this script, to make it
+easier.
First, the comment regex. This regex detects a comment line, not an
+inline comment. By default, it's set up for awk, shell, and other
+languages that use #
as a comment delimiter, but you can make it
+whatever you want.
COMMENT = getenv("DOCAWK_COMMENT", COMMENT, "^[ \t]*#+[ \t]*")
+
+You can set DOCAWK_TEXTPROC
to any text processor you want, but the
+default is the vendored mdown.awk
script in this repo. It's from
+d.awk.
TEXTPROC = getenv("DOCAWK_TEXTPROC", TEXTPROC, "./mdown.awk")
+
+You can also set the processor for code sections of the source file;
+the included htmlsafe.awk
simply escapes <, &, and >.
CODEPROC = getenv("DOCAWK_CODEPROC", CODEPROC, "./htmlsafe.awk")
+
+Usually, a file header and footer are enough for most documents. The +defaults here are the included header.html and footer.html, since the +default output type is html.
+Each of these documents are actually templates, with keys that can
+expand to variables inside of @@VARIABLE@@
. This is mostly
+for title expansion.
HEADER = getenv("DOCAWK_HEADER", HEADER, "./header.html")
+ FOOTER = getenv("DOCAWK_FOOTER", FOOTER, "./footer.html")
+}
+
+Because FILENAME
is unset during BEGIN
, template expansion that attempts
+to view the filename doesn't work. Thus, I need a state variable to track
+whether we've started or not (so that I don't print a header with every new
+file).
! begun {
+
+The template array is initialized with the document's title.
+ TV["TITLE"] = get_title()
+
+Print the header here, since if multiple files are passed to DOC AWK +they'll all be concatenated anyway.
+ file_print(HEADER)
+}
+
+doc.awk
is multi-file aware. It also removes the shebang line from the
+script if it exists, because you probably don't want that in the output.
It wouldn't be a bad idea to make a heuristic for determining the type of +source file we're converting here.
+FNR == 1 {
+ begun = 1
+ if ($0 ~ COMMENT) {
+ lt = "text"
+ } else {
+ lt = "code"
+ }
+ if ($0 !~ /^#!/) {
+ bufadd(lt)
+ }
+ next
+}
+
+The main logic is quite simple: if a given line is a comment as defined by
+DOCAWK_COMMENT
, it's in a text block and should be treated as such;
+otherwise, it's in a code block. Accumulate each part in a dedicated buffer,
+and on a switch-over between code and text, print the buffer and reset.
$0 !~ COMMENT {
+ lt = "code"
+ bufprint("text")
+}
+
+$0 ~ COMMENT {
+ lt = "text"
+ bufprint("code")
+ sub(COMMENT, "", $0)
+}
+
+{
+ bufadd(lt)
+}
+
+Of course, at the end there might be something in either buffer, so print that +out too. I've decided to put text last for the possibility of ending commentary.
+END {
+ bufprint("code")
+ bufprint("text")
+ file_print(FOOTER)
+}
+
+bufadd: Add a STR to buffer TYPE. STR defaults to $0, the input record.
+function bufadd(type, str)
+{
+ buf[type] = buf[type] (str ? str : $0) "\n"
+}
+
+bufprint: Print a buffer of TYPE. Automatically wrap the code blocks in a +little HTML code block. I could maybe have a DOCAWK_CODE_PRE/POST and maybe +even one for text too, to make it more extensible (to other markup languages, +for example).
+function bufprint(type)
+{
+ buf[type] = trim(buf[type])
+ if (buf[type]) {
+ if (type == "code") {
+ printf "<pre><code>"
+ printf(buf[type]) | CODEPROC
+ close(CODEPROC)
+ print "</code></pre>"
+ } else if (type == "text") {
+ print(buf[type]) | TEXTPROC
+ close(TEXTPROC)
+ }
+ buf[type] = ""
+ }
+}
+
+file_print: Print FILE line-by-line. The > 0
check here ensures that it
+bails on error (-1).
function file_print(file)
+{
+ if (file) {
+ while ((getline l < file) > 0) {
+ print template_expand(l)
+ }
+ close(file)
+ }
+}
+
+get_title: get the title of the current script, for the expanded document. +If variables are set, use those; otherwise try to figure out the title from +the document's basename.
+function get_title()
+{
+ title = getenv("DOCAWK_TITLE", TITLE)
+ if (! title) {
+ title = FILENAME
+ sub(/.*\//, "", title)
+ }
+ return title
+}
+
+getenv: a convenience function for pulling values out of the environment.
+If an environment variable ENV isn't found, test if VAR is set (i.e., doc.awk
+-v var=foo
.) and return it if it's set. Otherwise, return the default value
+DEF.
function getenv(env, var, def)
+{
+ if (ENVIRON[env]) {
+ return ENVIRON[env]
+ } else if (var) {
+ return var
+ } else {
+ return def
+ }
+}
+
+template_expand: expand templates of the form @@template@@
in the text.
+Currently it only does variables, and works by line.
Due to the way awk works, template variables need to live in their own special
+array, TV
. I'd love it if awk had some kind of eval
functionality, but at
+least POSIX awk doesn't.
function template_expand(text)
+{
+ if (match(text, /@@[^@]*@@/)) {
+ var = substr(text, RSTART + 2, RLENGTH - 4)
+ new = substr(text, 1, RSTART - 1)
+ new = new TV[var]
+ new = new substr(text, RSTART + RLENGTH)
+ } else {
+ new = text
+ }
+ return new
+}
+
+trim: remove whitespace from either end of a string.
+function trim(str)
+{
+ sub(/^[ \n]*/, "", str)
+ sub(/[ \n]*$/, "", str)
+ return str
+}
+
+
+