-rw-r--r-- | README.txt | 176 | ||||
-rwxr-xr-x | ht.awk | 385 | ||||
-rw-r--r-- | ht.conf | 24 | ||||
-rwxr-xr-x | ht.sh | 62 | ||||
-rw-r--r-- | test.ht | 45 | ||||
-rw-r--r-- | test.txt | 24 |
6 files changed, 298 insertions, 418 deletions
diff --git a/README.txt b/README.txt deleted file mode 100644 index c409335..0000000 --- a/README.txt +++ /dev/null | |||
@@ -1,176 +0,0 @@ | |||
1 | # HAT TRICK | ||
2 | |||
3 | HAT TRICK is both a lightweight markup language inspired by gemtext and html, | ||
4 | and this awk program to convert the markup language to gemtext, html, and | ||
5 | gophermap markup. It uses a mixture of "block"-level and line-level sigils to | ||
6 | extend the pure line-based markup of gemtext, while removing some of the more | ||
7 | annoying points (to my mind) of writing pure html---i.e., repetitive tags and | ||
8 | other boilerplate. | ||
9 | |||
10 | ## Syntax | ||
11 | |||
12 | ### Blocks | ||
13 | |||
14 | In HAT TRICK, a block of text separated by a blank line is a type of "block." The | ||
15 | default block is a paragraph, or <p> tag in html. (In gemini and gophermaps, no | ||
16 | extra tags are added.) Other blocks defined by the syntax are as follows: | ||
17 | |||
18 | >>> | ||
19 | # HEADING 1 | ||
20 | ## HEADING 2 | ||
21 | ### HEADING 3 | ||
22 | <<< | ||
23 | |||
24 | Correspond to <hx> in html; passed through unmodified to gemtext and gophermaps. | ||
25 | |||
26 | >>> | ||
27 | > BLOCK QUOTE | ||
28 | <<< | ||
29 | |||
30 | Corresponds to <blockquote>; passed through unmodified to gemtext and | ||
31 | gophermaps. | ||
32 | |||
33 | >>> | ||
34 | - UNORDERED LIST ITEM | ||
35 | 1. ORDERED LIST ITEM | ||
36 | <<< | ||
37 | |||
38 | The first list item in a block automatically opens the necessary list tag in | ||
39 | html. In gemtext, the "-" is converted to "*" (which signifies a list item); | ||
40 | the hyphen is passed through in gophermaps, because I think it's better syntax | ||
41 | personally. | ||
42 | |||
43 | >>> | ||
44 | --- SECTION BREAK | ||
45 | <<< | ||
46 | |||
47 | A visual indication to break sections. Corresponds to <hr> in html | ||
48 | (TODO: consider html5 <section> tags --- this would take more logic.) | ||
49 | |||
50 | HAT TRICK reflows blocks, which means that only the first line of a block | ||
51 | needs to start with the sigils outlined above. However, each line of the block | ||
52 | can begin with the sigil character for easier reading. | ||
53 | |||
54 | ### Lines | ||
55 | |||
56 | Within blocks, there are certain other sigils that apply only to the line they | ||
57 | prepend. They include the following: | ||
58 | |||
59 | >>> | ||
60 | => LINK | ||
61 | <<< | ||
62 | |||
63 | Links are probably the most important element in any hypertext language---since | ||
64 | without them, it's hardly hypertext. HAT TRICK borrows its link syntax from | ||
65 | gemtext: the line starts with a "=>", the next field is the link's URL, and the | ||
66 | rest of the line is the link's display text. | ||
67 | |||
68 | >>> | ||
69 | <TAG> HTML TAG | ||
70 | <<< | ||
71 | |||
72 | Lines beginning with an html tag are passed on to html verbatim. The closing | ||
73 | tag is automatically appended to the end of the line, before any ending | ||
74 | punctuation. I've found that 99 times out of 100, I don't want formatting to | ||
75 | include the ending punctuation. | ||
76 | |||
77 | A backslash (\) at the end of the tag line will prevent the tag from being | ||
78 | ended, which is useful for tag-included punctuation as well as nesting tags. | ||
79 | However, the tag is never closed, so you'll have to close it yourself on the | ||
80 | next line. In addition, text that isn't in a tag is html-escaped, so for the | ||
81 | markup to properly apply, you'll need to write something like this: | ||
82 | |||
83 | >>> | ||
84 | <b>She sells \ | ||
85 | <i>sea shells \ | ||
86 | </i> | ||
87 | on the sea shore.\ | ||
88 | </b> | ||
89 | <<< | ||
90 | |||
91 | So while this markup is possible, it's discouraged through the awkwardness of | ||
92 | the syntax. | ||
93 | |||
94 | To translate these tags to meaningful markup in gemtext and gophermaps, a lookup | ||
95 | table is used to correspond the tags to opening and closing characters around | ||
96 | the line's text. This correspondence can be defined with the environment | ||
97 | variable HT_TAGCHARS or HAT TRICK's second positional argument (See INVOCATION, | ||
98 | below). | ||
99 | |||
100 | >>> | ||
101 | ; COMMENT | ||
102 | <<< | ||
103 | |||
104 | Comments in HAT TRICK aren't passed on to the output text---even in html, which | ||
105 | has a comment syntax. Instead, comments are passed, including the prepending | ||
106 | semicolon, to standard error for further processing. | ||
107 | |||
108 | ### Verbatim blocks | ||
109 | |||
110 | Finally, there is a special type of block for passing raw text through to the | ||
111 | next phase of processing: the verbatim block. | ||
112 | |||
113 | >>> | ||
114 | >>> [OUTPUTS] | ||
115 | VERBATIM TEXT | ||
116 | <<< | ||
117 | <<< | ||
118 | |||
119 | In html, the verbatim text is wrapped in <pre><code> tags; in gemtext, it's | ||
120 | wrapped in gemtext's own verbatim text markers ```; and the text is unwrapped in | ||
121 | gophermaps for a cleaner look. | ||
122 | |||
123 | The OUTPUTS can be any output specifier HAT TRICK accepts; see INVOCATION below | ||
124 | for details. If OUTPUTS is present, the verbatim text will only be passed | ||
125 | through in the output formats listed; with no OUTPUTS listed it will output to | ||
126 | all formats. | ||
127 | |||
128 | ## "Escaping" line- and block-types | ||
129 | |||
130 | Each of the types listed above is anchored at the beginning of the line. | ||
131 | Therefore, a simple "escaping" mechanism is available for free: simply prepend a | ||
132 | space to any line you don't want processed as a line or block and you'll be | ||
133 | gravy. Astute readers will notice I did just that above, to describe the syntax | ||
134 | for verbatim fencing. | ||
135 | |||
136 | ## Invocation | ||
137 | |||
138 | An invocation of HAT TRICK will look something like this: | ||
139 | |||
140 | >>> | ||
141 | ./ht.awk [HT_FORMATS] [HT_TAGCHARS] < INPUT | ||
142 | <<< | ||
143 | |||
144 | It processes text from standard input and uses two positional parameters to | ||
145 | customize its usage, in addition to environment variables. In each instance, | ||
146 | the parameter will override the variable, and if neither are provided, HAT TRICK | ||
147 | will choose a default. | ||
148 | |||
149 | ### HT_FORMATS (default: "html") | ||
150 | |||
151 | The format(s) to export to. Can be one or more of "html", "gemini", and | ||
152 | "gopher". As a convenience, a format can be prepended with a "-" (i.e., | ||
153 | "-html"), in which case every other format will be exported to. Multiple | ||
154 | formats can also be specified by separating them with a comma. The special | ||
155 | keyword "all" will export to all formats (this is the default). | ||
156 | |||
157 | If HAT TRICK exports to one format, it will simply print out each line | ||
158 | translated into that format. However, if more than one format is given, | ||
159 | HAT TRICK prints each line multiple times, prepending the name of the format to | ||
160 | the output. This allows for further processing to filter outputs according to | ||
161 | output type with just one pass through the input. | ||
162 | |||
163 | ### HT_TAGCHARS (default: 'b:**,i://,code:``') | ||
164 | |||
165 | The correspondence between html tag lines and other output formats. If | ||
166 | HT_FORMATS is only html, this option has no real meaning. | ||
167 | |||
168 | Each correspondence is of the (exploded) form | ||
169 | |||
170 | >>> | ||
171 | TAG : LEFT_CHAR RIGHT_CHAR | ||
172 | <<< | ||
173 | |||
174 | where TAG is the html tag, LEFT_CHAR is the character on the left of the | ||
175 | enclosed text, and RIGHT_CHAR is the character on the right. Rules can be | ||
176 | separated by commas to pass multiple ones to HAT TRICK. | ||
diff --git a/ht.awk b/ht.awk index 60e042b..b9ae377 100755 --- a/ht.awk +++ b/ht.awk | |||
@@ -1,246 +1,235 @@ | |||
1 | #!/usr/bin/awk -f | 1 | #!/bin/awk -f |
2 | # -*- indent-tabs-mode: t; -*- | ||
3 | # HAT TRICK | 2 | # HAT TRICK |
4 | # (C) 2022 C. Duckworth | 3 | # Copyright (C) 2022 Case Duckworth <acdw@acdw.net> |
5 | 4 | # | |
6 | ### Commentary: | ||
7 | |||
8 | # OLDIFS=$IFS; IFS=$'\n'; | ||
9 | # for line in `cat testfile`; do | ||
10 | # test=`echo "$line" | grep -E '[\]$'`; | ||
11 | # if [ $test ]; then | ||
12 | # newline=`echo $line | rev | cut -c 2- | rev`; | ||
13 | # echo -n "$newline"; else echo "$line"; | ||
14 | # fi; done; | ||
15 | # IFS=$OLDIFS | ||
16 | |||
17 | ### Code: | ||
18 | BEGIN { | 5 | BEGIN { |
19 | width = 72 | 6 | # Configuration |
20 | default_htag = "p" | 7 | DEFAULT_CONFIG_MODE = "config" |
21 | default_gtag = "" | 8 | config_initialize() |
22 | default_ftag = "" | 9 | config_parse(ENVIRON["HT_CONFIG"] ? ENVIRON["HT_CONFIG"] : "ht.conf") |
23 | } | 10 | # State |
24 | 11 | DEFTAG = CONFIG["default_tag"] | |
25 | ### Raw formatting | 12 | DEFATTR = CONFIG["default_attr"] |
26 | /^>>>/ { | 13 | TAG = DEFTAG |
27 | getline first_raw | 14 | ATTR = DEFATTR |
28 | if (raw_fmt_p("html")) { | 15 | } |
29 | raw_html = 1 | 16 | |
30 | html[++hpar] = "<pre><code>" html_escape(first_raw) | 17 | # Mutliple-file awareness |
31 | } | 18 | FNR == 1 { |
32 | if (raw_fmt_p("gemini")) { | 19 | fileflush() |
33 | raw_gemini = 1 | 20 | } |
34 | gemini[++gpar] = "```" | 21 | |
35 | gemini[++gpar] = first_raw | 22 | # Handle raw sections |
36 | } | 23 | $0 ~ CONFIG["raw_delim"] { |
37 | if (raw_fmt_p("gopher")) { | 24 | RAW = ! RAW |
38 | raw_gopher = 1 | 25 | if (RAW) { |
39 | gopher[++fpar] = first_raw | 26 | buflush() |
27 | bufpush(CONFIG["raw_beg"], -1) | ||
28 | } else { | ||
29 | bufpush(CONFIG["raw_end"], -1) | ||
30 | print BUFFER | ||
31 | BUFFER = "" | ||
40 | } | 32 | } |
41 | raw = 1 | ||
42 | next | 33 | next |
43 | } | 34 | } |
44 | 35 | ||
45 | /^<<</ { | 36 | RAW { |
46 | if (raw_html) { | 37 | bufpush($0) |
47 | html[hpar] = html[hpar] "</code></pre>" | ||
48 | } | ||
49 | if (raw_gemini) { | ||
50 | gemini[++gpar] = "```" | ||
51 | gemini[++gpar] = "" | ||
52 | } | ||
53 | if (raw_gopher) { | ||
54 | gopher[++fpar] = "" | ||
55 | } | ||
56 | raw_html = 0 | ||
57 | raw_gemini = 0 | ||
58 | raw_gopher = 0 | ||
59 | raw = 0 | ||
60 | next | 38 | next |
61 | } | 39 | } |
62 | 40 | ||
63 | raw { | 41 | # Comments |
64 | if (raw_html) { | 42 | $0 ~ ("^" COMMENT_DELIM) { |
65 | html_empty = 0 | ||
66 | html[++hpar] = html_escape($0) | ||
67 | } | ||
68 | if (raw_gemini) { | ||
69 | gemini_empty = 0 | ||
70 | gemini[++gpar] = $0 | ||
71 | } | ||
72 | if (raw_gopher) { | ||
73 | gopher_empty = 0 | ||
74 | gopher[++fpar] = $0 | ||
75 | } | ||
76 | next | 43 | next |
77 | } | 44 | } |
78 | 45 | ||
79 | # Block types | 46 | # HTML escape hatch |
80 | /^#/ { | 47 | /^</ { |
81 | match($0, /#+/) | 48 | HTML = 1 |
82 | htag = "h" (RLENGTH > 6 ? 6 : RLENGTH) | 49 | bufpush($0) |
83 | gtag = substr($0, RSTART, (RLENGTH > 3 ? 3 : RLENGTH)) " " | 50 | next |
84 | ftag = substr($0, RSTART, RLENGTH) " " | ||
85 | sub(/^#+[ \t]*/, "", $0) | ||
86 | } | 51 | } |
87 | 52 | ||
88 | # Line types | 53 | # Sure, let's do templating! This makes it less... weird. |
89 | /^=>/ { | 54 | /\$/ { |
90 | title = "" | 55 | # XXX: This is probably the dumbest way to do it. |
91 | for (i = 3; i <= NF; i++) { | 56 | gsub(/\$\$/, "$\a", $0) |
92 | title = title (title ? " " : "") $i | 57 | gsub(/\$[^\a]/, "\\\\&", $0) |
93 | } | 58 | gsub(/\$\a/, "$", $0) |
94 | hbuf[++hline] = "<a href=\"" $2 "\">" title "</a>" | ||
95 | gbuf[++gline] = "\ngemini\t" $0 | ||
96 | # TODO: gopher | ||
97 | next | ||
98 | } | 59 | } |
99 | 60 | ||
100 | ### Everything else | 61 | # Blocks of text |
101 | /./ { | 62 | /./ { |
102 | html_empty = 0 | 63 | # EOL escape |
103 | gemini_empty = 0 | 64 | if (match($0, /\\$/)) { |
104 | gopher_empty = 0 | 65 | sep = -1 |
105 | hbuf[++hline] = $0 | 66 | $0 = substr($0, 1, RSTART - 1) |
106 | gbuf[++gline] = $0 | 67 | } else { |
107 | fbuf[++fline] = $0 | 68 | sep = "\n" |
69 | } | ||
70 | # Loop through BLOCK_TYPES | ||
71 | for (bt in BLOCK_TYPES) { | ||
72 | if (match($0, "^" bt "[ \t]*")) { | ||
73 | $0 = substr($0, RSTART + RLENGTH) | ||
74 | if (match(BLOCK_TYPES[bt], "[ \t]*>[ \t]*")) { | ||
75 | parent = substr(BLOCK_TYPES[bt], 1, RSTART - 1) | ||
76 | child = substr(BLOCK_TYPES[bt], RSTART + RLENGTH) | ||
77 | } | ||
78 | if (parent) { | ||
79 | split(parent, pa, FS) | ||
80 | split(child, bl, FS) | ||
81 | if (! IN_PARENT) { | ||
82 | IN_PARENT = pa[1] | ||
83 | } | ||
84 | TAG = IN_PARENT | ||
85 | ATTR = "" | ||
86 | for (i = 2; i <= length(pa); i++) { | ||
87 | ATTR = ATTR (ATTR ? " " : "") pa[i] | ||
88 | } | ||
89 | bufpush("<" child ">" $0 "</" bl[1] ">") | ||
90 | next # XXX: This is messy. | ||
91 | } else { | ||
92 | split(BLOCK_TYPES[bt], bl, FS) | ||
93 | if (IN_PARENT) { | ||
94 | bufpush("</" IN_PARENT ">") | ||
95 | IN_PARENT = "" | ||
96 | } | ||
97 | if (! BUFFER) { | ||
98 | TAG = bl[1] | ||
99 | for (b = 2; b <= length(bl); b++) { | ||
100 | ATTR = ATTR (ATTR ? " " : "") bl[b] | ||
101 | } | ||
102 | } else { | ||
103 | $0 = "<" BLOCK_TYPES[bt] ">" $0 "</" bl[1] ">" | ||
104 | } | ||
105 | } | ||
106 | } | ||
107 | } | ||
108 | # Loop through LINE_TYPES | ||
109 | for (lt in LINE_TYPES) { | ||
110 | if (match($0, "^" lt "[ \t]*")) { | ||
111 | $0 = substr($0, RSTART + RLENGTH) | ||
112 | templ = LINE_TYPES[lt] | ||
113 | while (match(templ, /\$[0-9]+/)) { | ||
114 | sub(/\$[0-9]+/, $(substr(templ, RSTART + 1, RLENGTH - 1)), templ) | ||
115 | } | ||
116 | $0 = templ | ||
117 | } | ||
118 | } | ||
119 | # Push to buffer | ||
120 | bufpush($0, sep) | ||
108 | } | 121 | } |
109 | 122 | ||
123 | # Blank lines end blocks | ||
110 | /^$/ { | 124 | /^$/ { |
111 | bufput() | 125 | if (HTML) { |
126 | html_end() | ||
127 | } | ||
128 | if (! RAW) { | ||
129 | buflush() | ||
130 | } | ||
112 | } | 131 | } |
113 | 132 | ||
133 | # Clean up | ||
114 | END { | 134 | END { |
115 | bufput() | 135 | if (HTML) { |
116 | printarr(html, "html") | 136 | html_end() |
117 | printarr(gemini, "gemini") | 137 | } else if (RAW) { |
118 | printarr(gopher, "gopher") | 138 | bufpush(CONFIG["raw_end"], -1) |
119 | } | 139 | print BUFFER |
120 | 140 | } else { | |
121 | 141 | buflush() | |
122 | function bufput() | ||
123 | { | ||
124 | hbufput() | ||
125 | gbufput() | ||
126 | fbufput() | ||
127 | } | ||
128 | |||
129 | function clear(arr) | ||
130 | { | ||
131 | for (x in arr) { | ||
132 | delete arr[x] | ||
133 | } | 142 | } |
134 | } | 143 | } |
135 | 144 | ||
136 | function fbufput() | ||
137 | { | ||
138 | if (! length(fbuf)) { | ||
139 | next | ||
140 | } | ||
141 | for (ln in fbuf) { # XXX: gopher line types | ||
142 | paragraph = paragraph (paragraph ? " " : "") fbuf[ln] | ||
143 | } | ||
144 | fill(paragraph) | ||
145 | for (ln in fp) { | ||
146 | gopher[++fpar] = ((ln == 1) ? ftag : "") fp[ln] | ||
147 | } | ||
148 | gopher[++fpar] = "" | ||
149 | paragraph = "" | ||
150 | ftag = default_ftag | ||
151 | clear(fp) | ||
152 | clear(fbuf) | ||
153 | } | ||
154 | 145 | ||
155 | function fill(paragraph) | 146 | ### Buffer-y functions |
147 | function buflush() | ||
156 | { | 148 | { |
157 | char = 0 | 149 | buftrim() |
158 | ln = 1 | 150 | if (BUFFER) { |
159 | split(paragraph, words, FS) | 151 | if (TAG) { |
160 | for (word in words) { | 152 | TAG_BEG = "<" TAG (ATTR ? " " ATTR : "") ">" |
161 | char += length(words[word]) | 153 | TAG_END = "</" TAG ">" |
162 | if (char <= width) { | ||
163 | fp[ln] = fp[ln] (fp[ln] ? " " : "") words[word] | ||
164 | } else { | ||
165 | fp[++ln] = words[word] | ||
166 | char = length(words[word]) | ||
167 | } | 154 | } |
155 | print TAG_BEG BUFFER TAG_END | ||
156 | BUFFER = "" | ||
157 | TAG = DEFTAG | ||
158 | ATTR = DEFATTR | ||
159 | IN_PARENT = "" | ||
168 | } | 160 | } |
169 | } | 161 | } |
170 | 162 | ||
171 | function gbufput() | 163 | function bufpush(text, separator) |
172 | { | 164 | { |
173 | if (! length(gbuf)) { | 165 | if (! separator) { |
174 | next | 166 | separator = "\n" |
175 | } | 167 | } |
176 | for (ln in gbuf) { | 168 | if (separator == -1) { |
177 | paragraph = paragraph (paragraph ? " " : "") gbuf[ln] | 169 | separator = "" |
178 | } | 170 | } |
179 | gemini[++gpar] = gtag paragraph | 171 | BUFFER = BUFFER text (separator ? separator : "") |
180 | gemini[++gpar] = "" | ||
181 | gtag = default_gtag | ||
182 | paragraph = "" | ||
183 | clear(gbuf) | ||
184 | } | 172 | } |
185 | 173 | ||
186 | function gopher_line(type, display, selector, hostname, port) | 174 | function buftrim() |
187 | { | 175 | { |
188 | return (type display "\t" selector "\t" hostname "\t" port) | 176 | if (match(BUFFER, "\n+$")) { |
189 | } | 177 | BUFFER = substr(BUFFER, 1, RSTART - 1) |
190 | |||
191 | function hbufput() | ||
192 | { | ||
193 | if (! length(hbuf)) { | ||
194 | next | ||
195 | } | ||
196 | for (ln in hbuf) { | ||
197 | paragraph = paragraph (paragraph ? " " : "") hbuf[ln] | ||
198 | } | ||
199 | fill(paragraph) | ||
200 | for (ln in fp) { | ||
201 | html[++hpar] = ((ln == 1) ? "<" (htag ? htag : default_htag) ">" : "") fp[ln] | ||
202 | } | 178 | } |
203 | html[hpar] = html[hpar] (htag_end ? htag_end : "</" (htag ? htag : default_htag) ">") | ||
204 | paragraph = "" | ||
205 | htag = default_htag | ||
206 | clear(fp) | ||
207 | clear(hbuf) | ||
208 | } | 179 | } |
209 | 180 | ||
210 | function html_escape(text) | 181 | ### Config functions |
182 | function config_initialize() | ||
211 | { | 183 | { |
212 | gsub(/&/, "\\&amp;", text) | 184 | COMMENT_DELIM = ";" |
213 | gsub(/</, "\\&lt;", text) | 185 | CONFIG["raw_delim"] = "```" |
214 | gsub(/>/, "\\&gt;", text) | 186 | CONFIG["raw_beg"] = "<pre><code>" |
215 | return text | 187 | CONFIG["raw_end"] = "</code></pre>" |
216 | } | 188 | CONFIG["default_tag"] = "p" |
217 | 189 | CONFIG["default_attr"] = "" | |
218 | function printarr(arr, prefix) | 190 | LINE_TYPES["@"] = "<a href=\"$1\">$2</a>" |
191 | LINE_TYPES["`"] = "<code>$0</code>" | ||
192 | BLOCK_TYPES["#"] = "h1" | ||
193 | BLOCK_TYPES["##"] = "h2" | ||
194 | BLOCK_TYPES["###"] = "h3" | ||
195 | BLOCK_TYPES["-"] = "ul>li" | ||
196 | } | ||
197 | |||
198 | function config_parse(file) | ||
219 | { | 199 | { |
220 | if (prefix) { | 200 | mode = DEFAULT_CONFIG_MODE |
221 | fmt = "%s\t%s\n" | 201 | while ((getline < file) > 0) { |
222 | } else { | 202 | if (match($0, /^#/) || ! $0) { |
223 | fmt = "%s%s\n" | 203 | continue |
224 | } | 204 | } |
225 | for (x in arr) { | 205 | if (match($0, /^\\/)) { |
226 | printf fmt, prefix, arr[x] | 206 | $0 = substr($0, 2) |
207 | } | ||
208 | if (match($0, /\[[^\]]+\]/)) { | ||
209 | mode = substr($0, RSTART + 1, RLENGTH - 2) | ||
210 | continue | ||
211 | } else { | ||
212 | var = $1 | ||
213 | val = "" | ||
214 | for (i = 2; i <= NF; i++) { | ||
215 | val = val (val ? " " : "") $i | ||
216 | } | ||
217 | if (mode == "config") { | ||
218 | CONFIG[var] = val | ||
219 | } else if (mode == "block") { | ||
220 | BLOCK_TYPES[var] = val | ||
221 | } else if (mode == "line") { | ||
222 | LINE_TYPES[var] = val | ||
223 | } | ||
224 | } | ||
227 | } | 225 | } |
228 | } | 226 | } |
229 | 227 | ||
230 | function raw_fmt_p(format) | 228 | ### Other functions |
229 | function html_end() | ||
231 | { | 230 | { |
232 | if (NF < 2) { | 231 | buftrim() |
233 | return 1 | 232 | print BUFFER |
234 | } | 233 | BUFFER = "" |
235 | if ($2 ~ /-/) { | 234 | HTML = 0 |
236 | if ($2 ~ ("-" format)) { | ||
237 | return 0 | ||
238 | } else { | ||
239 | return 1 | ||
240 | } | ||
241 | } | ||
242 | if ($2 ~ format) { | ||
243 | return 1 | ||
244 | } | ||
245 | return 0 | ||
246 | } | 235 | } |
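Note on the LINE_TYPES loop in the new ht.awk: it fills `$1`, `$2`, ... in a template by calling `sub()` repeatedly on the template string. The following is a minimal standalone sketch of the same idea, not the code from this commit. The function name `expand_template` and the `TEMPL` environment variable are made up for illustration. It builds the result left to right with `substr()` instead of `sub()`, on the assumption that input fields may contain `&` (which `sub()` treats specially in its replacement) or text that looks like `$1`.

```shell
#!/bin/sh
# expand_template TEMPLATE
# Reads lines on stdin and prints TEMPLATE with $0, $1, $2, ... replaced
# by the corresponding awk fields of each line. (Hypothetical helper.)
expand_template() {
	TEMPL="$1" awk '{
		templ = ENVIRON["TEMPL"]
		out = ""
		# Consume the template left to right, so text that has already
		# been substituted is never scanned again.
		while (match(templ, /\$[0-9]+/)) {
			n = substr(templ, RSTART + 1, RLENGTH - 1) + 0
			out = out substr(templ, 1, RSTART - 1) $n
			templ = substr(templ, RSTART + RLENGTH)
		}
		print out templ
	}'
}

# Prints: <a href="https://example.com">example</a>
printf '%s\n' 'https://example.com example' |
	expand_template '<a href="$1">$2</a>'
```

As in the diff, `$2` is only the second whitespace-separated field, so link text of more than one word would be cut off after the first word under the default field separator.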
diff --git a/ht.conf b/ht.conf new file mode 100644 index 0000000..0634a94 --- /dev/null +++ b/ht.conf | |||
@@ -0,0 +1,24 @@ | |||
1 | # hat trick configuration file | ||
2 | [config] | ||
3 | raw_delim ``` | ||
4 | raw_beg <pre><code> | ||
5 | raw_end </code></pre> | ||
6 | |||
7 | [block] | ||
8 | \# h1 | ||
9 | \## h2 | ||
10 | \### h3 | ||
11 | \#### h4 | ||
12 | \##### h5 | ||
13 | \###### h6 | ||
14 | |||
15 | - ul>li | ||
16 | % ol>li | ||
17 | |||
18 | > blockquote | ||
19 | |||
20 | [line] | ||
21 | @ <a href="$1">$2</a> | ||
22 | ` <code>$0</code> | ||
23 | / <em>$0</em> | ||
24 | * <strong>$0</strong> | ||
diff --git a/ht.sh b/ht.sh new file mode 100755 index 0000000..cc0d0ba --- /dev/null +++ b/ht.sh | |||
@@ -0,0 +1,62 @@ | |||
1 | #!/bin/sh | ||
2 | # ht.sh | ||
3 | # *.ht -> *html | ||
4 | |||
5 | # config | ||
6 | header_file=header.htm | ||
7 | footer_file=footer.htm | ||
8 | meta_file=meta.sh | ||
9 | |||
10 | # state | ||
11 | HTDAT="$(date +%s)" | ||
12 | HT_TMPL_COUNT=0 | ||
13 | |||
14 | print() { | ||
15 | printf '%s\n' "$*" | ||
16 | } | ||
17 | |||
18 | htt() { # htt FILE | ||
19 | # Like `cat`, but with templating. | ||
20 | ht_end="ht_main_${HTDAT}_${HT_TMPL_COUNT}" # be extra double sure | ||
21 | eval "$( | ||
22 | print "cat <<$ht_end" | ||
23 | cat "$@" | ||
24 | |||
25 | print "$ht_end" | ||
26 | )" | ||
27 | HT_TMPL_COUNT=$((HT_TMPL_COUNT + 1)) | ||
28 | } | ||
29 | |||
30 | htmeta_clear() { | ||
31 | # Generate metadata-clearing commands from $meta_file. | ||
32 | while read -r line; do | ||
33 | case "$line" in | ||
34 | *'()'*) # function | ||
35 | unset -f "${line%()*}" | ||
36 | ;; | ||
37 | *=*) # variable assignment | ||
38 | unset -v "${line%=*}" | ||
39 | ;; | ||
40 | *) # other -- XXX: Don't know what to do | ||
41 | ;; | ||
42 | esac | ||
43 | done <"$meta_file" | ||
44 | } | ||
45 | |||
46 | htmeta() { # htmeta FILE | ||
47 | # Collect metadata from FILE. | ||
48 | # Metadata looks like this: `;;@<SHELL_EXPRESSION>` | ||
49 | sed -n 's/^;;@//p' "$1" | tee "$meta_file" | ||
50 | } | ||
51 | |||
52 | main() { | ||
53 | # Make two passes over each input file, collecting metadata and content. | ||
54 | : | ||
55 | # Of course, this isn't safe, but you trust yourself, right? | ||
56 | for file; do | ||
57 | eval "$(htmeta_clear)" | ||
58 | eval "$(htmeta "$file")" | ||
59 | |||
60 | ./ht.awk <"$file" | htt "$header_file" - "$footer_file" >"${file}ml" | ||
61 | done | ||
62 | } | ||
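Note on `htt` in ht.sh: it wraps the input in a here-document and passes the result to `eval`, so that shell variables in the header, body, and footer are expanded. Below is a minimal standalone sketch of that technique, under a few assumptions: the name `render_template`, the `$$`-based delimiter, and the `title` variable are illustrative and not taken from the commit. The extra newline before the delimiter is an addition so that input without a trailing newline still terminates cleanly.

```shell
#!/bin/sh
# render_template FILE...
# Like cat, but expands $variables in the files by eval-ing a heredoc.
# (Hypothetical name; it mirrors the idea behind htt.)
render_template() {
	# Delimiter that is unlikely to appear in the input.
	rt_end="rt_end_$$"
	eval "$(
		printf '%s\n' "cat <<$rt_end"
		cat "$@"
		printf '\n%s\n' "$rt_end"
	)"
}

tmpl=$(mktemp)
printf '%s\n' '<title>$title</title>' >"$tmpl"
title="hello"
# Prints <title>hello</title> followed by one blank line.
render_template "$tmpl"
rm -f "$tmpl"
```

Because the input is evaluated by the shell, command substitutions such as `$(...)` inside a template are executed as well, so this approach only suits templates you wrote yourself. That matches the caveat in the `main` comment above.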
diff --git a/test.ht b/test.ht index 0208568..97425a9 100644 --- a/test.ht +++ b/test.ht | |||
@@ -1,27 +1,32 @@ | |||
1 | # a test | 1 | # ht: a bespoke document preparation system |
2 | 2 | ||
3 | here's a test for ht.awk. | 3 | ;; comments are like this. |
4 | it's got paragraphs (these bad boys), long lines and such, and also raw blocks. | 4 | ;; they're a good time. |
5 | => https://example.com and links! | ||
6 | 5 | ||
7 | >>> | 6 | `ht |
8 | rawblock example1: all of them, & more <hi!> | 7 | is a quasi-line-based markup language that takes inspiration from |
9 | ## fee fi fo fum | 8 | @https://gemini.circumlunar.space/docs/gemtext.gmi gemtext\ |
10 | <<< | 9 | , |
10 | @https://daringfireball.net/projects/markdown/ markdown\ | ||
11 | , and others. | ||
12 | Its aim is to be somewhat easy to read while being fairly easy to parse. | ||
11 | 13 | ||
12 | ## just html | 14 | In fact, |
13 | but over two lines | 15 | `ht |
16 | is a simple awk script. | ||
14 | 17 | ||
15 | >>> html | 18 | ## Usage |
16 | rawblock example2: just html | ||
17 | hey adora | ||
18 | <<< | ||
19 | 19 | ||
20 | ### not html | 20 | - one |
21 | - two | ||
22 | - three | ||
21 | 23 | ||
22 | >>> -html | 24 | ordered list: |
23 | rawblock example3: everything /but/ html | ||
24 | # with a header inside, blah | ||
25 | <<< | ||
26 | 25 | ||
27 | and finally, the end of the file. | 26 | % one |
27 | % two | ||
28 | % three | ||
29 | |||
30 | ``` | ||
31 | ./ht.awk source.ht | ||
32 | ``` | ||
diff --git a/test.txt b/test.txt deleted file mode 100644 index 8c47543..0000000 --- a/test.txt +++ /dev/null | |||
@@ -1,24 +0,0 @@ | |||
1 | html <p> | ||
2 | html here's a test for ht.awk. it's got paragraphs (these bad boys), long lines and such, | ||
3 | html and also raw blocks. | ||
4 | html </p> | ||
5 | html </code></pre> | ||
6 | html </code></pre> | ||
7 | html <p> | ||
8 | html and finally, the end of the file. | ||
9 | html </p> | ||
10 | gemini here's a test for ht.awk. it's got paragraphs (these bad boys), long lines and such, and also raw blocks. | ||
11 | gemini | ||
12 | gemini ``` | ||
13 | gemini rawblock example1: all of them. | ||
14 | gemini fee fi fo fum | ||
15 | gemini ``` | ||
16 | gemini | ||
17 | gemini and finally, the end of the file. | ||
18 | gemini | ||
19 | gopher here's a test for ht.awk. it's got paragraphs (these bad boys), long lines and such, | ||
20 | gopher and also raw blocks. | ||
21 | gopher | ||
22 | gopher | ||
23 | gopher and finally, the end of the file. | ||
24 | gopher | ||