# HAT TRICK

HAT TRICK is both a lightweight markup language inspired by gemtext and html,
and this awk program to convert the markup language to gemtext, html, and
gophermap markup.  It uses a mixture of "block"-level and line-level sigils to
extend the pure line-based markup of gemtext, while removing some of the more
annoying points (to my mind) of writing pure html---i.e., repetitive tags and
other boilerplate.

## Syntax

### Blocks

In HAT TRICK, each run of text separated by blank lines is a "block."  The
default block is a paragraph, or <p> tag in html.  (In gemtext and gophermaps,
no extra tags are added.)  Other blocks defined by the syntax are as follows:

>>>
#	HEADING 1
##	HEADING 2
###	HEADING 3
<<<

Correspond to <hx> in html; passed through unmodified to gemtext and gophermaps.

>>>
>	BLOCK QUOTE
<<<

Corresponds to <blockquote>; passed through unmodified to gemtext and
gophermaps.

>>>
-	UNORDERED LIST ITEM
1.	ORDERED LIST ITEM
<<<

The first list item in a block automatically opens the necessary list tag in
html.  In gemtext, the "-" is converted to "*" (which signifies a list item);
the hyphen is passed through in gophermaps, because I think it's better syntax
personally.

>>>
---	SECTION BREAK
<<<

A visual indication to break sections.  Corresponds to <hr> in html.
(TODO: consider html5 <section> tags --- this would take more logic.)

HAT TRICK reflows blocks, which means that only the first line of a block
needs to start with the sigils outlined above.  However, each line of the block
can begin with the sigil character for easier reading.
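
As a rough illustration of the block rules above, here is a minimal Python
sketch (not the awk implementation) that splits input on blank lines, picks a
block type from the first line's sigil, and reflows the rest.  The names
split_blocks and classify and the kind labels are made up for this example;
list items and verbatim fences are left out on purpose.

```python
import re

# Block sigils from the syntax section; the kind labels are illustrative only.
# Longer sigils come first so "###" is not mistaken for "#".
SIGILS = [
    (re.compile(r"^###\s*"), "h3"),
    (re.compile(r"^##\s*"), "h2"),
    (re.compile(r"^#\s*"), "h1"),
    (re.compile(r"^---\s*"), "hr"),
    (re.compile(r"^>\s*"), "blockquote"),
]

def split_blocks(text):
    """Split text into blocks: runs of lines separated by blank lines."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks

def classify(block):
    """Return (kind, reflowed_text) for one block (a list of lines)."""
    for pattern, kind in SIGILS:
        if pattern.match(block[0]):
            # Only the first line needs the sigil; strip it wherever it repeats.
            parts = [pattern.sub("", line, count=1) for line in block]
            return kind, " ".join(p.strip() for p in parts if p.strip())
    # No sigil on the first line: the default block is a paragraph.
    return "p", " ".join(line.strip() for line in block)
```

Because the patterns are anchored at the start of the line, a first line with a
leading space falls through to the paragraph case.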

### Lines

Within blocks, there are certain other sigils that apply only to the line they
begin.  They include the following:

>>>
=>	LINK
<<<

Links are probably the most important element in any hypertext language---since
without them, it's hardly hypertext.  HAT TRICK borrows its link syntax from
gemtext: the line starts with a "=>", the next field is the link's URL, and the
rest of the line is the link's display text.
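
The link syntax above can be sketched in Python roughly like this.  It is an
illustration only, not the awk code: the function names are hypothetical, and
falling back to the URL when no display text is given is my assumption, since
the text above doesn't say what happens in that case.

```python
from html import escape

def parse_link(line):
    """Split a '=> URL TEXT' line into (url, text); None if it is not a link."""
    if not line.startswith("=>"):
        return None
    fields = line[2:].split(None, 1)
    if not fields:
        return None
    url = fields[0]
    # Assumption: with no display text, show the URL itself.
    text = fields[1].strip() if len(fields) > 1 else url
    return url, text

def link_to_html(url, text):
    """Render a link as an html anchor, escaping both parts."""
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(text))

def link_to_gemtext(url, text):
    """Gemtext uses the same shape as the source line."""
    return "=> {} {}".format(url, text)
```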

>>>
<TAG>	HTML TAG
<<<

Lines beginning with an html tag are passed on to html verbatim.  The closing
tag is automatically appended to the end of the line, before any ending
punctuation.  I've found that 99 times out of 100, I don't want formatting to
include the ending punctuation.

A backslash (\) at the end of the tag line prevents the tag from being closed
automatically, which is useful for keeping punctuation inside the tag as well
as for nesting tags.  However, the tag is then never closed for you, so you'll
have to close it yourself on a later line.  In addition, text that isn't in a
tag is html-escaped, so for the markup to properly apply, you'll need to write
something like this:

>>>
<b>She sells \
<i>sea shells \
</i>
on the sea shore.\
</b>
<<<

So while this markup is possible, it's discouraged through the awkwardness of
the syntax.
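
For the html side, the tag-closing rule described above might look something
like the following Python sketch.  Which characters count as "ending
punctuation" is an assumption on my part (here: . , ; : ! ?), the name
close_tag_line is hypothetical, and html-escaping of untagged text is not
modeled.

```python
import re

# A line "begins with an html tag" if it starts with <name ...>.
TAG_LINE = re.compile(r"^<([A-Za-z][A-Za-z0-9]*)[^>]*>")
# Assumption: these characters are treated as ending punctuation.
TRAILING_PUNCT = re.compile(r"[.,;:!?]+$")

def close_tag_line(line):
    """Append the closing tag to a tag line, before any ending punctuation."""
    match = TAG_LINE.match(line)
    if match is None:
        return line  # not a tag line (closing tags like </i> land here too)
    if line.endswith("\\"):
        return line[:-1]  # trailing backslash: leave the tag open
    tag = match.group(1)
    body = line.rstrip()
    punct = TRAILING_PUNCT.search(body)
    if punct:
        return body[:punct.start()] + "</" + tag + ">" + punct.group()
    return body + "</" + tag + ">"
```

With this sketch, "<b>Important." becomes "<b>Important</b>." while a line
ending in a backslash is passed along with the tag still open.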

To translate these tags to meaningful markup in gemtext and gophermaps, a lookup
table maps each tag to opening and closing characters placed around the line's
text.  This correspondence can be defined with the environment variable
HT_TAGCHARS or HAT TRICK's second positional argument (see INVOCATION,
below).

>>>
;	COMMENT
<<<

Comments in HAT TRICK aren't passed on to the output text---even in html, which
has a comment syntax.  Instead, comments are passed, including the leading
semicolon, to standard error for further processing.

### Verbatim blocks

Finally, there is a special type of block for passing raw text through to the
next phase of processing: the verbatim block.

>>>
 >>> [OUTPUTS]
 VERBATIM TEXT
 <<<
<<<

In html, the verbatim text is wrapped in <pre><code> tags; in gemtext, it's
wrapped in gemtext's own verbatim text markers ```; and the text is unwrapped in
gophermaps for a cleaner look.

The OUTPUTS can be any output specifier HAT TRICK accepts; see INVOCATION below
for details.  If OUTPUTS is present, the verbatim text will only be passed
through in the output formats listed; with no OUTPUTS listed it will output to
all formats.
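
Here is a small Python sketch of how a verbatim block could be rendered per
format, following the description above.  It is only an illustration: the
function name is hypothetical, OUTPUTS is assumed to be already resolved into a
set of format names, and the text is passed through without html-escaping
because nothing above says whether escaping is applied.

```python
FENCE = "`" * 3  # gemtext's verbatim marker

def render_verbatim(lines, fmt, outputs=None):
    """Render the body of a verbatim block for one output format.

    lines   -- the verbatim text, one string per line
    fmt     -- "html", "gemini", or "gopher"
    outputs -- set of formats the block is limited to, or None for all
    """
    if outputs is not None and fmt not in outputs:
        return []  # block is not meant for this format
    if fmt == "html":
        return ["<pre><code>"] + list(lines) + ["</code></pre>"]
    if fmt == "gemini":
        return [FENCE] + list(lines) + [FENCE]
    return list(lines)  # gopher: left unwrapped
```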

## "Escaping" line- and block-types

Each of the types listed above is anchored at the beginning of the line.
Therefore, a simple "escaping" mechanism is available for free: simply prepend a
space to any line you don't want processed as a line or block and you'll be
gravy.  Astute readers will notice I did just that above, to describe the syntax
for verbatim fencing.

## Invocation

An invocation of HAT TRICK will look something like this:

>>>
./ht.awk [HT_FORMATS] [HT_TAGCHARS] < INPUT
<<<

It processes text from standard input and uses two positional parameters to
customize its usage, in addition to environment variables.  In each instance,
the parameter will override the variable, and if neither is provided, HAT TRICK
will choose a default.
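
The precedence described above (positional parameter, then environment
variable, then default) can be sketched in Python as follows.  The helper name
resolve_setting is made up for this example, and treating an empty value the
same as a missing one is an assumption.

```python
import os

def resolve_setting(argv, position, env_name, default, environ=None):
    """Pick a setting: positional parameter first, then environment, then default."""
    environ = os.environ if environ is None else environ
    # Assumption: an empty argument or variable counts as "not provided".
    if len(argv) > position and argv[position]:
        return argv[position]
    return environ.get(env_name) or default
```

For example, resolve_setting(sys.argv, 1, "HT_FORMATS", "html") would mirror
how the first positional parameter relates to the HT_FORMATS variable.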

### HT_FORMATS (default: "html")

The format(s) to export to.  Can be one or more of "html", "gemini", and
"gopher".  As a convenience, a format can be prefixed with a "-" (e.g.,
"-html"), in which case every other format will be exported to.  Multiple
formats can also be specified by separating them with a comma.  The special
keyword "all" will export to all formats (this is the default).

If HAT TRICK exports to one format, it will simply print out each line
translated into that format.  However, if more than one format is given, 
HAT TRICK prints each line multiple times, prepending the name of the format to
the output.  This allows for further processing to filter outputs according to
output type with just one pass through the input.
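
To make the format specifier rules above concrete, here is a Python sketch of
one way to read them.  It is not the awk code: the function names are
hypothetical, and the tab between the format name and the line in multi-format
output is an assumption, since the text above only says the name is prepended.

```python
ALL_FORMATS = ("html", "gemini", "gopher")

def parse_formats(spec):
    """Turn a spec like 'html,gemini', '-html', or 'all' into a list of formats."""
    chosen = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if item == "all":
            names = ALL_FORMATS
        elif item.startswith("-") and item[1:] in ALL_FORMATS:
            # "-html" means every format except html.
            names = tuple(f for f in ALL_FORMATS if f != item[1:])
        elif item in ALL_FORMATS:
            names = (item,)
        else:
            raise ValueError("unknown format: " + item)
        for name in names:
            if name not in chosen:
                chosen.append(name)
    return chosen

def emit(rendered, formats):
    """rendered maps format name -> translated line; return the output lines."""
    if len(formats) == 1:
        return [rendered[formats[0]]]
    # Assumption: format name, a tab, then the line.
    return [fmt + "\t" + rendered[fmt] for fmt in formats]
```

With tagged output like this, a later stage can filter by format name after a
single pass over the input.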

### HT_TAGCHARS (default: 'b:**,i://,code:``')

The correspondence between html tag lines and other output formats.  If
HT_FORMATS is only html, this option has no real meaning.

Each correspondence is of the (exploded) form

>>>
TAG : LEFT_CHAR RIGHT_CHAR
<<<

where TAG is the html tag, LEFT_CHAR is the character on the left of the
enclosed text, and RIGHT_CHAR is the character on the right.  Rules can be
separated by commas to pass multiple ones to HAT TRICK.
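
As an illustration of the rule format above, this Python sketch parses a
HT_TAGCHARS value into a lookup table and wraps text with it.  The function
names are hypothetical, and leaving text unwrapped for tags that have no rule is
an assumption.  Note that a comma can't serve as LEFT_CHAR or RIGHT_CHAR in this
sketch, since commas separate the rules.

```python
def parse_tagchars(spec):
    """Parse rules like 'b:**,i://' into {tag: (left_char, right_char)}."""
    table = {}
    for rule in spec.split(","):
        tag, sep, chars = rule.partition(":")
        tag = tag.strip()
        if not sep or not tag or len(chars) != 2:
            raise ValueError("bad rule: " + repr(rule))
        table[tag] = (chars[0], chars[1])
    return table

def wrap(tag, text, table):
    """Surround text with the characters for tag; unknown tags add nothing."""
    left, right = table.get(tag, ("", ""))
    return left + text + right
```

Fed the default value, parse_tagchars maps b to a pair of asterisks, i to a pair
of slashes, and code to a pair of backticks.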