Diffstat (limited to 'protocol.txt')
-rw-r--r-- protocol.txt 134
1 files changed, 0 insertions, 134 deletions
diff --git a/protocol.txt b/protocol.txt
deleted file mode 100644
index 323a0d8..0000000
--- a/protocol.txt
+++ /dev/null
@@ -1,134 +0,0 @@
# Postcard protocol, version 1

The POSTCARD PROTOCOL is a federated asynchronous communication protocol that
aims to simulate, online, the experience of sending postcards through the
physical mail. A NETWORK consists of a series of servers, known as POST OFFICES,
which provide a limited number of user accounts (P.O. BOXES) that can send and
receive short messages, called POSTCARDS. The protocol is implemented on top of
UDP and tries to be relatively secure and resistant to abuse.

The postcard protocol is in ALPHA status and may change while we're working
toward a version 1.0. Each part of the protocol is described below.

## POSTCARD.

A POSTCARD is a datagram that fits within a single UDP packet. While the
*actual* limit to UDP packet size is 2^16 - 1 bytes (65,535, headers
included)[1], the number I've seen around the internet for a *safe* UDP packet
size is more like 512 bytes. So that's what we're going with, especially because
a physical postcard is much closer in data size to 512 bytes than to 65,535 or
even 1500 (another limit mentioned in Julia Evans's blog post referenced above).
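
The size budget can be worked through in a few lines. This is only a sketch of the arithmetic: the constant names and the `fits_in_postcard` helper are made up for illustration, and the 20-byte figure assumes an IPv4 header without options.

```python
# Size budget for a single datagram (all names here are illustrative).
UDP_LENGTH_FIELD_BITS = 16                        # width of UDP's length field
UDP_MAX_DATAGRAM = 2**UDP_LENGTH_FIELD_BITS - 1   # 65,535 bytes, UDP header included
UDP_HEADER = 8                                    # bytes
IPV4_HEADER_MIN = 20                              # bytes, assuming no IP options

# Largest UDP payload that fits in one IPv4 packet.
MAX_UDP_PAYLOAD_IPV4 = 65535 - IPV4_HEADER_MIN - UDP_HEADER

# The limit this protocol picks for a whole postcard.
POSTCARD_MAX = 512


def fits_in_postcard(datagram: bytes) -> bool:
    """Return True if the datagram respects the 512-byte postcard limit."""
    return len(datagram) <= POSTCARD_MAX
```

A 512-byte postcard uses well under 1% of what a single UDP datagram could theoretically carry.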

A POSTCARD's size is further restricted (though not by much) by a short header
with the following fields (sizes are in bytes):

* 2b MAGIC NUMBER: the ASCII codepoints of the letters "PC" (for postcard)
* 1b VERSION: The version of the postcard protocol being used
* 1b ENCODING: The text encoding of the postcard's message (see Appendix A)
* 1b TOBOX: The P.O. BOX of the message recipient
* 1b FROMBOX: The P.O. BOX of the message sender

This header is 6 bytes long, leaving 506 bytes for message text:

```
,___________________________________________________________________.
|________________ POSTCARD PROTOCOL DATAGRAM SKETCH ________________|
> "PC" (2 bytes) |version,encoding|to(1b)  from(1b)|message(506b)   |
|0101000001000011|0000000100000000|ttttttttffffffff| . . . . . . . .|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
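
The layout above can be sketched with Python's `struct` module. This is a minimal illustration, not a reference implementation: the function names, the default arguments, and the choice to reject oversized messages with `ValueError` are all assumptions.

```python
import struct

MAGIC = b"PC"
# magic (2 bytes), then version, encoding, tobox, frombox (1 byte each)
HEADER = struct.Struct("!2sBBBB")
MAX_POSTCARD = 512
MAX_MESSAGE = MAX_POSTCARD - HEADER.size  # 506 bytes


def pack_postcard(tobox: int, frombox: int, message: bytes,
                  version: int = 1, encoding: int = 0) -> bytes:
    """Build a postcard datagram from already-encoded message bytes."""
    if len(message) > MAX_MESSAGE:
        raise ValueError(f"message is {len(message)} bytes, limit is {MAX_MESSAGE}")
    return HEADER.pack(MAGIC, version, encoding, tobox, frombox) + message


def unpack_postcard(datagram: bytes):
    """Split a datagram into (version, encoding, tobox, frombox, message)."""
    if not HEADER.size <= len(datagram) <= MAX_POSTCARD:
        raise ValueError("datagram size is outside the postcard limits")
    magic, version, encoding, tobox, frombox = HEADER.unpack_from(datagram)
    if magic != MAGIC:
        raise ValueError("not a postcard: bad magic number")
    return version, encoding, tobox, frombox, datagram[HEADER.size:]
```

For example, `pack_postcard(7, 3, b"hi")` yields the bytes `PC`, `01`, `00`, `07`, `03`, followed by the message, which matches the bit pattern in the sketch for version 1 and encoding 0.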

## P.O. BOX.

## POST OFFICE.

## NETWORK.

### APPENDIX A. Encoding table.

While UTF-8 is the answer to almost every question of encoding in the modern
day, on a byte-restricted format like the postcard protocol I think it's unfair
for non-English speakers to be forced to use double or even triple the bytes to
send the same message. Therefore, the ENCODING field of a postcard's header
identifies the message encoding, which allows senders to choose a more
compact encoding for their messages.
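
To make the "double or even triple" point concrete, here is a small sketch that compares encoded sizes using Python's built-in codecs. The sample strings and the `encoded_sizes` helper are illustrative choices, not part of the protocol.

```python
def encoded_sizes(text: str, codecs: list[str]) -> dict[str, int]:
    """Return the number of bytes `text` occupies in each codec."""
    return {codec: len(text.encode(codec)) for codec in codecs}


# Russian greeting: 14 Cyrillic letters plus 2 spaces.
russian = "Привет из Москвы"
russian_sizes = encoded_sizes(russian, ["utf-8", "koi8_r"])

# Thai greeting, written as escapes so the code points are explicit (6 of them).
thai = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"
thai_sizes = encoded_sizes(thai, ["utf-8", "iso8859_11"])

print(russian_sizes)
print(thai_sizes)
```

Cyrillic letters take two bytes each in UTF-8 and Thai characters take three, while KOI8-R and ISO 8859-11 use one byte per character. With 506 bytes of message text, that is the difference between roughly 250 and roughly 500 Cyrillic letters per postcard.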

ENCODING is stored as a 1-byte number, allowing for 256 possible encodings. As
of Postcard protocol v. 1, these are recognized:

``` | table
0   UTF-8
1   ASCII
2   EBCDIC
3   ISO 8859-1 Western Europe
4   ISO 8859-2 Western and Central Europe
5   ISO 8859-3 Western Europe and South European
6   ISO 8859-4 Western Europe and Baltic countries
7   ISO 8859-5 Cyrillic alphabet
8   ISO 8859-6 Arabic
9   ISO 8859-7 Greek
10  ISO 8859-8 Hebrew
11  ISO 8859-9 Western Europe with amended Turkish character set
12  ISO 8859-10 Western Europe with rationalized Nordic character set
13  ISO 8859-11 Thai
14  ISO 8859-13 Baltic languages plus Polish
15  ISO 8859-14 Celtic languages
16  ISO 8859-15 ISO 8859-1 with rationalizations
17  ISO 8859-16 Central, Eastern and Southern European languages
18  CP437
19  CP720
20  CP737
21  CP850
22  CP852
23  CP855
24  CP857
25  CP858
26  CP860
27  CP861
28  CP862
29  CP863
30  CP865
31  CP866
32  CP869
33  CP872
34  Windows-1250 for Central European languages that use Latin script
35  Windows-1251 for Cyrillic alphabets
36  Windows-1252 for Western languages
37  Windows-1253 for Greek
38  Windows-1254 for Turkish
39  Windows-1255 for Hebrew
40  Windows-1256 for Arabic
41  Windows-1257 for Baltic languages
42  Windows-1258 for Vietnamese
43  Mac OS Roman
44  KOI8-R
45  KOI8-U
46  KOI7
47  MIK
48  ISCII
49  TSCII
50  VISCII
51  Shift JIS
52  EUC-JP
53  ISO-2022-JP
54  JIS X 0213
55  Shift_JIS-2004
56  EUC-JIS-2004
57  ISO-2022-JP-2004
58  GB 2312
59  GBK (Microsoft Code page 936)
60  GB 18030
61  Taiwan Big5 (a more famous variant is Microsoft Code page 950)
62  Hong Kong HKSCS
63  Korean
64  EUC-KR
65  ISO-2022-KR
```

This list is pulled from Wikipedia's entry on common character encodings[2], so
it may need to be revised.
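
A receiver needs some way to turn an ENCODING number into an actual decoder. The sketch below maps a handful of entries from the table onto Python's built-in codec names. The mapping is deliberately partial, the `PYTHON_CODECS` and `decode_message` names are hypothetical, and treating an unmapped number as an error is an assumption; the protocol text does not say what a POST OFFICE should do in that case.

```python
# Partial, illustrative mapping from ENCODING numbers to Python codec names.
PYTHON_CODECS = {
    0: "utf_8",
    1: "ascii",
    3: "latin_1",     # ISO 8859-1
    7: "iso8859_5",
    35: "cp1251",     # Windows-1251
    44: "koi8_r",
    45: "koi8_u",
    51: "shift_jis",
    52: "euc_jp",
    60: "gb18030",
    64: "euc_kr",
}


def decode_message(encoding: int, payload: bytes) -> str:
    """Decode a postcard's message bytes according to its ENCODING field."""
    try:
        codec = PYTHON_CODECS[encoding]
    except KeyError:
        raise ValueError(f"unsupported ENCODING value: {encoding}") from None
    return payload.decode(codec)
```

Entries such as EBCDIC (2) are left out on purpose: EBCDIC has many code page variants, so picking one would be a guess.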

## END NOTES.

=> https://jvns.ca/blog/2017/02/07/mtu/ [1]: J. Evans. "How big can a packet get?"
=> https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings [2]: Wikipedia. "Character encoding"