
# Postcard protocol, version 1

The POSTCARD PROTOCOL is a federated asynchronous communication protocol that
aims to simulate, online, the experience of sending postcards through the
physical mail.  A NETWORK consists of a series of servers, known as POST OFFICES,
which provide a limited number of user accounts (P.O. BOXES) that can send and
receive short messages, called POSTCARDS.  The protocol is implemented on top of
UDP and tries to be relatively secure and to mitigate abuse.

The postcard protocol is in ALPHA status and may change while we're working
toward a version 1.0.  Each part of the protocol is described below.

## POSTCARD.

A POSTCARD is a datagram that fits within a single UDP packet.  While the
*actual* limit to UDP packet size is just under 2^16 bytes[1], the number I've
seen around the internet for a *safe* UDP packet size is more like 512 bytes.
So that's what we're going with, especially because a physical postcard is much
closer in data size to 512 bytes than to 2^16 or even 1500 (another limit
mentioned in Julia Evans's blog post referenced above).

A POSTCARD's size is further restricted (though not by much) by a short header
with the following fields:

* 2 bytes MAGIC NUMBER: the ASCII codepoints of the letters "PC" (for postcard)
* 1 byte VERSION: the version of the postcard protocol being used
* 1 byte ENCODING: the text encoding of the postcard's message (see Appendix A)
* 1 byte TOBOX: the P.O. BOX of the message recipient
* 1 byte FROMBOX: the P.O. BOX of the message sender

This header is 6 bytes long, leaving 506 bytes for message text:

```
,___________________________________________________________________.
|________________ POSTCARD PROTOCOL DATAGRAM SKETCH ________________|
> "PC" (2 bytes) |version,encoding|to(1b)  from(1b)|message(506b)   |
|0101000001000011|0000000100000000|ttttttttffffffff| . . . . . . . .|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
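
As a rough illustration of the layout above, the header can be packed and
parsed with Python's `struct` module.  This is a minimal sketch, not a reference
implementation: the names `pack_postcard` and `unpack_postcard`, and the choice
to reject anything over 512 bytes on receipt, are assumptions made for the
example and are not part of the protocol text.

```python
import struct

MAX_DATAGRAM = 512                         # "safe" UDP size chosen by the protocol
HEADER = struct.Struct("!2sBBBB")          # magic, version, encoding, tobox, frombox
MAX_MESSAGE = MAX_DATAGRAM - HEADER.size   # 512 - 6 = 506 bytes of message text


def pack_postcard(tobox, frombox, message, version=1, encoding=0):
    """Build a postcard datagram from header fields and already-encoded bytes."""
    if len(message) > MAX_MESSAGE:
        raise ValueError(f"message is {len(message)} bytes; limit is {MAX_MESSAGE}")
    # struct raises struct.error if a one-byte field is outside 0..255.
    return HEADER.pack(b"PC", version, encoding, tobox, frombox) + message


def unpack_postcard(datagram):
    """Split a datagram into (version, encoding, tobox, frombox, message)."""
    if not HEADER.size <= len(datagram) <= MAX_DATAGRAM:
        raise ValueError("datagram size is outside the postcard limits")
    magic, version, encoding, tobox, frombox = HEADER.unpack_from(datagram)
    if magic != b"PC":
        raise ValueError("missing 'PC' magic number")
    return version, encoding, tobox, frombox, datagram[HEADER.size:]


card = pack_postcard(tobox=7, frombox=3, message=b"Wish you were here")
print(card[:6])               # the six header bytes
print(unpack_postcard(card))
```

The message is passed in as bytes so that the caller decides how the text is
encoded; the header's ENCODING byte only records that choice.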

## P.O. BOX.

## POST OFFICE.

## NETWORK.

## APPENDIX A. Encoding table.

While UTF-8 is the answer for almost every question of encoding in the modern
day, on a byte-restricted format like the postcard protocol I think it's unfair
for non-English speakers to be forced to use double or even triple the bytes to
send the same message.  Therefore, the ENCODING field of a postcard's header
identifies the message encoding, which allows senders to choose a more
storage-friendly encoding for their messages.
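
To make the size difference concrete, here is a small sketch that measures the
same text under several encodings.  The sample sentence and the helper name
`encoded_sizes` are made up for the example; the codec names are the ones
Python's standard library uses for UTF-8, KOI8-R, and ISO 8859-5.

```python
def encoded_sizes(text, codecs):
    """Return the number of bytes `text` occupies under each codec."""
    return {name: len(text.encode(name)) for name in codecs}


# "Greetings from Moscow": 14 Cyrillic letters and 2 spaces.
sample = "Привет из Москвы"
sizes = encoded_sizes(sample, ["utf-8", "koi8_r", "iso8859_5"])
print(sizes)  # {'utf-8': 30, 'koi8_r': 16, 'iso8859_5': 16}
```

Each Cyrillic letter takes two bytes in UTF-8 but one byte in either
single-byte encoding, so a 506-byte message holds roughly twice as much
Russian text when one of them is used.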

ENCODING is stored as a 1-byte number, allowing for 256 possible encodings.  As
of Postcard protocol v. 1, these are recognized:

``` | table
0	UTF-8
1	ASCII
2	EBCDIC
3	ISO 8859-1 Western Europe
4	ISO 8859-2 Western and Central Europe
5	ISO 8859-3 Western Europe and South European
6	ISO 8859-4 Western Europe and Baltic countries
7	ISO 8859-5 Cyrillic alphabet
8	ISO 8859-6 Arabic
9	ISO 8859-7 Greek
10	ISO 8859-8 Hebrew
11	ISO 8859-9 Western Europe with amended Turkish character set
12	ISO 8859-10 Western Europe with rationalized Nordic character set
13	ISO 8859-11 Thai
14	ISO 8859-13 Baltic languages plus Polish
15	ISO 8859-14 Celtic languages
16	ISO 8859-15 ISO 8859-1 with rationalizations
17	ISO 8859-16 Central, Eastern and Southern European languages
18	CP437
19	CP720
20	CP737
21	CP850
22	CP852
23	CP855
24	CP857
25	CP858
26	CP860
27	CP861
28	CP862
29	CP863
30	CP865
31	CP866
32	CP869
33	CP872
34	Windows-1250 for Central European languages that use Latin script
35	Windows-1251 for Cyrillic alphabets
36	Windows-1252 for Western languages
37	Windows-1253 for Greek
38	Windows-1254 for Turkish
39	Windows-1255 for Hebrew
40	Windows-1256 for Arabic
41	Windows-1257 for Baltic languages
42	Windows-1258 for Vietnamese
43	Mac OS Roman
44	KOI8-R
45	KOI8-U
46	KOI7
47	MIK
48	ISCII
49	TSCII
50	VISCII
51	Shift JIS
52	EUC-JP
53	ISO-2022-JP
54	JIS X 0213
55	Shift_JIS-2004
56	EUC-JIS-2004
57	ISO-2022-JP-2004
58	GB 2312
59	GBK (Microsoft Code page 936)
60	GB 18030
61	Taiwan Big5 (a more famous variant is Microsoft Code page 950)
62	Hong Kong HKSCS
63	Korean
64	EUC-KR
65	ISO-2022-KR
```

This list is pulled from Wikipedia's entry on common character encodings[2], so
it may need to be revised.
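
A receiver has to turn the ENCODING number back into a decoder.  The sketch
below shows one way to do that in Python for a handful of rows from the table.
The mapping from table entries to Python codec names is an assumption made for
this example, it is deliberately partial, and `CODECS` and `decode_message` are
hypothetical names.

```python
# Partial, illustrative mapping from ENCODING numbers to Python codec names.
CODECS = {
    0: "utf-8",
    1: "ascii",
    3: "iso8859_1",
    7: "iso8859_5",
    9: "iso8859_7",
    35: "cp1251",
    36: "cp1252",
    44: "koi8_r",
    45: "koi8_u",
    51: "shift_jis",
    52: "euc_jp",
    60: "gb18030",
    64: "euc_kr",
}


def decode_message(encoding, payload):
    """Decode message bytes according to the postcard's ENCODING field."""
    try:
        codec = CODECS[encoding]
    except KeyError:
        raise ValueError(f"no decoder registered for ENCODING {encoding}") from None
    return payload.decode(codec)


print(decode_message(44, "Привет".encode("koi8_r")))
```

Entries missing from `CODECS` raise `ValueError` here; what a real POST OFFICE
should do with an unrecognized ENCODING value is not specified above.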

## END NOTES.

=> https://jvns.ca/blog/2017/02/07/mtu/ [1]: J. Evans. "How big can a packet get?"
=> https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings [2]: Wikipedia.  "Character encoding"