From 9d59c162fa5f2c0d49feffb94694d76864dfccf3 Mon Sep 17 00:00:00 2001
From: Case Duckworth
Date: Tue, 4 Jun 2024 12:29:05 -0500
Subject: Move protocol.txt to readme.txt

---
 protocol.txt | 134 -----------------------------------------------------------
 1 file changed, 134 deletions(-)
 delete mode 100644 protocol.txt

(limited to 'protocol.txt')

diff --git a/protocol.txt b/protocol.txt
deleted file mode 100644
index 323a0d8..0000000
--- a/protocol.txt
+++ /dev/null
@@ -1,134 +0,0 @@
-# Postcard protocol, version 1
-
-The POSTCARD PROTOCOL is a federated asynchronous communication protocol that
-aims to simulate the experience of sending postcards through the physical mail
-online. A NETWORK consists of a series of servers, known as POST OFFICES, which
-provide a limited number of user accounts (P.O. BOXES) that can send and receive
-short messages, called POSTCARDS. The protocol is implemented on top of UDP and
-tries to be relatively secure and mitigative against abuse.
-
-The postcard protocol is in ALPHA status and may change while we're working
-toward a version 1.0. Below, each part of the protocol is described.
-
-## POSTCARD.
-
-A POSTCARD is a datagram that fits within a single UDP packet. While the
-*actual* limit to UDP packet size is 2^16 bits[1], the number I've seen around
-the internet for a *safe* UDP packet size is more like 512 bytes. So that's
-what we're going with --- especially because a physical postcard is much closer
-in data size to 512 bytes than 2^16 or even 1500 (another limit mentioned in
-Julia Evans's blog referenced above).
-
-A POSTCARD's size is further restricted (though not by much) by a short header
-with the following fields:
-
-* 2b MAGIC NUMBER: the ASCII codepoints of the letters "PC" (for postcard)
-* 1b VERSION: The version of the postcard protocol being used
-* 1b ENCODING: The text encoding of the postcard's message (see Appendix A)
-* 1b TOBOX: The P.O. BOX of the message recipient
-* 1b FROMBOX: The P.O. BOX of the message sender
-
-This header is 6 bytes long, leaving 506 bytes for message text:
-
-```
-,___________________________________________________________________.
-|________________ POSTCARD PROTOCOL DATAGRAM SKETCH ________________|
-> "PC" (2 bytes) |version,encoding|to(1b)  from(1b)|message(506b)   |
-|0101000001000011|0000000100000000|ttttttttffffffff| . . . . . . . .|
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-```
-
-## P.O. BOX.
-
-## POST OFFICE.
-
-## NETWORK.
-
-### APPENDIX A. Encoding table.
-
-While UTF-8 is the answer for almost every question of encoding in the modern
-day, on a byte-restricted format like the postcard protocol I think it's unfair
-for non-English speakers to be forced to use double or even triple the bytes to
-send the same message. Therefore, the ENCODING field of a postcard's metadata
-corresponds to the message encoding, which allows senders to choose a more
-storage-friendly encoding for their messages.
-
-ENCODING is stored as a 1-byte number, allowing for 256 possible encodings. As
-of Postcard protocol v. 1, these are recognized:
-
-``` | table
-0 UTF-8
-1 ASCII
-2 EBCDIC
-3 ISO 8859-1 Western Europe
-4 ISO 8859-2 Western and Central Europe
-5 ISO 8859-3 Western Europe and South European
-6 ISO 8859-4 Western Europe and Baltic countries
-7 ISO 8859-5 Cyrillic alphabet
-8 ISO 8859-6 Arabic
-9 ISO 8859-7 Greek
-10 ISO 8859-8 Hebrew
-11 ISO 8859-9 Western Europe with amended Turkish character set
-12 ISO 8859-10 Western Europe with rationalized Nordic character set
-13 ISO 8859-11 Thai
-14 ISO 8859-13 Baltic languages plus Polish
-15 ISO 8859-14 Celtic languages
-16 ISO 8859-15 ISO 8859-1 with rationalizations
-17 ISO 8859-16 Central, Eastern and Southern European languages
-18 CP437
-19 CP720
-20 CP737
-21 CP850
-22 CP852
-23 CP855
-24 CP857
-25 CP858
-26 CP860
-27 CP861
-28 CP862
-29 CP863
-30 CP865
-31 CP866
-32 CP869
-33 CP872
-34 Windows-1250 for Central European languages that use Latin script
-35 Windows-1251 for Cyrillic alphabets
-36 Windows-1252 for Western languages
-37 Windows-1253 for Greek
-38 Windows-1254 for Turkish
-39 Windows-1255 for Hebrew
-40 Windows-1256 for Arabic
-41 Windows-1257 for Baltic languages
-42 Windows-1258 for Vietnamese
-43 Mac OS Roman
-44 KOI8-R
-45 KOI8-U
-46 KOI7
-47 MIK
-48 ISCII
-49 TSCII
-50 VISCII
-51 Shift JIS
-52 EUC-JP
-53 ISO-2022-JP
-54 JIS X 0213
-55 Shift_JIS-2004
-56 EUC-JIS-2004
-57 ISO-2022-JP-2004
-58 GB 2312
-59 GBK (Microsoft Code page 936)
-60 GB 18030
-61 Taiwan Big5 (a more famous variant is Microsoft Code page 950)
-62 Hong Kong HKSCS
-63 Korean
-64 EUC-KR
-65 ISO-2022-KR
-```
-
-This list is pulled from Wikipedia's entry on common character encodings[2], so
-it may need to be revised.
-
-## END NOTES.
-
-=> https://jvns.ca/blog/2017/02/07/mtu/ [1]: J. Evans. "How big can a packet get?"
-=> https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings [2]: Wikipedia. "Character encoding"
--
cgit 1.4.1-21-gabe81
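
Note on the header layout in the removed protocol.txt: the 6-byte header and the 506-byte message budget can be sketched in Python roughly as follows. This is only an illustration and is not part of the patch or the repository. The names `pack_postcard`, `unpack_postcard`, and `CODECS` are hypothetical, and the codec mapping covers just three of the ENCODING numbers from Appendix A, using Python's own codec names as an assumption about what each entry means.

```python
import struct

# Field order taken from the header list in protocol.txt:
# 2-byte magic "PC", then one byte each for version, encoding, to-box, from-box.
HEADER = struct.Struct("!2sBBBB")
MAGIC = b"PC"
VERSION = 1
MAX_DATAGRAM = 512                        # "safe" UDP size chosen by the spec
MAX_MESSAGE = MAX_DATAGRAM - HEADER.size  # 512 - 6 = 506 bytes

# Hypothetical, partial mapping from ENCODING numbers to Python codec names.
CODECS = {0: "utf-8", 1: "ascii", 3: "iso-8859-1"}


def pack_postcard(to_box, from_box, text, encoding=0):
    """Build one postcard datagram; reject messages that exceed the budget."""
    body = text.encode(CODECS[encoding])
    if len(body) > MAX_MESSAGE:
        raise ValueError(f"message is {len(body)} bytes, limit is {MAX_MESSAGE}")
    return HEADER.pack(MAGIC, VERSION, encoding, to_box, from_box) + body


def unpack_postcard(datagram):
    """Split a datagram into (version, encoding, to_box, from_box, text)."""
    if not HEADER.size <= len(datagram) <= MAX_DATAGRAM:
        raise ValueError("datagram size out of range")
    magic, version, encoding, to_box, from_box = HEADER.unpack_from(datagram)
    if magic != MAGIC:
        raise ValueError("not a postcard: bad magic number")
    text = datagram[HEADER.size:].decode(CODECS[encoding])
    return version, encoding, to_box, from_box, text
```

The motivation given in Appendix A shows up directly here: under these assumptions, `pack_postcard(1, 2, "é", encoding=3)` spends one message byte on the character, while the same text with `encoding=0` (UTF-8) spends two.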