Move protocol.txt to readme.txt

author: Case Duckworth 2024-06-04 12:29:05 -0500
committer: Case Duckworth 2024-06-04 12:29:05 -0500
commit: 9d59c162fa5f2c0d49feffb94694d76864dfccf3 (patch)
tree: b576efc07de70951e5394a4bbf492c40356295d4 /protocol.txt
parent: Add link to readme.txt (diff)
1 files changed, 0 insertions, 134 deletions
diff --git a/protocol.txt b/protocol.txt
deleted file mode 100644
index 323a0d8..0000000
--- a/protocol.txt
+++ /dev/null

@@ -1,134 +0,0 @@
-# Postcard protocol, version 1
-
-The POSTCARD PROTOCOL is a federated asynchronous communication protocol that
-aims to simulate the experience of sending postcards through the physical mail
-online.  A NETWORK consists of a series of servers, known as POST OFFICES, which
-provide a limited number of user accounts (P.O. BOXES) that can send and receive
-short messages, called POSTCARDS.  The protocol is implemented on top of UDP and
-tries to be relatively secure and to mitigate abuse.
-
-The postcard protocol is in ALPHA status and may change while we're working
-toward a version 1.0.  Below, each part of the protocol is described.
-
-## POSTCARD.
-
-A POSTCARD is a datagram that fits within a single UDP packet.  While the
-*actual* limit to UDP packet size is about 2^16 bytes[1], the number I've seen
-around the internet for a *safe* UDP packet size is more like 512 bytes.  So
-that's what we're going with --- especially because a physical postcard is much
-closer in data size to 512 bytes than to 2^16 or even 1500 bytes (another limit
-mentioned in Julia Evans's blog post cited above).
-
-A POSTCARD's size is further restricted (though not by much) by a short header
-with the following fields:
-
-* 2b MAGIC NUMBER: the ASCII codepoints of the letters "PC" (for postcard)
-* 1b VERSION: The version of the postcard protocol being used
-* 1b ENCODING: The text encoding of the postcard's message (see Appendix A)
-* 1b TOBOX: The P.O. BOX of the message recipient
-* 1b FROMBOX: The P.O. BOX of the message sender
-
-This header is 6 bytes long, leaving 506 bytes for message text:
-
-```
-,___________________________________________________________________.
-|________________ POSTCARD PROTOCOL DATAGRAM SKETCH ________________|
-> "PC" (2 bytes) |version,encoding|to(1b)  from(1b)|message(506b)   |
-|0101000001000011|0000000100000000|ttttttttffffffff| . . . . . . . .|
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-```
-
-## P.O. BOX.
-
-## POST OFFICE.
-
-## NETWORK.
-
-### APPENDIX A. Encoding table.
-
-While UTF-8 is the answer for almost every question of encoding in the modern
-day, on a byte-restricted format like the postcard protocol I think it's unfair
-for non-English speakers to be forced to use double or even triple the bytes to
-send the same message.  Therefore, the ENCODING field of a postcard's metadata
-corresponds to the message encoding, which allows senders to choose a more
-storage-friendly encoding for their messages.
-
-ENCODING is stored as a 1-byte number, allowing for 256 possible encodings.  As
-of Postcard protocol v. 1, these are recognized:
-
-``` | table
-0       UTF-8
-1       ASCII
-2       EBCDIC
-3       ISO 8859-1 Western Europe
-4       ISO 8859-2 Western and Central Europe
-5       ISO 8859-3 Western Europe and South European
-6       ISO 8859-4 Western Europe and Baltic countries
-7       ISO 8859-5 Cyrillic alphabet
-8       ISO 8859-6 Arabic
-9       ISO 8859-7 Greek
-10      ISO 8859-8 Hebrew
-11      ISO 8859-9 Western Europe with amended Turkish character set
-12      ISO 8859-10 Western Europe with rationalized Nordic character set
-13      ISO 8859-11 Thai
-14      ISO 8859-13 Baltic languages plus Polish
-15      ISO 8859-14 Celtic languages
-16      ISO 8859-15 ISO 8859-1 with rationalizations
-17      ISO 8859-16 Central, Eastern and Southern European languages
-18      CP437
-19      CP720
-20      CP737
-21      CP850
-22      CP852
-23      CP855
-24      CP857
-25      CP858
-26      CP860
-27      CP861
-28      CP862
-29      CP863
-30      CP865
-31      CP866
-32      CP869
-33      CP872
-34      Windows-1250 for Central European languages that use Latin script
-35      Windows-1251 for Cyrillic alphabets
-36      Windows-1252 for Western languages
-37      Windows-1253 for Greek
-38      Windows-1254 for Turkish
-39      Windows-1255 for Hebrew
-40      Windows-1256 for Arabic
-41      Windows-1257 for Baltic languages
-42      Windows-1258 for Vietnamese
-43      Mac OS Roman
-44      KOI8-R
-45      KOI8-U
-46      KOI7
-47      MIK
-48      ISCII
-49      TSCII
-50      VISCII
-51      Shift JIS
-52      EUC-JP
-53      ISO-2022-JP
-54      JIS X 0213
-55      Shift_JIS-2004
-56      EUC-JIS-2004
-57      ISO-2022-JP-2004
-58      GB 2312
-59      GBK (Microsoft Code page 936)
-60      GB 18030
-61      Taiwan Big5 (a more famous variant is Microsoft Code page 950)
-62      Hong Kong HKSCS
-63      Korean
-64      EUC-KR
-65      ISO-2022-KR
-```
-
-This list is pulled from Wikipedia's entry on common character encodings[2], so
-it may need to be revised.
-
-## END NOTES.
-
-=> https://jvns.ca/blog/2017/02/07/mtu/ [1]: J. Evans. "How big can a packet get?"
-=> https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings [2]: Wikipedia.  "Character encoding"
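
For readers of this commit, the 6-byte header described in the removed protocol.txt can be sketched in Python as below. This is only an illustration under stated assumptions, not code from the repository: the names `pack_postcard`, `parse_postcard`, and `CODECS` are hypothetical, the codec table covers just a handful of the Appendix A numbers, and the mapping from those numbers to Python codec names is my own guess at what the spec intends.

```python
import struct

MAGIC = b"PC"          # 2-byte magic number from the header description
VERSION = 1            # protocol version described in protocol.txt
MAX_DATAGRAM = 512     # "safe" UDP size chosen by the spec

# magic (2 bytes), version, encoding, tobox, frombox (1 byte each)
HEADER = struct.Struct("!2sBBBB")
MAX_MESSAGE = MAX_DATAGRAM - HEADER.size  # 506 bytes of message text

# Assumption: a partial mapping from Appendix A numbers to Python codec names.
CODECS = {0: "utf-8", 1: "ascii", 3: "iso8859-1", 7: "iso8859-5", 44: "koi8-r"}


def pack_postcard(text, tobox, frombox, encoding=0):
    """Build one postcard datagram: 6-byte header followed by encoded text."""
    if encoding not in CODECS:
        raise ValueError(f"encoding {encoding} not covered by this sketch")
    body = text.encode(CODECS[encoding])
    if len(body) > MAX_MESSAGE:
        raise ValueError(f"message is {len(body)} bytes, limit is {MAX_MESSAGE}")
    return HEADER.pack(MAGIC, VERSION, encoding, tobox, frombox) + body


def parse_postcard(datagram):
    """Split a datagram back into header fields and decoded message text."""
    if not HEADER.size <= len(datagram) <= MAX_DATAGRAM:
        raise ValueError("datagram has an invalid length")
    magic, version, encoding, tobox, frombox = HEADER.unpack_from(datagram)
    if magic != MAGIC:
        raise ValueError("not a postcard: bad magic number")
    if version != VERSION:
        raise ValueError(f"unsupported protocol version {version}")
    if encoding not in CODECS:
        raise ValueError(f"encoding {encoding} not covered by this sketch")
    text = datagram[HEADER.size:].decode(CODECS[encoding])
    return {"version": version, "encoding": encoding,
            "tobox": tobox, "frombox": frombox, "text": text}


if __name__ == "__main__":
    card = pack_postcard("Привет", tobox=7, frombox=42, encoding=7)
    print(len(card), parse_postcard(card))
```

Sent as ISO 8859-5 (encoding 7), the six-letter Russian greeting occupies 6 bytes of message text, while the same text in UTF-8 (encoding 0) would occupy 12. That is the saving Appendix A is arguing for.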