Autocento of the breakfast table

about this site

Introduction

Autocento of the breakfast table is a hypertextual exploration of the workings of revision across time. Somebody[citation needed] once said that every relationship we have is part of the same relationship; the same is true of authorship. As we write, as we continue writing across our lives, patterns thread themselves through our work: images, certain phrases, preoccupations. This project attempts to make those threads more apparent, using the technology of hypertext and the opposing ideas of the hapax legomenon and the cento, held in tension with each other.

I’m also an MFA candidate at Northern Arizona University. This is my thesis. Let me tell you about it.

Hapax legomenon, or You are special

Hapax legomenon (ἅπαξ λεγόμενον) is Greek for “something said only once.” It comes from the field of corpus linguistics, where it causes problems for translators of ancient texts. Because it only happens once in its corpus, a hapax legmonenon is an enigma: there’s only one context to guess its meaning from. This means that many hapax legomena remain untranslated, as in Mayan tablets, or are questionably translated, as in the Bible.

Given the way we use language every day, treading over the same words and thoughts in a way that is nonetheless comforting, and given the fact that a hapax legomenon is, by its definition, the rarest word in the place it appears, you might think that hapax legomena, as phenomena, are rare. You’d be wrong. In the Brown Corpus of American English Text, which comprises some fifty thousand words, about half are hapax legomena. In most large corpora, in fact, between forty and sixty per cent of the words occur only once, and another ten to fifteen per cent occur only twice, a fact that I imagine causes translators all sorts of grief.

This seeming paradox is reminiscent of another in biology, as summed up by this infographic I keep seeing around the Internet1: Really. I see it everywhere.

Apparently, the chances of you, dear Reader, being born is something like one in 102,685,000. The chances of me being born is something like one in 102,685,000. The chances of the guy you stood behind in line for your coffee this morning? His chance of being born was something like one in 102,685,000. The thing is, a number like one in 102,685,000 stops meaning so much when we take the number of times such a “rare” event occurs. There are about seven billion (or 7×1097 \times 10^{9}) people on Earth—and all of them have that same small chance of one in 102,685,000 of being born. And they all were.

It stops seeming so special after thinking about it.

Cento, or just like everyone else

Cento is Latin, stolen from the Greek κέντρόνη, which means “patchwork garment.” A cento is a poem composed completely from parts of other poems, a mash-up that makes up for its lack of originality in utterance with a novelty in arrangement.

If we apply the cento to biology, we can win back some of that uniqueness, we can resolve some of that paradox of the hapax legomenon. Sure, nothing is new under the sun, but it can be made new if we say it differently, or if we put it next to something it hasn’t met before. We can become hosts to the parties of our lives, and rub elbows with the same tired celebrities everyone’s rubbed elbows with, but make it different. Because we put the tables on roller skates. Because we told the joke this time with a Rabbi. Because we are special snowflakes, and it doesn’t matter that there’s more of us than there is sand on the beaches at Normandy. Because we are still all different somehow.

On n-grams

What we have so far: - A hapax legomenon technically refers only to one word in a corpus. - A cento technically refers to a poem with whole phrases taken from others, patchwork-style.

These concepts get more interesting as we play with their scopes. To do that, we need to take a look at the n-gram.

In linguistics and computational probability, an n-gram is a contiguous system of n items from a given sequence of text or speech. By looking at n-grams, linguists can look at deeper trends in language than with single words alone2. N-grams are also incredibly useful in natural language processing—for example, they’re how your phone can guess what you’re going to text your mom next3. They’re also the key to fully reconciling the hapax legomenon and the cento.

If the definition of hapax legomena is expanded to include n-grams of arbitrary lengths, including full utterances, complete poems, or the collected works of, say, Shakespeare, then we can say that all writing is a hapax legomenon, because no one else has said the same words in the same order. In short, everything written or in existence is individual. Everything is differentiated. Everything is an island.

If the definition of what comprises a cento is minimized to individual trigrams, bigrams, or even unigrams (individual words), or even parts of words, we arrive again at Solomon’s lament: that no writing is original; that every utterance has, in some scrambled way at least, been uttered before. To put it another way, nothing is individual. We’re stranded afloat on an ocean of language we did nothing to create, and the best we can hope to accomplish is to find some combination of flotsam and jetsam that hasn’t been put together too many times before.

This project, Autocento of the breakfast table, works within the tension caused by hapax legomena and centi, between the first and last half of the statement we are all unique, just like everyone else.

Process

In compiling this text, I’ve pulled from a few different projects:

as well as new poems, written quite recently. As I’ve compiled them into this project, I’ve linked them together based on common images or language, disregarding the order of their compositions. What I hope to have accomplished with this hypertext is an approximation of my self as it’s evolved, but all at one time. Ultimately, Autocento of the breakfast table is a long-exposure photograph of my mind.

A note on terminology

Autocento of the breakfast table comprises work of multiple genres, including prose, verse, tables, lists, and hybrid forms. Because of this, and because of my own personal hang-ups with terms like poem applying to works that aren’t verse (and even some that are4), piece applying to anything, really (it’s just annoying, in my opinion—a piece of what?), I’ve needed to find another word to refer to all the stuff in this project. While the terms “literary object” and “intertext,” à la Kristeva et al., more fully describe the things I’ve been writing and linking in this text, I’m worried that these terms are either too long or too esoteric for me to refer to them consistently when talking about my work. I believe I’ve found a solution in the term page, as in a page or leaf of a book, or a page on a website. After all, the term page is accurate as it refers to the objects herein–each one is a page—and it’s short and unassuming. But it’s probably pretty pretentious, too.

The inevitable creep of technology

Because this project lives online (welcome to the Internet!), I’ve used a fair amount of technology to get it there.

First, I typed all of the objects present into a human-readable markup format called Markdown by John Gruber, using a plain-text editor called Vim.5 Markdown is a plain-text format that uses unobtrusive mark-up to signal semantic meaning around a text. A text written with markup can then be passed to a compiler, such as John Gruber’s Markdown.pl script, to turn it into functioning HTML for viewing in a browser.

As an example, here’s the previous paragraph as I typed it:

First, I typed all of the objects present into a human-readable markup
            format called [Markdown][] by John Gruber, using a plain-text editor called
            [Vim][].[^5]  Markdown is a plain-text format that uses unobtrusive mark-up to
            signal semantic meaning around a text.  A text written with markup can then be
            passed to a compiler, such as John Gruber's original Markdown.pl script, to
            turn it into functioning HTML for viewing in a browser.
            
            [Markdown]: http://daringfireball.net/projects/markdown/
            [Vim]: http://www.vim.org
            
            [^5]: I could've used any text editor for the composition step, including
                  Notepad, but I personally like Vim for its extensibility, composability,
                  and honestly its colorschemes.

And here it is as a compiled HTML file:

<p>First, I typed all of the objects present into a human-readable markup format called <a href="http://daringfireball.net/projects/markdown/">Markdown</a> by John Gruber, using a plain-text editor called <a href="http://www.vim.org">Vim</a>.<a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a> Markdown is a plain-text format that uses unobtrusive mark-up to signal semantic meaning around a text. A text written with markup can then be passed to a compiler, such as John Gruber's original Markdown.pl script, to turn it into functioning HTML for viewing in a browser.</p>
            <section class="footnotes">
            <hr />
            <ol>
            <li id="fn1"><p>I could've used any text editor for the composition step, including Notepad, but I personally like Vim for its extensibility, composability, and honestly its colorschemes.<a href="#fnref1"></a></p></li>
            </ol>
            </section>

For these files, I opted to use John McFarlane’s pandoc over the original Markdown.pl compiler, because it’s more consistent with edge cases in formatting, and because it can compile the Markdown source into a wide variety of different formats, including DOCX, ODT, PDF, HTML, and others. I use an HTML template for pandoc to correctly typeset each object in the web browser. The compiled HTML pages are what you’re reading now.

Since typing pandoc [file].txt -t html5 --template=_template.html --filter=trunk/versify.exe --smart --mathml --section-divs -o [file].html over 130 times is highly tedious, I’ve written a GNU Makefile that automates the process. In addition to compiling the HTML files for this project, the Makefile also compiles each page’s backlinks (accessible through the φ link at the bottom of each page), and the indexes of first lines, common titles, and hapax legomena of this project.

Finally, this project needs to enter the realm of the Internet. To do this, I use Github, an online code-collaboration tool that uses the version-control system git under the hood. git was originally written to keep track of the source code of the Linux kernel.6 I use it to keep track of the revisions of the text files in Autocento of the breakfast table, which means that you, dear Reader, can explore the path of my revision even more deeply by viewing the Github repository for this project online.

For more information on the process I took while compiling Autocento of the breakfast table, see my Process page.

Motivation

Although git and the other tools I use were developed or are mostly used by programmers, engineers, or other kinds of scientists, they’re useful in creative writing as well for a few different reasons:

  1. Facilitation of revision. By using a VCS like git and plain text files, I can revise a poem (for example, “[And][]”) and keep both the current version and a [much older one][old-and]. This lets me hold onto every idea I’ve had, and “throw things away” without actually throwing them away. They’re still there, somewhere, in the source tree.
  2. Future proofness. By using a simple text editor to write out my files instead of a proprietary word processor, I’ve ensured that no matter what may happen to the stocks of Microsoft, Apple, or Google in the following hundred years, my words will stay accessible and editable. Also, I don’t know how to insert links in Word.
  3. Philosophy of intellectual property. I use open-source, or libre, tools like vim, pandoc, and make because information should be free. This is also the reason why I’m releasing Autocento of the breakfast table under a Creative Commons license.

Autocento of the breakfast table and you

Using this site

Since all of the objects in this project are linked, you can begin from, say, here and follow the links through everything. But if you find yourself lost as in a funhouse maze, looping around and around to the same stupid fountain at the entrance, here are a few tips:

  • The ξ link at the bottom of each page leads to a random article.
  • The φ link at the bottom of each page leads to its back-link page, which lists the titles of pages that link back to the page you were just on.
  • Finally, if you’re really desperate, the ◊ link sends you back to the cover page, where you can start over. The cover page links you to the table of contents, as well as the indexes of first lines, common titles, and hapax legomena.

Contact me

If you’d like to contact me about the state of this work, its history, or its future; or about my writing in general, email me at [case dot duckworth plus autocento at gmail dot com][].


  1. Which apparently, though not really surprisingly given the nature of the Internet, has its roots in this blog post.

  2. For more fun with n-grams, I recommend the curious reader to point their browsers to the Google Ngram Viewer, which searches “lots of books” from most of history that matters.

  3. For fun, try only typing with the suggested words for a while. At least for me, they start repeating “I’ll be a bar of the new York NY and I can be a bar of the new York NY and I can.”

  4. For more discussion of this subject, see “Ars poetica,” “How to read this,” “A manifesto of poetics,” “On formal poetry,” and The third section of “Statements: a fragment.”

  5. I could’ve used any text editor for the composition step, including Notepad, but I personally like Vim for its extensibility, composability, and honestly its colorschemes.

  6. As it happens, the week I’m writing this (6 April 2015) is git’s tenth anniversary. The folks at Atlassian have made an interactive timeline for the occasion, and Linux.com has an interesting interview with Linus Torvalds, git’s creator.