The Information (12 page)

Authors: James Gleick

For Cawdrey the dictionary was a snapshot; he could not see past his moment in time. Samuel Johnson was more explicitly aware of the dictionary’s historical dimension. He justified his ambitious program in part as a means of bringing a wild thing under control—the wild thing being the language, “which, while it was employed in the cultivation of every species of literature, has itself been hitherto neglected; suffered to spread, under the direction of chance, into wild exuberance; resigned to the tyranny of time and fashion; and exposed to the corruptions of ignorance, and caprices of innovation.”

Not until the OED, though, did lexicography attempt to reveal the whole shape of a language across time. The OED becomes a historical panorama. The project gains poignancy if the electronic age is seen as a new age of orality, the word breaking free from the bonds of cold print. No publishing institution better embodies those bonds, but the OED, too, tries to throw them off. The editors feel they can no longer wait for a new word to appear in print, let alone in a respectably bound book, before they must take note. For tighty-whities (men’s underwear), new in 2007, they cite a typescript of North Carolina campus slang. For kitesurfer, they cite a posting to the Usenet newsgroup alt.kite and later a New Zealand newspaper found via an online database. Bits in the ether.

When Murray began work on the new dictionary, the idea was to find the words, and with them the signposts to their history. No one had any idea how many words were there to be found. By then the best and most comprehensive dictionary of English was American: Noah Webster’s, seventy thousand words. That was a baseline. Where were the rest to be discovered? For the first editors of what became the OED, it went almost without saying that the source, the wellspring, should be the literature of the language—particularly the books of distinction and quality. The dictionary’s first readers combed Milton and Shakespeare (still the single most quoted author, with more than thirty thousand references), Fielding and Swift, histories and sermons, philosophers and poets. Murray announced in a famous public appeal in 1879:

A thousand readers are wanted. The later sixteenth-century literature is very fairly done; yet here several books remain to be read. The seventeenth century, with so many more writers, naturally shows still more unexplored territory.

He considered the territory to be large but bounded. The founders of the dictionary explicitly meant to find every word, however many that would ultimately be. They planned a complete inventory. Why should they not? The number of books was unknown but not unlimited, and the number of words in those books was countable. The task seemed formidable but finite.

It no longer seems finite. Lexicographers are accepting the language’s boundlessness. They know by heart Murray’s famous remark: “The circle of the English language has a well-defined centre but no discernable circumference.” In the center are the words everyone knows. At the edges, where Murray placed slang and cant and scientific jargon and foreign border crossers, everyone’s sense of the language differs and no one’s can be called “standard.”

Murray called the center “well defined,” but infinitude and fuzziness can be seen there. The easiest, most common words—the words Cawdrey had no thought of including—require, in the OED, the most extensive entries. The entry for make alone would fill a book: it teases apart ninety-eight distinct senses of the verb, and some of these senses have a dozen or more subsenses. Samuel Johnson saw the problem with these words and settled on a solution: he threw up his hands.

My labor has likewise been much increased by a class of verbs too frequent in the English language, of which the signification is so loose and general, the use so vague and indeterminate, and the senses detorted so widely from the first idea, that it is hard to trace them through the maze of variation, to catch them on the brink of utter inanity, to circumscribe them by any limitations, or interpret them by any words of distinct and settled meaning; such are bear, break, come, cast, full, get, give, do, put, set, go, run, make, take, turn, throw. If of these the whole power is not accurately delivered, it must be remembered, that while our language is yet living, and variable by the caprice of every one that speaks it, these words are hourly shifting their relations, and can no more be ascertained in a dictionary, than a grove, in the agitation of a storm, can be accurately delineated from its picture in the water.

Johnson had a point. These are words that any speaker of English can press into new service at any time, on any occasion, alone or in combination, inventively or not, with hopes of being understood. In every revision, the OED’s entry for a word like make subdivides further and thus grows larger. The task is unbounded in an inward-facing direction.

The more obvious kind of unboundedness appears at the edges. Neologism never ceases. Words are coined by committee: transistor, Bell Laboratories, 1948. Or by wags: booboisie, H. L. Mencken, 1922. Most arise through spontaneous generation, organisms appearing in a petri dish, like blog (c. 1999). One batch of arrivals includes agroterrorism, bada-bing, bahookie (a body part), beer pong (a drinking game), bippy (as in, you bet your ———), chucklesome, cypherpunk, tuneage, and wonky. None are what Cawdrey would have seen as “hard, usual words,” and none are anywhere near Murray’s well-defined center, but they now belong to the common language. Even bada-bing: “Suggesting something happening suddenly, emphatically, or easily and predictably; ‘Just like that!’, ‘Presto!’ ” The historical citations begin with a 1965 audio recording of a comedy routine by Pat Cooper and continue with newspaper clippings, a television news transcript, and a line of dialogue from the first Godfather movie: “You’ve gotta get up close like this and bada-bing! you blow their brains all over your nice Ivy League suit.” The lexicographers also provide an etymology, an exquisite piece of guesswork: “Origin uncertain. Perh. imitative of the sound of a drum roll and cymbal clash. Perh. cf. Italian bada bene mark well.”

The English language no longer has such a thing as a geographic center, if it ever did. The universe of human discourse always has backwaters. The language spoken in one valley diverges from the language of the next valley, and so on. There are more valleys now than ever, even if the valleys are not so isolated. “We are listening to the language,” said Peter Gilliver, an OED lexicographer and resident historian. “When you are listening to the language by collecting pieces of paper, that’s fine, but now it’s as if we can hear everything said anywhere. Take an expatriate community living in a non-English-speaking part of the world, expatriates who live at Buenos Aires or something. Their English, the English that they speak to one another every day, is full of borrowings from local Spanish. And so they would regard those words as part of their idiolect, their personal vocabulary.” Only now they may also speak in chat rooms and on blogs. When they coin a word, anyone may hear. Then it may or may not become part of the language.

If there is an ultimate limit to the sensitivity of lexicographers’ ears, no one has yet found it. Spontaneous coinages can have an audience of one. They can be as ephemeral as atomic particles in a bubble chamber. But many neologisms require a level of shared cultural knowledge. Perhaps bada-bing would not truly have become part of twenty-first-century English had it not been for the common experience of viewers of a particular American television program (though it is not cited by the OED).

The whole word hoard—the lexis—constitutes a symbol set of the language. It is the fundamental symbol set, in one way: words are the first units of meaning any language recognizes. They are recognized universally. But in another way it is far from fundamental: as communication evolves, messages in a language can be broken down and composed and transmitted in much smaller sets of symbols: the alphabet; dots and dashes; drumbeats high and low. These symbol sets are discrete. The lexis is not. It is messier. It keeps on growing. Lexicography turns out to be a science poorly suited to exact measurement. English, the largest and most widely shared language, can be said very roughly to possess a number of units of meaning that approaches a million. Linguists have no special yardsticks of their own; when they try to quantify the pace of neologism, they tend to look to the dictionary for guidance, and even the best dictionary runs from that responsibility. The edges always blur. A clear line cannot be drawn between word and unword.

So we count as we can. Robert Cawdrey’s little book, making no pretense to completeness, contained a vocabulary of only 2,500. We possess now a more complete dictionary of English as it was circa 1600: the subset of the OED comprising words then current.

That vocabulary numbers 60,000 and keeps growing, because the discovery of sixteenth-century sources never ends. Even so, it is a tiny fraction of the words used four centuries later. The explanation for this explosive growth, from 60,000 to a million, is not simple. Much of what now needs naming did not yet exist, of course. And much of what existed was not recognized. There was no call for transistor in 1600, nor nanobacterium, nor webcam, nor fen-phen. Some of the growth comes from mitosis. The guitar divides into the electric and the acoustic; other words divide in reflection of delicate nuances (as of March 2007 the OED assigned a new entry to prevert as a form of pervert, taking the view that prevert was not just an error but a deliberately humorous effect). Other new words appear without any corresponding innovation in the world of real things. They crystallize in the solvent of universal information.

What, in the world, is a mondegreen? It is a misheard lyric, as when, for example, the Christian hymn is heard as “Lead on, O kinky turtle …” In sifting the evidence, the OED first cites a 1954 essay in Harper’s Magazine by Sylvia Wright: “What I shall hereafter call mondegreens, since no one else has thought up a word for them.”

She explained the idea and the word this way:

When I was a child, my mother used to read aloud to me from Percy’s Reliques, and one of my favorite poems began, as I remember:

Ye Highlands and ye Lowlands,
Oh, where hae ye been?
They hae slain the Earl Amurray,
And Lady Mondegreen.

There the word lay, for some time. A quarter-century later, William Safire discussed the word in a column about language in The New York Times Magazine. Fifteen years after that, Steven Pinker, in his book The Language Instinct, offered a brace of examples, from “A girl with colitis goes by” to “Gladly the cross-eyed bear,” and observed, “The interesting thing about mondegreens is that the mishearings are generally less plausible than the intended lyrics.”

But it was not books or magazines that gave the word its life; it was Internet sites, compiling mondegreens by the thousands. The OED recognized the word in June 2004.

A mondegreen is not a transistor, inherently modern. Its modernity is harder to explain. The ingredients—songs, words, and imperfect understanding—are all as old as civilization. Yet for mondegreens to arise in the culture, and for mondegreen to exist in the lexis, required something new: a modern level of linguistic self-consciousness and interconnectedness. People needed to mishear lyrics not just once, not just several times, but often enough to become aware of the mishearing as a thing worth discussing. They needed to have other such people with whom to share the recognition. Until the most modern times, mondegreens, like countless other cultural or psychological phenomena, simply did not need to be named. Songs themselves were not so common; not heard, anyway, on elevators and mobile phones. The word lyrics, meaning the words of a song, did not exist until the nineteenth century. The conditions for mondegreens took a long time to ripen. Similarly, the verb to gaslight now means “to manipulate a person by psychological means into questioning his or her own sanity”; it exists only because enough people saw the 1944 film of that title and could assume that their listeners had seen it, too. Might not the language Cawdrey spoke—which was, after all, the abounding and fertile language of Shakespeare—have found use for such a word? No matter: the technology for gaslight had not been invented. Nor had the technology for motion pictures.

The lexis is a measure of shared experience, which comes from interconnectedness. The number of users of the language forms only the first part of the equation: jumping in four centuries from 5 million English speakers to a billion. The driving factor is the number of connections between and among those speakers. A mathematician might say that messaging grows not geometrically, but combinatorially, which is much, much faster. “I think of it as a saucepan under which the temperature has been turned up,” Gilliver said. “Any word, because of the interconnectedness of the English-speaking world, can spring from the backwater. And they are still backwaters, but they have this instant connection to ordinary, everyday discourse.” Like the printing press, the telegraph, and the telephone before it, the Internet is transforming the language simply by transmitting information differently. What makes cyberspace different from all previous information technologies is its intermixing of scales from the largest to the smallest without prejudice, broadcasting to the millions, narrowcasting to groups, instant messaging one to one.
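
The claim that connections outrun speakers can be made concrete with a rough sketch. It rests on one assumption that is not in the text: that a "connection" means a possible two-person pairing, the simplest reading of "combinatorially." The function name is illustrative only; the speaker counts are the ones quoted above.

```python
from math import comb

def pairwise_links(speakers: int) -> int:
    """Count the distinct two-person pairings among `speakers` people."""
    return comb(speakers, 2)

# Speaker counts taken from the passage: roughly 5 million then, a billion now.
then = 5_000_000
now = 1_000_000_000

# How many times larger each quantity has become over four centuries.
growth_speakers = now / then
growth_links = pairwise_links(now) / pairwise_links(then)

print(f"speakers grew about {growth_speakers:,.0f}-fold")
print(f"possible pairings grew about {growth_links:,.0f}-fold")
```

On this reading the pairings grow roughly as the square of the number of speakers. Counting larger groups rather than pairs would grow faster still.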
