Worthwhile piece in the NYT on the OED today. An excerpt:
The job of a new-words editor felt very different precyberspace, Paton says: “New words weren’t proliferating at quite the rate they have done in the last 10 years. Not just the Internet, but text messaging and so on has created lots and lots of new vocabulary.” Much of the new vocabulary appears online long before it will make it into books. Take geek. It was not till 2003 that O.E.D.3 caught up with the main modern sense: “a person who is extremely devoted to and knowledgeable about computers or related technology.” Internet chitchat provides the earliest known reference, a posting to a Usenet newsgroup, net.jokes, on Feb. 20, 1984.
The scouring of the Internet for evidence — the use of cyberspace as a language lab — is being systematized in a program called the Oxford English Corpus. This is a giant body of text that begins in 2000 and now contains more than 1.5 billion words, from published material but also from Web sites, Weblogs, chat rooms, fanzines, corporate home pages and radio transcripts. The corpus sends its home-built Web crawler out in search of text, raw material to show how the language is really used.