Studying Word Bursts Online

Computer scientist Jon Kleinberg is taking a virtual stroll down the information superhighway, surfing cyberspace for verbal megatrends.

Did you wince?

Those hopelessly passé terms were passably hip just a few years back. Then, due to overuse or a feckless public, they fell out of fashion. (Linguists suspect Al Gore of wearing out the superhighway quip.)

Kleinberg's research is really rather scientific. He uses algorithms to identify sudden jumps in the use of words, offering a glimpse into the mechanics of language evolution - what makes a word hot, or not.

"It's a fun tool to aim at things and see what happens," says Kleinberg, a Cornell University associate professor.

Search engines that scour Web pages for specific words work fairly well, although users must weed out many outdated or irrelevant results. Kleinberg's software is different: it looks at data without being given a keyword and reports back on significant topics.

For instance, the program scanned State of the Union addresses going back to 1790 (they're all online) and produced a list of "word bursts," words that jumped in frequency.

The program found "depression," "banks" and "recovery" on presidential lips in the '30s. In the late '40s and '50s, "atomic" was the explosive catchword.
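As a rough illustration of the idea of a "word burst," here is a minimal sketch in Python. It is not Kleinberg's algorithm, which the article does not describe in detail. It uses a simple stand-in: flag a word when its share of the words in one period is several times its share in all other periods. The function name `find_bursts`, the thresholds `min_count` and `ratio`, and the toy text in `sample` are all invented for this example.

```python
from collections import Counter


def find_bursts(docs_by_period, min_count=3, ratio=3.0):
    """Return {period: sorted list of words} that spike in that period.

    A word is flagged when its relative frequency in the period is at
    least `ratio` times its (smoothed) relative frequency elsewhere.
    This is a simple frequency-ratio heuristic, assumed for illustration.
    """
    # Count lowercase, whitespace-separated tokens per period.
    counts = {p: Counter(text.lower().split()) for p, text in docs_by_period.items()}
    totals = {p: sum(c.values()) for p, c in counts.items()}

    # Corpus-wide counts, used to derive "everywhere else" frequencies.
    overall = Counter()
    for c in counts.values():
        overall.update(c)
    grand_total = sum(totals.values())

    bursts = {}
    for period, c in counts.items():
        rest_total = grand_total - totals[period]
        hot = []
        for word, n in c.items():
            if n < min_count:
                continue  # ignore words too rare to call a trend
            rate_here = n / totals[period]
            # Add-one smoothing so a word unseen elsewhere has a nonzero rate.
            rate_elsewhere = (overall[word] - n + 1) / (rest_total + 1)
            if rate_here >= ratio * rate_elsewhere:
                hot.append(word)
        bursts[period] = sorted(hot)
    return bursts


# Toy stand-in text, NOT real State of the Union excerpts.
sample = {
    "1930s": "the banks and the depression and the recovery of the banks "
             "depression recovery banks depression recovery",
    "1950s": "the atomic age and the atomic bomb and the atomic power of the nation",
    "1990s": "the economy and the nation and the people of the nation",
}

if __name__ == "__main__":
    for period, words in find_bursts(sample).items():
        print(period, words)
```

Common words such as "the" appear at a similar rate in every period, so they are not flagged. Only words concentrated in one period pass the ratio test. A real system would need proper tokenization and a more careful statistical model.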

The speech scan was a test to show the software could come up with results that correlate to the real world.

The program is intended to look at data about which the searcher has no clue - say a mountain of unread e-mail or documents - and divulge a list of what topics were hot and when they started to heat up.

So far, the software detects trends only in retrospect. Kleinberg is working to make it more predictive.

Prabhakar Raghavan, chief technology officer of the software company Verity, has used Kleinberg's software to analyze Weblogs, online journals commonly known as "blogs."

Seeking emerging trends among cutting-edge bloggers, Raghavan looked for bursts of references and links to other people's Web sites. Raghavan found the software successfully identified such bursts, which could ultimately help advertisers target sales pitches.

To Web word watcher Paul McFedries, burst software sounds like a great idea. He's been using "wetware" - his wits - to trawl Internet databases for new uses of language.

McFedries posts the results on his site, The Word Spy.

"I kind of now have this sixth sense. I see a word I recognize as being new and then I check and see whether it's just something the writer made up or if it's something a lot of people are using," he says.

But, alas, new-word watching can pinch nerves in these days of closely protected Internet trademarks.

McFedries got into a spot of trouble when he noted that people have started using "google" - from the popular Google search engine - as a verb, meaning scoping out a subject or person, as in, "Naturally, I googled him before I agreed to go out with him."

Google lawyers took exception to using "google" as anything but a trademarked proper name. Harmony was restored after McFedries agreed to reference the Google trademark on his site.

Some of the words spotted by McFedries are tech-related, e.g., "ham," which means legitimate e-mail that gets lost in spam filters because it contains some spam-like phrases. Others are free-floating jargon, such as "induhvidual," meaning one who acts foolishly.

(Spam, the now-mainstream label for junk e-mail, is believed to derive from a Monty Python sketch that made fun of the eponymous canned meat.)

The Internet both creates and propagates new terminology, says "e-tymologist" McFedries. So, the "dead cat bounce," a phrase referring to a stock that dives, starts to rise then falls back, has left the trading floor for the world at large.

"Ping" has evolved from the sound of a sonar pulse to a way of checking to see if a computer is running, then to getting someone's attention online, as in, "I'll ping Frank to see if he's there."

Or there's bandwidth, which describes not just transmission but mental capacity. For instance, "I'm not sure he's up to the job. He's got awfully low bandwidth."

What makes a new word stick? Simple sells, clever crashes, says Allan Metcalf, author of "Predicting New Words: The Secrets of Their Success" and executive secretary of the American Dialect Society.

Of course, there's nothing new about creating words, he says.

One of the most dramatic language upheavals came after the Normans conquered England in 1066. Suddenly, all the upper classes were speaking French. With no one to lay down the law about proper English, the peasants had their merry way - dropping the Germanic inflections of Old English and developing easier-on-the-tongue Middle English.

What's different now is the Internet effect, capturing each mutation and revolutionizing the study of words.

"It's kind of like the Hubble telescope," Metcalf says. "The stars are what they were before, but now you can see them more clearly."

By Michelle Locke
