
Brian Kong (II) places second, North American competition in computational linguistics

On February 4, Brian Kong (II) was one of 1,118 students competing in the first round of the fourth annual North American Computational Linguistics Olympiad. As one of the top 100 scorers, he took part in the second round on March 10. Brian earned second place in the continent-wide competition, missing first place by only 0.01 points. He is now eligible to compete in the International Linguistics Olympiad in Sweden this July.

Computational linguistics deals with statistical or rule-based modeling of natural languages. This interdisciplinary field often draws upon the skills of linguists, computer scientists, experts in artificial intelligence, mathematicians, logicians and cognitive scientists.

“[In the test booklet] you’re presented with all these problems to solve that are based on a language you’ve never seen or heard of before,” Brian explains. “They’re usually languages spoken by very few people, or dialects that not many people use any more. Since you have no previous experience with the language, you have to rely on logic to solve the problems. It’s like a puzzle.

“All you have in front of you are maybe five words and five translations. Or they’ll tell you what language family the language is from, and if you’re lucky you might know something about the language family, but you’re basically just working with what’s on the page in front of you.”

To solve computational linguistics problems, a person must notice patterns, employ logic and exercise math skills. “A few times during the competition I built a system of equations using the words that I did know to figure out the ones that I didn’t,” Brian says. “You also have to use common sense; sometimes the booklet would provide information about the culture of the people who speak that language (whether they’re a farming-based community, for instance) and that might help you.”

The competition also included non-traditional language systems, such as the shorthand notation used by people working at a call center or swirls of computer-generated characters. No written language is recorded from the ancient Incan Empire, but scientists have discovered information recorded as knots on collections of string. The patterns of the knots, referred to as “khipu code,” remain a mystery linguists are still trying to unravel.

“I really enjoy the challenge of [computational linguistics],” Brian says. “Figuring out those problems is entertainment for me. Learning about other parts of the world (their culture, their languages, their history) is also really interesting.”