Understanding Languages with Physics and Math
(Inside Science) -- A husband-and-wife team of scientists from Poland has developed a computer model that simulates how vocabulary is exchanged between settlers and nomads. According to their results, published in the journal Physical Review E, nomadic groups are more likely to adopt words from settlers than the other way around.
The new model provides a tool that can help sociolinguists understand how migration and intercultural interactions influence the evolution of a language. While linguists tend to express careful skepticism about the significance of this and similar predictive computational models, these studies contribute to the growing interest in studying social behaviors with computational methods.
The messy science of language contact
"For this study, we developed a model using a method that has been around for 20 years or so," said Adam Lipowski, a physicist from Adam Mickiewicz University in Poznań, Poland. He is referring to the Naming Game, which simulates the exchange of information between individuals during face-to-face interactions.
The model divides individuals into two groups, each with its own language. During the simulation, the group that represents the settlers stays put, while individuals from the nomadic group move around. Over time, the model shows that the nomads are more likely to pick up new words than the settlers, even when everything else, such as the vocabulary size or the population size of the groups, is kept equal. This result came as a bit of a surprise to Lipowski and his co-author, Dorota Lipowska.
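To make the setup concrete, here is a minimal sketch of a Naming Game with one stationary group and one mobile group. It is not the Lipowskis' published model: the ring of sites, the single starting word per group ("S" and "N"), the random-walk rule, and all function and parameter names are assumptions chosen only for illustration.

```python
import random

def naming_game_step(speaker_words, hearer_words, rng):
    """One exchange in a minimal Naming Game between two word inventories (sets)."""
    word = rng.choice(sorted(speaker_words))  # speaker utters one known word
    if word in hearer_words:
        # Success: both agents keep only the agreed word.
        speaker_words.clear()
        speaker_words.add(word)
        hearer_words.clear()
        hearer_words.add(word)
        return True
    # Failure: the hearer learns the new word.
    hearer_words.add(word)
    return False

def simulate(n_sites=50, steps=20000, seed=0):
    """Illustrative setup (an assumption): one settler and one nomad per site on a ring."""
    rng = random.Random(seed)
    agents = []
    for site in range(n_sites):
        # Equal population sizes and equal vocabulary sizes (one word each).
        agents.append({"kind": "settler", "site": site, "words": {"S"}})
        agents.append({"kind": "nomad", "site": site, "words": {"N"}})
    for _ in range(steps):
        speaker = rng.choice(agents)
        if speaker["kind"] == "nomad":
            # Only nomads move: one random step left or right on the ring.
            speaker["site"] = (speaker["site"] + rng.choice((-1, 1))) % n_sites
        # Face-to-face interaction: the hearer must share the speaker's site.
        nearby = [a for a in agents
                  if a is not speaker and a["site"] == speaker["site"]]
        if nearby:
            hearer = rng.choice(nearby)
            naming_game_step(speaker["words"], hearer["words"], rng)
    return agents

def summarize(agents):
    """Count how many agents in each group know the other group's word."""
    nomads_with_s = sum(1 for a in agents
                        if a["kind"] == "nomad" and "S" in a["words"])
    settlers_with_n = sum(1 for a in agents
                          if a["kind"] == "settler" and "N" in a["words"])
    return {"nomads_knowing_settler_word": nomads_with_s,
            "settlers_knowing_nomad_word": settlers_with_n}
```

Calling summarize(simulate()) returns the two counts for a single run. Whether a toy version like this shows the asymmetry reported in the study depends on the assumed rules, so its output should not be read as a reproduction of the published result.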
"It is quite obvious that factors like politics or economy might induce a preference toward a certain language. That migration should be on the list of such factors is perhaps less obvious," wrote Lipowska in an email. She is a linguist at the same university and is interested in developing computational methods for linguistic studies.
Other experts recommend skepticism toward results obtained using these methods. According to Sara Thomason, a linguist from the University of Michigan in Ann Arbor, language contact is an extremely complicated process that involves many interconnected factors. A relatively straightforward model such as the Lipowskis' can sometimes produce elegant results, she said, but those results may not be meaningful when compared to reality.
"If you're going to explain why English is the main language of the United States, and not, say, Navajo, that's not what you're looking at," said Thomason. "What happened was you got a bunch of well-armed technologically superior invaders who conquered the land."
Because factors such as military power and social behavior often dominate language exchanges, it is difficult to find historical data that could verify the link between a group's mobility and its language, as predicted by the Lipowskis' model.
"[Real world] data that take into account both language decline and migration are very scarce," wrote Lipowska in an email, acknowledging the difficulty of testing the validity of their results. This is a fundamental difficulty in observational sciences, where experimental results are not as easily obtainable as in some other sciences -- it is much easier to replicate a projectile problem in physics than the 13th-century Mongol invasion of Eurasia. This is not unlike the recent debate in astronomy, where scientists argue over the value of theoretical studies when the predictions cannot be experimentally verified.
However, looking beyond the face value of the Lipowskis' results, their study represents part of a larger interdisciplinary effort to study extremely complex systems with computational methods.
Blind men and the elephant
Randy LaPolla, a linguist from Nanyang Technological University in Singapore, emphasized the importance of collaboration in such interdisciplinary studies. In an email, he noted several examples where previous studies with similar aims fell short because the researchers overlooked relevant earlier research -- something that could have been avoided if they had consulted a linguist. As in the parable of the blind men and the elephant, researchers who undertake these studies must coordinate with scientists across disciplines to see the whole picture. But for now, these computational methods tend to remain relatively narrow in scope.
"These models are very crude simplifications of reality," said Lipowski. "But very often computer simulations are the only tool to study these complex systems."
These models often borrow existing theories from physics and biology to study complex phenomena in many disciplines -- from how power outages surge through a network to how rumors spread among individuals.
"More linguists are getting more interested [in these models], especially the ones who have the mathematical skills and computational skills," said Thomason.
It took computer scientists many years to develop Deep Blue, the first ever chess-playing computer to beat a human world champion. Until these computational models are advanced enough to match the credibility of more traditional methods, Thomason recommends taking these results with a grain of salt.
"It would be great if one day these models could be sophisticated enough to tell us something that we don't already know about languages, but we're not there yet," she said.