A compulsive early adopter of high-tech gadgetry once invited me over to check out his latest acquisition – an Amazon Echo. “Go ahead, ask her something!” Reluctant to give Jeff Bezos my native voice imprint, I asked in my fractured Wienerisch, “Alexa, translate ‘Mistkübel’ into English!” Her Austro-hostile response, uttered snootily in high-Piefkenesisch, was an embarrassed: “I’m sorry, but I cannot locate any relevant data that answers your query.” Hmm … okay, then, “Alexa, translate ‘Mülleimer’ into English.” Suddenly, the Berlin matron became Miss Iowa and issued her answer in a cheery mid-Western twang, “treeyash keeyan.”
Today’s voice recognition, translation, speech generation, dictation software, auto-suggested texts and “chatbot” technologies owe their imperfect existence to decades of research in the field of natural language processing (NLP), whose core goal is creating a machine that can parse and process language as well as (or even better than) a human.
Treading the tightrope between linguistics and computer science (and, to an increasing degree, neuroscience), NLP has evolved exponentially alongside advances in computing power and the accumulation of digital data. What was thought inconceivable at the end of the last century is now a standard feature on every smart phone and PC. Still far from perfect, its applications have already had a significant impact on globalization, politics, civil discourse and business. What does NLP bode for the future, and will its benefits outweigh its pitfalls? One thing is clear – NLP is big business: The global market for NLP-related technologies is projected to reach €31.4 billion by 2025 and is growing by 21.5 percent annually.
Colorless Green Ideas
In 1950, the British mathematician, cryptologist and computing pioneer Alan Turing predicted that a human observer eventually would not be able to differentiate replies made by a human from those of an artificially intelligent machine. Passing this “Turing Test” thereafter became a goal for computer scientists trying to wrest intelligence out of silicon chips.
Only a few years later, pioneering linguist Noam Chomsky revolutionized his field by positing that the basic syntax of grammar is a universal, innate human characteristic. This theory of transformational generative grammar tipped the scales of the nature vs. nurture debate, raising doubts that language is acquired solely through environmental learning.
To demonstrate his theory, he compared two semantically nonsensical sentences sharing the same five words: “Colorless green ideas sleep furiously” and “Furiously sleep ideas green colorless.” Neither sentence had likely been expressed before and thus could not have been learned from experience. Yet only the first has proper, comprehensible syntax. How would someone know that, unless there is an innate instinct for grammar? Without one, we might all produce sentences like: “I will send you also for this something.”
While Chomsky did not intend his theories to be used to engineer products, computer scientists eagerly took on the task in a race to “pass” the Turing Test. If only the rules of syntax could be organized into logically structured “trees,” computers armed with a thorough lexicon could be instructed how to “learn” a language. This launched the era of “supervised learning” computational models, or symbolic NLP.
Even as Chomsky was writing his thesis, the “Georgetown-IBM experiment” had successfully programmed a computer to translate about 60 Russian phrases into English, leading its authors to predict, falsely, that machine translation would be solved within a few years. By the mid-1960s, the “chatterbot” era made headlines with ELIZA, a computer program that simulated a psychotherapist by issuing contextual, echoic responses to a patient’s input. She was outed as a sort of counting-horse act when it was discovered that, whenever she did not understand her patient’s plaints, she gave ambiguous stock responses like “I see” and “please go on.” Actually, not so different from many real therapists.
In the 1980s and ’90s, computational linguists formalized an architecture for classifying words and phrases with the head-driven phrase structure grammar (HPSG). This enabled the development of applications such as the early speech-recognition program DragonDictate; Microsoft Office’s noodgey Office Assistant, a.k.a. “Clippy” (“It looks like you’re writing a letter”); and the emerging web-indexing search engines.
Limited to such grammar-based parsing models, computer programmers began hitting a wall. Supervised learning required too much advance coding to interpret text (the phrase “Fruit flies like a banana” can be parsed five different ways).
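That ambiguity is easy to see computationally. The Python sketch below uses the classic CYK chart-parsing algorithm with a deliberately tiny, hypothetical grammar – one that captures just two of the sentence’s readings (fruit-flies that are fond of bananas, and fruit that flies the way a banana does) – to count how many parse trees a sentence admits:

```python
from collections import defaultdict

# A toy grammar in Chomsky normal form (illustrative only; a grammar
# exposing all five readings would need many more rules).
lexical = {
    "fruit":  {"Adj", "NP"},   # modifier ("fruit flies") or bare noun phrase
    "flies":  {"N", "V"},      # insect or verb
    "like":   {"V", "P"},      # verb or preposition
    "a":      {"Det"},
    "banana": {"N"},
}
binary = [
    ("S",  "NP",  "VP"),
    ("NP", "Adj", "N"),
    ("NP", "Det", "N"),
    ("VP", "V",   "NP"),
    ("VP", "V",   "PP"),
    ("PP", "P",   "NP"),
]

def count_parses(words):
    """CYK: chart[i][j][A] = number of ways symbol A derives words[i:j]."""
    n = len(words)
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for a in lexical.get(w, ()):
            chart[i][i + 1][a] += 1
    for span in range(2, n + 1):          # grow spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # every split point
                for a, b, c in binary:
                    chart[i][j][a] += chart[i][k][b] * chart[k][j][c]
    return chart[0][n]["S"]

print(count_parses("fruit flies like a banana".split()))  # → 2
```

Even this miniature grammar yields two complete trees for the same five words – and every added rule multiplies the ambiguity a supervised system must be hand-coded to resolve.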
Machines had difficulty hearing differences between homophones (words or phrases that sound the same but are spelled differently) and accents, as well as deciphering linguistic idiosyncrasies – e.g., why is it “mice-infested” but not “rats-infested,” and why is a “hot dog” not a panting pooch? How can a machine tell “I scream” from “ice cream”? One wonders how many computers have overheated attempting to parse James Joyce’s “Love loves to love love.”
In his 1994 book, The Language Instinct, Chomsky-acolyte Steven Pinker doubled down by insisting that humans must have a “language organ” and “grammar genes” that provide an innate sixth sense for language. Erroneously doubting that a computer would be able to master speech recognition anytime soon, Pinker suggested that “there are only a few ways it could be solved in principle. If so, the way the brain does it may offer hints as to the best way to build a machine to do it, and how a successful machine does it may suggest hypotheses about how the brain does it.”
Enter the era of “unsupervised learning” and neural-network computing.
Learning how to learn
Humans’ hardwired sense of grammar gives them the power to infer meaning from language without having to learn it all experientially. But with supervised learning models, computers need to be given an encyclopedic instruction manual. However, as computing power exponentially increased (as Moore’s law predicted it would), computers started gaining an edge: the ability to process and memorize massive quantities of data that were becoming increasingly available in digital, machine-readable formats during the internet era.
Unsupervised learning models rely upon statistical probability to “teach” computers how to parse syntax and derive meaning. No longer needing to input an entire HPSG syntax manual, programmers could “simply” feed the machine a few stochastic algorithms that allow it to draw inferences from massive amounts of data. Google’s current machine-translation teams no longer employ linguists to help computers parse language. Instead, their data scientists program AI agents to track statistical parallels between aligned bilingual texts, yielding astoundingly accurate results.
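The statistical approach can be illustrated with a toy experiment. The Python sketch below (using a made-up three-sentence “parallel corpus” – real systems ingest millions of sentence pairs) guesses word translations purely from how often words co-occur across aligned sentences, with no grammar rules at all:

```python
from collections import Counter

# Hypothetical German–English aligned sentence pairs.
pairs = [
    ("das haus ist klein", "the house is small"),
    ("das haus ist gross", "the house is big"),
    ("der hund ist klein", "the dog is small"),
]

cooc = Counter()                      # (german, english) co-occurrence counts
de_freq, en_freq = Counter(), Counter()
for de, en in pairs:
    de_words, en_words = de.split(), en.split()
    de_freq.update(de_words)
    en_freq.update(en_words)
    for d in de_words:
        for e in en_words:
            cooc[d, e] += 1

def translate(d):
    # Dice-like association score: squared co-occurrence, normalized by
    # both words' overall frequencies, so ubiquitous words like "the"
    # and "is" do not win just by showing up everywhere.
    candidates = {e for (dd, e) in cooc if dd == d}
    return max(candidates,
               key=lambda e: cooc[d, e] ** 2 / (de_freq[d] * en_freq[e]))

print(translate("haus"))   # → 'house'
print(translate("klein"))  # → 'small'
print(translate("hund"))   # → 'dog'
```

Three sentence pairs already suffice to separate “haus/house” from the function words surrounding it – a glimpse of why scale, not linguistics, became the engine of machine translation.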
Today’s systems are complex “deep-learning neural networks” that (only somewhat metaphorically) emulate the brain’s biological neurons and can likewise process data nonlinearly. They combine supervised and statistical models: after autonomously learning features from only a small quantity of task-specific training data, they train themselves to classify ever-larger datasets. By employing “word embeddings,” they can predict semantic meaning from statistical probability (“You shall know a word by the company it keeps,” said the pre-Chomskian British linguist J.R. Firth).
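Word embeddings put Firth’s dictum directly into practice: words that keep similar company get similar vectors. A minimal sketch, assuming nothing beyond a four-sentence toy corpus (real embeddings are trained on billions of words and learned by neural networks rather than counted):

```python
from collections import Counter
import math

# Tiny illustrative corpus; real systems train on billions of tokens.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "stocks rose on strong earnings",
    "stocks fell on weak earnings",
]

def context_vectors(sentences, window=2):
    """Represent each word by the counts of its neighboring words."""
    vecs = {}
    for s in sentences:
        words = s.split()
        for i, w in enumerate(words):
            ctx = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
            vecs.setdefault(w, Counter()).update(ctx)
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in set(u) | set(v))
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm

vecs = context_vectors(corpus)
print(cosine(vecs["cat"], vecs["dog"]))     # high: they keep the same company
print(cosine(vecs["cat"], vecs["stocks"]))  # low: different company entirely
```

“Cat” and “dog” end up close together and “stocks” far away – no one told the program what any of the words mean.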
Linguist Dr. Friedrich Neubarth, of the Austrian Research Institute for Artificial Intelligence (OFAI), recalls that when he started there in 1999, “mainstream NLP built upon the formalizations provided by linguistics. But the core of current technology plays with Big Data without even attempting to implement any form of linguistic understanding.”
Such is the case for Dr. Dadhichi Shukla, a senior data scientist at the Viennese company STRG, which develops AI technology for major Austrian media portals: “I’m not a native German speaker or a linguist, but I am working with German-language media data. The latest machine-learning models allow us to understand NLP mathematically, if not linguistically.”
As neural networks advance, so do NLP’s applications, as we can see already with today’s machine translation, voice recognition, speech synthesis and the increasing prevalence of online chatbots. The next generation of NLP technology is already emerging.
Among Neubarth’s OFAI projects is a prototype translation machine for converting standard German into phonetic Viennese dialect (just type in a phrase and press the “Translate now! / Jedsd übasetsn!” button – its speech synthesizer will even “ausschbrechn” (pronounce) an audible result). Because data in Wienerisch are scarce compared to Hochdeutsch (standard German), the developers applied deep-learning techniques to compensate.
Other tools now in development can digest complex legal texts into lay language. So that their commercial clients can provide better and faster customer service, the Innsbruck-based startup DeepOpinion offers NLP models that automate “tedious text analysis tasks” by semantically parsing customer inquiries and monitoring social-media commentary to determine specific meaning and emotional attitude.
“Robot” journalists can synthesize boilerplate reports on stats-heavy subjects like weather, sports and finance. The 2018 novel 1 the Road was “written” by a computerized Kerouac (whose digital Dean Moriarty yields lovely nuggets of Beat prose such as, “The time was one minute after midnight and the wind was still standing on the counter and the little patch of straw was still still and the street was open.”)
In 2018, Google demoed its Android smart phone “Assistant” by having it phone a hair salon to make an appointment. The virtual Assistant’s voice interjected several gap-filling “uhs” and “mmm-hmmms” that helped fool the unwitting hairdresser. “It’s actually an easy conversation to program,” claims Shukla, “but even minor errors actually make an AI agent or chatbot seem more naturally human and can convince an observer.”
Even computer programmers are benefitting from NLP translation technology. They can now describe a function in plain English and the computer will translate it into complex code strings and mathematical algorithms. “What was considered sci-fi ten years ago, such as Tony Stark/Iron Man speaking commands to his computer, is now cutting-edge technology,” says Shukla.
He acknowledges there are “many areas where NLP needs improvement – it is still taking baby steps. It may take a few more years or another generation, but it’s not so far in the future that we won’t be able to see it.”
Neubarth is less optimistic: “Perhaps we should have taken those old science-fiction novels and movies more seriously!” Advances in artificial intelligence raise many fears about reaching a so-called “singularity” – the point beyond which computers can no longer be controlled by man.
“Discussions about AI most often refer to hopes, expectations and fears, but the expectations are always way beyond technical reality,” insists Neubarth, yet “we systematically overlook the real impacts, such as how the feedback loop of data collection and regurgitation perpetuates unwanted existing social biases.” Machines learning by harvesting dialogue data from the digital universe “might produce unwanted and unacceptable racist, sexist, and extremist speech.”
“Misuse of language technology… might be used by a small class of people to enforce their will on our society,” warns Neubarth. Privacy and free-speech advocates have already sounded alarms about government surveillance of global digital communications; social-media platforms using algorithms to block “unacceptable” posts and to create feedback-loops of opinion; and machine “trollbots” exploiting such algorithms to auto-post politically manipulative, divisive comments.
Shukla says that NLP developers can no longer micromanage every detail of their machines’ processes in real time once they are set in motion, “but what humans can do is log the data and try to evaluate the computer’s progress,” especially during its initial training stages. “If the data go beyond certain defined validation and test parameters, we can ascertain the accuracy and progress of its learning,” like a teacher grading a student.
But if that virtual pupil misbehaves, does it get sent to the headmaster’s office? Neubarth feels “the question is rather, to what extent do we want machines to populate our business, educational or domestic life?” Ask your Google Assistant to ask Alexa to ask Siri – the answer might be, “Your haircut has been scheduled for 2:00 pm this Wednesday.”