If you’re like me, the universe gave you a great big present when Netflix negotiated with Paramount for the exclusive online streaming rights to every single episode of Star Trek ever. All flavors: TOS, TNG, DS9, VOY, and ENT (although why anyone would want to watch any Star Trek not featuring Captain Picard is beyond this WWJLPD-bracelet-wearing scientist). Even a cursory viewing of any Star Trek episode reveals that a lot has gone on in the scientific world between now and Captain Kirk, but since I am a psychologist who studies language, and builds computer models to try to understand language better, the science in Star Trek that is particularly interesting to me is that associated with Lt. Commander Data’s language use.
The Next Generation aired from 1987 to 1994, which was right smack in the middle of what I would consider one of the most important paradigmatic explosions that has maybe ever occurred in the modeling and understanding of language. That would be what I and other computational modelers refer to as the Parallel Distributed Processing (PDP) revolution: a new way of thinking about language (and other parts of cognition, like memory) that had at its root the goal of making models of language more true to the brain hardware on which language is implemented. The PDP revolution challenged many classical assumptions about how language is produced and comprehended, most of which do not have anything to do with how Data talks on Star Trek. Luckily (?) for us though, one of the most publicized and fundamental challenges the PDP revolution had for classical accounts of language pertains to the degree to which explicit “rules” are used in language production, and this is where the revolution intersects big time with one of Data’s most infuriating android-signifiers: his inability to produce contractions.
Data’s smarmy, evil twin brother Lore sums up Data’s deficiencies in this regard nicely in the Season 1 episode “Datalore”:
…Haven’t you noticed how easily I handle human speech? I use their contractions… such as isn’t or can’t. You say (mimicking formality) ‘is not’ or ‘can not’.
Since Data is, as far as the crew of the Enterprise knows (which one would think would be pretty far), the most sophisticated android intelligence anywhere in the galaxy, some poor jerk like me who has a 1MB computer model sitting on my laptop that could probably produce contractions can’t help but wonder WHY Data is deficient in this way. More specifically, what was going on in the science at the time “Datalore” was written that made the writers think that a state-of-the-art 24th century android wouldn’t be able to use contractions? After all, Star Trek has a reputation for Getting the Science Right as much as it can within the limitations of what is known in our barbaric centuries.
One obvious answer to questions about why Data has such a surprising inability to produce contractions is that there is no answer at all: maybe Data’s deficiency is simply a plot device to tell him apart from Lore, the TNG equivalent of Captain Kirk’s evil twin’s goatee. However, I think a more interesting answer to consider (whether it is true or not) is that Data’s difficulty with contraction formation reflects an extremely contentious and well-publicized debate that was raging at its hottest point in the late eighties between PDP modelers and proponents of classical, rule-based psycholinguistic theories: specifically, the debate about Past Tense formation.
Past tense formation in English is a problem that is structurally similar to contraction formation in many important respects. In both problems there is a large set of words with so-called “regular inflections” that can be formed by the application of a rule. In past tense formation, examples of these would be verbs like watch -> watched, type -> typed, jump -> jumped. That is, you can form the past tense of these “regular” words by just adding -ed to the end. There are similar regular contraction formations, for example did not -> didn’t, has not -> hasn’t, is not -> isn’t. In this case the “rule” is that you just get rid of the o in not and squish the words together. Similarly, in both past tense formation and contraction formation there are “irregular inflections,” which don’t follow the rules. For example, with the past tense go -> went (not goed), eat -> ate (not eated); or with contractions will not -> won’t (not willn’t), shall not -> shan’t (not shalln’t). Formally, both English past tense and contraction formation are examples of language use that is “quasi-regular,” that is, mostly rule-based with some exceptions. Because the two problems are so similar, a lot of the arguments between PDP modelers and classical psycholinguists about the past tense apply to contraction formation as well.
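To make “quasi-regular” concrete, here is a minimal Python sketch of the rule-plus-exceptions picture described above. It is purely illustrative: the function names and the tiny exception tables are my own hypothetical choices, not anyone’s actual model, and the past tense rule ignores spelling adjustments such as consonant doubling.

```python
# Illustrative sketch only: hypothetical names and toy exception tables.

# Irregular forms that have to be memorized rather than derived.
IRREGULAR_CONTRACTIONS = {"will": "won't", "shall": "shan't"}
IRREGULAR_PAST = {"go": "went", "eat": "ate"}


def contract_not(aux: str) -> str:
    """Contract '<aux> not': check the exception list, then apply the rule."""
    aux = aux.lower()
    if aux in IRREGULAR_CONTRACTIONS:
        return IRREGULAR_CONTRACTIONS[aux]
    # Regular rule: drop the "o" in "not" and squish the words together.
    return aux + "n't"


def past_tense(verb: str) -> str:
    """Form a past tense: check the exception list, then add -ed."""
    verb = verb.lower()
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Regular rule: add -ed (just -d if the verb already ends in "e").
    # Assumption: no other spelling changes are handled here.
    return verb + "d" if verb.endswith("e") else verb + "ed"


if __name__ == "__main__":
    for aux in ["did", "has", "is", "will", "shall"]:
        print(f"{aux} not -> {contract_not(aux)}")
    for verb in ["watch", "type", "jump", "go", "eat"]:
        print(f"{verb} -> {past_tense(verb)}")
```

Both functions have exactly the same shape, a lookup table of exceptions sitting in front of a single rule, which is the structural similarity between the two problems.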
The Past Tense Debate has been continuously raging now for about 20 years, but the issue that really set it all off during the PDP revolution was this: do people have, somewhere in their brain, some list of inflection rules that they use to form regular past tenses, supplemented by a list of exceptions that they use to form irregulars? The classical theory, which had been around forever and still isn’t really dead, would say “yes.” After all, a classical theorist might say, how else can you ask people the past tense of the made-up verb “frak” and have 99/100 people say “frakked” (*not real data) if there is no explicit rule floating around in there? Well, the PDP modelers said “I bet I can make a model that doesn’t have rules in it, but still displays rule-like behavior and can produce both regular and exception past tense inflections.” (They really did literally say almost exactly that: there is a paper by PDP guru granddaddy Jay McClelland called “Rules or connections in past tense inflections: What does the evidence rule out?”) You see, one of the most remarkable findings of the PDP revolution was that there exist computer systems which can produce rule-like behavior without any rules. A lot of people at the time found that very exciting, because it wasn’t (and still isn’t) clear how a “rule” would be represented by poor little neurons in the brain that can’t do anything besides send out little bursts of electrical activity. The fact that PDP models could produce rule-like behavior while using processing units that were a lot more like neurons than anything else available at the time was part of why the advent of PDP modeling was a “revolution,” and not just an “extremely complicated and CPU-intensive new methodological technique.”
Ok, so we’re kind of getting into PDP cheerleading “who cares” time here. How does any of this apply to Data’s verbal tic? Well, shortly after saying “I bet I can make a model that doesn’t have any rules in it, but still displays rule-like behavior,” some PDP guys attempted to do just such a thing. The model was revolutionary, and pretty good, but there were problems with it, which were immediately pounced upon by classical theorists, and that started one of the most hard-fought and long-lasting debates on any topic ever in the language processing literature. The most relevant result of this 1980s psycholinguistic battleground, for our purposes, is this: if some Star Trek writer had asked the 1987 equivalent of D. Girard to ask his psycholinguist friends what an android’s language capacity would be like, D. Girard and friends might have collectively summed up that a lot was understood about the computational properties of language use, but that quasi-regular domains like past tense inflection and contraction formation are incredibly difficult, and no one can agree to even the most insignificant degree on how they work. Thus, you end up with an android who can instantly calculate how long it will take to travel to an arbitrary point in the universe to an arbitrary level of precision, but still gets regularly beaten up by his evil twin who has figured out how to say “can’t.”
Fast forward to the present day. I like all the cool technological advances in Star Trek (has anyone NOT wished that transporter technology would hurry up and get invented while standing in a TSA line?) but sometimes what I like best about the show is when we 21st century apes have one up on the Starfleet flagship. The iPad, for example, is much better and smaller than its counterpart Starfleet PADD. I love the moments when current technology surpasses Starfleet technology, because Starfleet technology is what the writers at the time dreamed it would take hundreds of years for a whole galaxy of sophisticated scientists to come up with: the technology of the future, and we are living with it now! I love those moments even more when I get to feel like I am part of the scientific community that is moving us towards and beyond the Future. So, while it is kind of irritating that I really could produce simulations on my laptop that could easily outstrip Data’s contraction use, it’s pretty exciting to be able to do something that, as recently as 1987, the scientific community and Data’s creators thought would still not be possible in the 24th century.