14 Jul 2025
Q & As with Janet Pierrehumbert
Professor Janet B. Pierrehumbert is an experimental and computational linguist known for her research on prosody and intonation in languages. She has helped unravel the sound structure of English and other languages.

Article by Sandeep Ravindran reproduced courtesy of the Proceedings of the National Academy of Sciences of the United States of America (PNAS), vol. 122, no. 28
Janet B. Pierrehumbert is an experimental and computational linguist known for her research on prosody—the patterns of stress and rhythm—and intonation in languages. She has helped unravel the sound structure of English and other languages. She has also explored the structure of lexical systems—how words and the relationships among words are learned from experience, represented in the mind, and manifested in linguistic behavior. In her Inaugural Article, Pierrehumbert examines the mechanisms by which large language models (LLMs)—a type of artificial intelligence (AI) model—learn and generalize linguistic patterns (1). A professor of language modeling at the University of Oxford, Pierrehumbert was elected to the National Academy of Sciences in 2019.
PNAS: How did you become interested in linguistic generalization?
Pierrehumbert: I really like languages, and I like math; that’s what got me into generative linguistics. Generative linguistics seeks to explain exactly how finite experience with language allows people to infer an abstract system that can create and process arbitrarily many new forms. Originally, I worked on prosody and intonation, and I later became especially interested in derivational morphology, which creates new words from other words. An easy example is nominalization, in which we put -ity or -ness on the end of an adjective and it becomes an abstract noun referring to a quality or state. For example, serene becomes serenity. I did an experiment in 2006 that showed people were totally happy to create new words that way, even using adjectives that are completely made-up. The Inaugural Article is focused on English nominalization.
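To make the pattern concrete, here is a toy nominalizer (an illustration only, not the design of the 2006 experiment or of the article's analyses); the suffix inventory, the spelling adjustments, and the invented adjective "friquish" are assumptions made purely for the example.

```python
# Toy illustration of English nominalization: attach -ity or -ness to an
# adjective to form an abstract noun (serene -> serenity, dark -> darkness).
# The spelling adjustments below are deliberately simplified assumptions.

def nominalize(adjective: str, suffix: str) -> str:
    """Attach -ity or -ness to an adjective with a minimal spelling tweak."""
    if suffix == "ity":
        # adjectives ending in -e usually drop it: serene -> serenity
        base = adjective[:-1] if adjective.endswith("e") else adjective
        return base + "ity"
    # final -y becomes -i before -ness: happy -> happiness
    base = adjective[:-1] + "i" if adjective.endswith("y") else adjective
    return base + "ness"

print(nominalize("serene", "ity"))     # serenity
print(nominalize("happy", "ness"))     # happiness
print(nominalize("friquish", "ness"))  # friquishness: a made-up adjective,
                                       # the kind speakers readily nominalize
```

Speakers do far more than fill in a template, of course: the interesting question is which suffix they choose for a novel adjective, and that choice is what the experiments and the Inaugural Article probe.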
PNAS: How did this Inaugural Article come about?
Pierrehumbert: This paper would never have happened without Valentin Hofmann, my DPhil student and the first author. He did a lot of work on the detailed formulation of the problem as well as programming the calculations. When LLMs came out, many language scientists, including me, were very skeptical and suspected that they were just memorizing their gigantic training sets and spewing pieces back in a reasonably connected manner. But Valentin was more impressed by LLMs and thought we should really study them and diagnose the underlying principles for their success. So this was a project that brought together his drive to understand LLMs with my extensive background working on humans and psycholinguistic models. It also benefited from Valentin’s collaboration with Leonie Weissweiler.
PNAS: How did you explore the LLMs?
Pierrehumbert: We did a three-way comparison. The first component is data about humans; the second is models built by psycholinguists that summarize statistically significant patterns in human behavior; and the third is LLMs, notably GPT-J, which we used because that’s a public domain GPT. A very important thing about GPT-J is that its entire training set is available, so we could determine which exact words it had seen and how often. We also looked at GPT-4, which is a recent, top-performing LLM, but since that’s a proprietary model, we can’t get into the innards as well.
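As a rough sketch of why an open training set matters (not the paper's actual pipeline), one can tabulate exactly how often a base word and its derived forms occur in the corpus; the toy corpus, the tokenization, and the word list below are illustrative assumptions.

```python
# Sketch: tabulating whole-word frequencies in an openly available training
# corpus, so that a model's behavior can later be compared against the
# frequencies of the exact words it has seen.
import re
from collections import Counter

def count_forms(corpus_text: str, forms: list[str]) -> Counter:
    """Count whole-word occurrences of each listed form in the corpus."""
    tokens = re.findall(r"[a-z]+", corpus_text.lower())
    counts = Counter(tokens)
    return Counter({form: counts[form] for form in forms})

corpus = "The serene lake lay still. Serenity came slowly on serene mornings."
print(count_forms(corpus, ["serene", "serenity", "sereneness"]))
# Counter({'serene': 2, 'serenity': 1, 'sereneness': 0})
```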
PNAS: What did you find?
Pierrehumbert: We discovered several different types of frequency effects, coming both from the frequencies of individual words and from the frequencies of patterns. Since we can tabulate the frequencies in the training data, we can look at the GPT-J model and see if it’s reflecting all these frequencies. We also found that the models do generalize, and easily apply their knowledge to forms that are not in the training data. If you didn’t see that ability, the “intelligence” in “AI” would be a total misnomer. The LLMs actually did better than I expected. In previous papers looking at the ability of LLMs to generalize, people have been focused on rules, such as subject–verb agreement. They’ve been neglecting another big theme from the cognitive science literature, which is analogy, where you’re looking at similarities in relationships. An example in the article is moon:planet::planet:sun. Seeing Jupiter’s moons orbit around Jupiter, Copernican astronomers thought, “Well, the Earth’s relation to the sun seems similar to those moons’ relation to Jupiter, so maybe we’re orbiting around the sun.” That is analogical reasoning, and that is known to be used by children in their cognitive development. It’s known to be quite important in scientific reasoning, and it, furthermore, provides the best account of human generalization in the domain of derivational morphology. Our extremely detailed examination of frequency effects showed that LLMs effectively use analogical reasoning. They do not use rules. That makes the LLMs look pretty human-like.
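The difference between a categorical rule and analogy over stored exemplars can be sketched in a few lines (a toy contrast, not the psycholinguistic models or the LLM analyses used in the study); the exemplar list, the similarity measure, and the invented adjectives are assumptions for illustration.

```python
# Toy contrast: a categorical rule vs. analogical generalization for choosing
# -ity or -ness. The analogical version scores a novel adjective against
# stored exemplars and picks the suffix of the most similar ones.
from difflib import SequenceMatcher

# Stored exemplars: adjective -> suffix it takes
EXEMPLARS = {
    "serene": "ity", "pure": "ity", "obese": "ity", "scarce": "ity",
    "dark": "ness", "kind": "ness", "soft": "ness", "weird": "ness",
}

def rule_based(adj: str) -> str:
    """A made-up categorical rule: adjectives ending in -e take -ity."""
    return "ity" if adj.endswith("e") else "ness"

def analogical(adj: str) -> str:
    """Sum string similarity to exemplars of each suffix; pick the winner."""
    scores = {"ity": 0.0, "ness": 0.0}
    for known, suffix in EXEMPLARS.items():
        scores[suffix] += SequenceMatcher(None, adj, known).ratio()
    return max(scores, key=scores.get)

for novel in ("squeene", "blark"):  # invented adjectives
    print(novel, rule_based(novel), analogical(novel))
```

The point of the contrast is the mechanism rather than the particular outputs: the rule applies a single condition, while the analogical model's choice shifts with the similarity structure and frequencies of the exemplars it has stored.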
PNAS: How did LLMs differ from humans?
Pierrehumbert: We think of people as having a sort of mental dictionary. We are able to show that the LLMs do not. As a result, they lack the capability to spot known words within complex words. You might have never heard a very rare word like precancellation, but you judge it to be familiar because you recognize all the parts. LLMs absolutely do not do that. That means they have no metalevel word knowledge, which is quite a significant deviation from human intelligence and puts some question marks around the “I” in “AI.”
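The kind of part-based familiarity judgment described here can be sketched as a lexicon lookup with affix stripping (an illustration of the idea only, not a claim about how humans or the models implement it); the lexicon, the affix lists, and the invented word "flimbation" are assumptions.

```python
# Sketch: judging a never-seen word familiar because its parts are known,
# e.g. precancellation = pre- + cancellation. This is the metalevel word
# knowledge that the interview says the LLMs lack.
LEXICON = {"cancel", "cancellation", "serene", "dark"}
PREFIXES = ["pre", "re", "un"]
SUFFIXES = ["ation", "ness", "ity"]

def looks_familiar(word: str) -> bool:
    """True if the word, or the word minus a known affix, is in the lexicon."""
    if word in LEXICON:
        return True
    for p in PREFIXES:
        if word.startswith(p) and looks_familiar(word[len(p):]):
            return True
    for s in SUFFIXES:
        if word.endswith(s) and word[:-len(s)] in LEXICON:
            return True
    return False

print(looks_familiar("precancellation"))  # True: pre- + cancellation
print(looks_familiar("flimbation"))       # False: no known base inside
```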
PNAS: Will understanding the mechanisms underlying LLMs help improve them? Where do you see this research going in the future?
Pierrehumbert: Analogy turned out to be a much bigger factor than people realized in the capabilities of LLMs. One could find ways to exploit their analogical capabilities to strengthen their ability to reason about novel problems and situations. The failure to construct a mental lexicon similar to that of humans was very striking. It probably contributes to the fact that the LLMs need so much training data; they learn very inefficiently, compared to humans. For me, the most interesting direction is figuring out how to build models that can construct metalevels like humans do on the basis of experience. I think that a better understanding of the theory coming from the language sciences might be quite beneficial and result in faster progress.
Reference
1. V. Hofmann et al., Derivational morphology reveals analogical generalization in large language models. Proc. Natl. Acad. Sci. U.S.A. 122, e2423232122 (2025).