The researchers examined models based on 43 artificial neural networks, a technology that consists of thousands or millions of interconnected nodes, similar to neurons in the brain. Each node processes data and feeds the results to other nodes. Some of the models the M.I.T. team looked at were optimized for next-word prediction, including the well-known Generative Pre-trained Transformer 2 (GPT-2), which has made an impression because of its ability to create humanlike text.





The researchers found that the activity of the neural network nodes was similar to brain activity in humans reading text or listening to stories. They also translated the neural networks' performance into predictions of human behavior, such as how long it would take a person to read a given word.





The work lays a foundation for studying higher-level brain tasks. "We view this as a sort of a template or a guideline of how one can take this entire approach of relating models to data," says Martin Schrimpf, a Ph.D. student in brain and cognitive sciences at M.I.T. and lead author of the paper.





The models that were best at guessing the next word were also best at predicting how a human brain would respond to the same tasks. This was especially true for processing single sentences and short paragraphs. With longer blocks of text, the models were significantly worse at predicting both upcoming words and human responses. None of the other tasks reflected what was going on in the brain. The authors argue this is strong evidence that next-word prediction, or something like it, plays a key role in understanding language. "It tells you that, basically, something like optimizing for predictive representation may be the shared objective for both biological systems and these in silico models," Fedorenko says.