Why the human brain may hold the key to cheaper, smarter AI
Artificial intelligence companies spend vast sums of money training machines to predict the next word, but neuroscientists now say the human brain may save enormous amounts of energy by knowing when not to bother.
A recent study in Nature Neuroscience suggests the brain processes language differently from today’s large language models (LLMs). Instead of constantly trying to predict every possible upcoming word, researchers found the brain appears to ease off on prediction at the boundaries between sentences and major phrases. Scientists say the strategy may help humans process language far more efficiently than modern AI systems.
“Our core finding is that the brain sometimes sacrifices next word prediction, especially when a word starts a new sentence or a major phrase,” Nai Ding, a neuroscience professor at Zhejiang University and one of the study’s authors, told IBM Think in an interview. “The brain is optimized not just for predicting the next word, a feature it can use, but also for compressing the memory of linguistic representations.”
The findings come as companies roll out generative AI across customer service, cybersecurity, software development and research, even as they grapple with soaring computing costs, hallucinations and “context drift,” where AI systems lose track of instructions during long conversations or complex tasks.
Ding and other researchers say the human brain may avoid this problem by compressing information into larger conceptual structures, rather than constantly recalculating every possible relationship between words.
Transformer systems, the AI architecture behind tools like ChatGPT that learn by finding relationships between words and ideas across huge amounts of data, gain power from their ability to connect information across broad contexts, though the process remains computationally expensive, said Nikolaus Kriegeskorte, a professor of psychology and neuroscience at Columbia University.
“The self-attention in transformers provides a powerful way to relate a set of representations,” Kriegeskorte told IBM Think in an interview. “But the computational costs of relating each element to each other element are high.”
Scientists have long known that human working memory remains sharply constrained. Most people can actively hold only a small number of items in mind at once. Yet humans still navigate conversations, stories and abstract reasoning with remarkable speed and flexibility.
Researchers behind the study argue that the brain compensates by compressing completed language units into higher-order conceptual representations. Instead of preserving every individual word with equal importance, the brain may summarize completed linguistic structures into a broader meaning.
“Compressing a sentence, turning many individual word representations into a single higher-level constituent representation, takes computational resources,” Ding said. “Because of that cost, the brain can deprioritize predicting the next word at constituent boundaries.”
Researchers say today’s AI systems handle language very differently. Current LLMs consume vast computing power partly because they try to evaluate too many possible relationships at once, said Stanislaw Wozniak, a research scientist at IBM Research.
“LLMs tend to extensively analyze all possible interactions over the entire context, which is the main driver for their intensive computational resources usage,” Wozniak told IBM Think in an interview. “Constraining this based on some criteria, for example, clear constituent boundaries, can thus improve efficiency.”
Researchers believe brain-inspired architectures could help models decide which information deserves continued attention and which information can collapse into compressed abstractions.
“If such a hierarchical representation can be built, we believe the model size can be significantly reduced,” Ding said.
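To give a rough sense of the scale involved, the short Python sketch below counts how many pairwise comparisons full self-attention makes over a passage, and how many remain under a toy constraint in which words are compared only within their own constituent while one compressed summary per constituent is compared with the other summaries. The scheme, the function names and the example sizes are illustrative assumptions; they are not the method used in the study or in any particular model.

```python
def full_attention_pairs(n_tokens: int) -> int:
    """Full self-attention relates every token to every token: n * n comparisons."""
    return n_tokens * n_tokens


def constrained_pairs(constituent_lengths: list[int]) -> int:
    """Toy boundary-constrained scheme (an assumption, not the study's method).

    Tokens are compared only within their own constituent, and each
    constituent contributes one summary that is compared with every summary.
    """
    within = sum(k * k for k in constituent_lengths)
    n_constituents = len(constituent_lengths)
    across = n_constituents * n_constituents
    return within + across


# Hypothetical passage: 16 constituents of 8 words each (128 words in total).
lengths = [8] * 16
n_tokens = sum(lengths)

full = full_attention_pairs(n_tokens)
constrained = constrained_pairs(lengths)

print(f"tokens: {n_tokens}")
print(f"full attention comparisons: {full}")
print(f"boundary-constrained comparisons: {constrained}")
```

In this made-up case the count falls from 16,384 comparisons to 1,280. Real systems would differ in many ways, but the arithmetic shows why limiting comparisons at constituent boundaries could matter as contexts grow longer.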