Contextual feature extraction hierarchies converge in large language models and the brain
Nature Machine Intelligence ( IF 18.8 ) Pub Date : 2024-11-26 , DOI: 10.1038/s42256-024-00925-4
Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani

Recent advancements in artificial intelligence have sparked interest in the parallels between large language models (LLMs) and human neural processing, particularly in language comprehension. Although previous research has demonstrated similarities between LLM representations and neural responses, the computational principles driving this convergence—especially as LLMs evolve—remain elusive. Here we used intracranial electroencephalography recordings from neurosurgical patients listening to speech to investigate the alignment between high-performance LLMs and the language-processing mechanisms of the brain. We examined a diverse selection of LLMs with similar parameter sizes and found that as their performance on benchmark tasks improves, they not only become more brain-like, reflected in better neural response predictions from model embeddings, but they also align more closely with the hierarchical feature extraction pathways of the brain, using fewer layers for the same encoding. Additionally, we identified commonalities in the hierarchical processing mechanisms of high-performing LLMs, revealing their convergence towards similar language-processing strategies. Finally, we demonstrate the critical role of contextual information in both LLM performance and brain alignment. These findings reveal converging aspects of language processing in the brain and LLMs, offering new directions for developing models that better align with human cognitive processing.
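The abstract describes scoring brain likeness by how well model embeddings predict neural responses. A standard way to implement such an encoding analysis (an assumption here, not the paper's exact pipeline) is regularized linear regression from a layer's embeddings to each electrode's response, scored by the correlation between predicted and actual held-out responses. A minimal self-contained sketch on synthetic data, with `encoding_score` and the simulated electrode both hypothetical:

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    # Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def encoding_score(embeddings, neural, alpha=1.0):
    """Fit on the first half of stimuli, then return the Pearson
    correlation between predicted and actual held-out responses."""
    half = embeddings.shape[0] // 2
    w = ridge_fit(embeddings[:half], neural[:half], alpha)
    pred = embeddings[half:] @ w
    return np.corrcoef(pred, neural[half:])[0, 1]

# Synthetic demo: one "electrode" driven linearly by a fake layer embedding.
rng = np.random.default_rng(0)
emb = rng.standard_normal((200, 16))          # 200 stimuli x 16-dim embeddings
true_w = rng.standard_normal(16)
neural = emb @ true_w + 0.1 * rng.standard_normal(200)  # noisy response

score = encoding_score(emb, neural)
```

Repeating this per layer and taking the best-scoring layer for each electrode is one way to operationalize the abstract's claim that stronger LLMs reach the same encoding quality "using fewer layers".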




Updated: 2024-11-26