Foundations and Trends in Information Retrieval (IF 8.3), Pub Date: 2023-4-18, DOI: 10.1561/1500000091
Peng Zhang, Hui Gao, Jing Zhang, Dawei Song
Quantum Theory (QT) offers a unified mathematical framework for Information Retrieval (IR). Unlike the classical IR framework, the quantum-inspired IR framework uses user-centered modeling to capture non-classical cognitive phenomena in human relevance judgment during the IR process. As data and computing resources have grown, neural IR methods have been applied to the text matching and understanding tasks of IR. Neural networks are strong at learning effective representations and at generalizing matching patterns from raw data. However, these methods have some unavoidable shortcomings, such as an inability to model user cognitive phenomena, a large number of model parameters, and the “black box” nature of the network structure. These problems greatly limit the development of neural IR and related fields. Although the quantum-inspired retrieval framework can in theory address these problems, it suffers from poor model efficiency and is difficult to integrate with neural networks, which leaves a large gap between QT and neural network modeling.
This review gives a systematic introduction to quantum-inspired neural IR, covering quantum-inspired neural language representation, matching, and understanding. Such work helps not only to model non-classical phenomena in IR but also to break the theoretical bottleneck of neural networks and to design more transparent neural IR models. We introduce language representation methods based on QT, together with quantum-inspired text matching and decision-making models built on neural networks. These models show theoretical advantages in document ranking, relevance matching, and multimodal IR, and they can be integrated with neural networks to jointly advance IR. We also present the latest progress in quantum language understanding and discuss further topics on QT and language modeling to give readers more material for thought.
Title: Quantum-Inspired Neural Language Representation, Matching and Understanding