Foundations and Trends in Information Retrieval (IF 8.3), Pub Date: 2023-05-14, DOI: 10.1561/1500000071. Sebastian Bruch, Claudio Lucchese, Franco Maria Nardini
As information retrieval researchers, we not only develop algorithmic solutions to hard problems, but we also insist on a proper, multifaceted evaluation of ideas. The literature on the fundamental topic of retrieval and ranking, for instance, has a rich history of studying the effectiveness of indexes, retrieval algorithms, and complex machine learning rankers, while at the same time quantifying their computational costs, from creation and training to application and inference. This is evidenced, for example, by more than a decade of research on efficient training and inference of large decision forest models in Learning to Rank (LtR). As we move towards even more complex, deep learning models in a wide range of applications, questions on efficiency have once again resurfaced with renewed urgency. Indeed, efficiency is no longer limited to time and space; instead it has found new, challenging dimensions that stretch to resource-, sample- and energy-efficiency with ramifications for researchers, users, and the environment.
This monograph takes a step towards promoting the study of efficiency in the era of neural information retrieval by offering a comprehensive survey of the literature on efficiency and effectiveness in ranking, and to a limited extent, retrieval. This monograph was inspired by the parallels that exist between the challenges in neural network-based ranking solutions and their predecessors, decision forest-based LtR models, as well as the connections between the solutions the literature to date has to offer. We believe that by understanding the fundamentals underpinning these algorithmic and data structure solutions for containing the contentious relationship between efficiency and effectiveness, one can better identify future directions and more efficiently determine the merits of ideas. We also present what we believe to be important research directions at the forefront of efficiency and effectiveness in retrieval and ranking.
Title: Efficient and Effective Tree-based and Neural Learning to Rank