Approximation of the Meaning for Thematic Subject Headings by Simple Interpretable Representations
Lobachevskii Journal of Mathematics (IF 0.8), Pub Date: 2024-07-19, DOI: 10.1134/s1995080224600778
R. V. Sulzhenko, B. V. Dobrov

Abstract

The paper studies methods for approximating user-labeled topics by simple representations in a text classification problem. It is assumed that in real information systems the meaning of a thematic category can be approximated by a fairly simple, interpretable expression. An algorithm is considered that constructs a representation of a text topic as a Boolean formula, which is in effect a query to a full-text information system. The algorithm is based on an optimized selection of various logical predicates over words and terms from the thesaurus. The presented algorithm is compared with modern machine learning techniques on real collections with noisy expert markup. The described method can be used for text classification, expert evaluation of the content of a heading, assessment of the complexity of a topic's description, and correction of the markup.
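
The abstract does not spell out how the formulas are built, so the following Python sketch is only a rough illustration of the general idea and not the authors' algorithm. It assumes documents are given as sets of words, uses plain word-presence predicates (no thesaurus), and greedily grows a disjunction `t1 OR t2 OR ...` while the F1 score against the labeled documents improves. The names `f1` and `greedy_or_query`, the `max_terms` parameter, and the toy data are all hypothetical.

```python
from typing import List, Set, Tuple


def f1(predicted: Set[int], relevant: Set[int]) -> float:
    """F1 score of a predicted document set against the labeled set."""
    tp = len(predicted & relevant)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(relevant)
    return 2 * precision * recall / (precision + recall)


def greedy_or_query(
    docs: List[Set[str]], relevant: Set[int], max_terms: int = 5
) -> Tuple[List[str], float]:
    """Greedily build a disjunction of terms that maximizes F1 (assumed objective)."""
    vocabulary = sorted(set().union(*docs))
    # Inverted index: term -> ids of the documents that contain it.
    postings = {t: {i for i, d in enumerate(docs) if t in d} for t in vocabulary}

    chosen: List[str] = []
    matched: Set[int] = set()
    best = 0.0
    while len(chosen) < max_terms:
        candidate, candidate_score = None, best
        for term in vocabulary:
            if term in chosen:
                continue
            score = f1(matched | postings[term], relevant)
            if score > candidate_score:  # keep only strict improvements
                candidate, candidate_score = term, score
        if candidate is None:  # no remaining term improves F1, so stop
            break
        chosen.append(candidate)
        matched |= postings[candidate]
        best = candidate_score
    return chosen, best


# Toy collection: documents 0, 2 and 4 are labeled as belonging to the topic.
docs = [
    {"bank", "loan", "rate"},
    {"bank", "river"},
    {"credit", "loan"},
    {"football", "match"},
    {"credit", "card", "bank"},
]
terms, score = greedy_or_query(docs, relevant={0, 2, 4})
print(" OR ".join(terms), score)
```

The resulting term list can be read directly as a full-text query, which is what makes this kind of representation easy for an expert to inspect. A fuller version would also consider conjunctions, negations, and thesaurus-based term expansion, none of which are modeled here.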




Updated: 2024-07-20