Lead–lag effect of research between conference papers and journal papers in data mining
WIREs Data Mining and Knowledge Discovery (IF 6.4), Pub Date: 2024-09-24, DOI: 10.1002/widm.1561
Yue Huang, Runyu Tian

Examining the lead–lag effect between publication types along a temporal dimension is important for research assessment. In this article, we introduce a novel framework to quantify the lead–lag effect between the research topics of conference papers and journal papers. We first identify research topics with BERTopic, a text-embedding-based topic modeling technique, then extract the research topics of each time slice, construct and visualize the topic similarity matrix to reveal the time-lag direction, and finally quantify the lead–lag effect with four proposed indicators and with average influence topic similarity comparison maps. We analyze 19,166 bibliographic records of top conference papers and journal papers published from 2015 to 2019 in the data mining field, and calculate the similarity between the BERTopic topics of each pair of quarterly time slices. The results show that journal paper topics lag behind conference paper topics in the data mining field. The most significant lead–lag effect is 2.5 years, with approximately 33.45% of topics affected by this lag. The methodology could be applied more broadly to analyze lead–lag effects in other research areas, offering insight into the state of research development and informing policy decisions.

This article is categorized under: Application Areas > Science and Technology
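The similarity-matrix step described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce their four indicators, which the abstract does not define. It assumes each time slice is represented by a small array of topic vectors (in practice these might come from a fitted BERTopic model; here they are synthetic one-hot vectors), and the function names `cosine_similarity_matrix`, `slice_similarity`, and `mean_similarity_by_lag` are hypothetical. The aggregation rule (mean of each conference topic's best match) is also an assumption chosen for illustration.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def slice_similarity(conf_slices, jour_slices):
    """S[i, j]: mean, over conference topics in slice i, of the best cosine
    match among journal topics in slice j (an assumed aggregation rule)."""
    S = np.zeros((len(conf_slices), len(jour_slices)))
    for i, conf_topics in enumerate(conf_slices):
        for j, jour_topics in enumerate(jour_slices):
            sims = cosine_similarity_matrix(conf_topics, jour_topics)
            S[i, j] = sims.max(axis=1).mean()
    return S


def mean_similarity_by_lag(S):
    """Average S along each diagonal. Lag k pairs conference slice i with
    journal slice i + k, so k > 0 means journal topics appear later."""
    n_conf, n_jour = S.shape
    return {k: float(np.diagonal(S, offset=k).mean())
            for k in range(-(n_conf - 1), n_jour)}


# Synthetic data (an assumption, not the paper's corpus): six slices with
# three one-hot topic vectors each. Journal slices repeat the conference
# topics two slices later; the first two journal slices use unrelated topics.
n_slices, topics_per_slice = 6, 3
basis = np.eye(topics_per_slice * (n_slices + 2))


def block(i):
    """Rows of the identity matrix used as the topic vectors of block i."""
    return basis[topics_per_slice * i: topics_per_slice * (i + 1)]


conf_slices = [block(i) for i in range(n_slices)]
jour_slices = [block(n_slices + j) if j < 2 else block(j - 2)
               for j in range(n_slices)]

S = slice_similarity(conf_slices, jour_slices)
lags = mean_similarity_by_lag(S)
best_lag = max(lags, key=lags.get)
print(best_lag)  # lag, in slices, with the highest mean similarity
```

With quarterly slices, a lag of k slices corresponds to k/4 years. On real topic vectors the diagonals would not be exactly 0 or 1, so the lag profile in `lags` would need to be read alongside the similarity matrix itself rather than reduced to a single maximum.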

Updated: 2024-09-24