Complex & Intelligent Systems (IF 5.0), Pub Date: 2024-07-12, DOI: 10.1007/s40747-024-01545-6. Richárd Kiss, Gábor Szűcs
Network science has surged in popularity, driven by the success of node representation learning in applications such as social network analysis and biological modeling. Shallow embedding algorithms capture network structure well, but they have a critical limitation: they cannot generalize to unseen nodes. This paper addresses that limitation with Inductive Shallow Node Embedding, its main contribution, which extends shallow embeddings to inductive learning. The method uses a novel encoder architecture that captures the local neighborhood structure of each node, enabling effective generalization to unseen nodes. When generalizing, robustness is essential to avoid performance degradation caused by noise in the dataset. The paper proves theoretically that the covariance of the additive noise term in the proposed model is inversely proportional to the number of a node's neighbors. A further contribution is a mathematical lower bound that quantifies the robustness of node embeddings and confirms the method's advantage over traditional shallow embedding methods, particularly in the presence of parameter noise. The proposed method performs well in dynamic networks: across various benchmarks, its performance on previously unseen nodes is consistently over 90% of its performance on nodes encountered during training. The empirical evaluation concludes that the method outperforms competing methods on the vast majority of datasets in both transductive and inductive tasks.
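The abstract does not spell out the encoder, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a node's embedding is the plain mean of learnable parameter vectors attached to its neighbors, and that parameter noise is i.i.d. Gaussian with variance sigma^2. Under those assumptions, an unseen node can be embedded from its neighbor list alone, and the noise variance of the embedding shrinks as sigma^2 / k for k neighbors, which is one simple way to read the inverse-proportionality claim. The names `embed_node` and `embedding_noise_variance` are hypothetical.

```python
import numpy as np


def embed_node(neighbors, params):
    """Embed a node as the mean of its neighbors' parameter vectors.

    Assumption: mean aggregation stands in for the paper's encoder.
    The node itself needs no trained vector, so a node that was not
    present during training can still be embedded.
    """
    return params[np.asarray(neighbors)].mean(axis=0)


def embedding_noise_variance(num_neighbors, dim=8, sigma=1.0,
                             trials=20000, seed=0):
    """Estimate the per-coordinate noise variance of a mean-aggregated
    embedding when each neighbor's vector carries i.i.d. N(0, sigma^2)
    noise (an assumption made for this illustration)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(trials, num_neighbors, dim))
    # Averaging over neighbors is what reduces the variance.
    return noise.mean(axis=1).var()


if __name__ == "__main__":
    params = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
    # A new node connected to nodes 1 and 2 of the training graph.
    print(embed_node([1, 2], params))
    for k in (1, 4, 16):
        # Each estimate should come out close to sigma^2 / k.
        print(k, embedding_noise_variance(k))
```

Averaging is the simplest aggregation that shows the effect; a weighted or learned aggregation would change the constant but not the basic reason that nodes with more neighbors are less sensitive to parameter noise.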
Title: Unsupervised graph representation learning with inductive shallow node embedding