Fuzzy Neural Tangent Kernel Model for Identifying DNA N4-methylcytosine Sites
IEEE Transactions on Fuzzy Systems (IF 10.7), Pub Date: 7-9-2024, DOI: 10.1109/tfuzz.2024.3425616
Yijie Ding 1 , Prayag Tiwari 2 , Fei Guo 3 , Quan Zou 4 , Weiping Ding 5

Identifying DNA N4-methylcytosine (4mC) sites is an important problem in bioinformatics, and machine learning methods have been applied to it effectively. Because the data are noisy, however, existing deep learning methods for detecting 4mC have consistently low recognition rates on positive samples. Fuzzy systems, which rely on fuzzy rules and membership functions, handle noisy signals well, but traditional fuzzy systems lack deep feature representation and sample measurement. To address this, we incorporate the neural tangent kernel (NTK) and a kernel learning algorithm into the fuzzy system and propose two models for predicting DNA 4mC sites: the fuzzy neural tangent kernel (FNTK) model and the radius-based FNTK (R-FNTK) model. To generalize better than traditional kernel functions, we first train the NTK for feature representation learning and sample measurement. Based on the membership function and the NTK matrix, we then construct a separate fuzzy kernel matrix for each fuzzy subset of the fuzzy system. Finally, we use two types of iterative kernel optimization algorithms to fuse the multiple NTK-based fuzzy kernels and obtain the final prediction model. Experiments on six benchmark datasets demonstrate the superiority of our approach, with significant performance improvements.
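The pipeline described in the abstract (base kernel, one fuzzy kernel per fuzzy subset, then kernel fusion) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: an RBF kernel stands in for the trained NTK, the membership functions are Gaussian around hypothetical subset centers, each fuzzy kernel is formed as mu_r(i) * mu_r(j) * K[i][j], and the fusion weights come from a one-shot kernel-target alignment heuristic, not from the paper's iterative optimization algorithms. All function names are hypothetical.

```python
import math

def rbf_kernel(X, gamma=0.5):
    # Stand-in base kernel; the paper uses a trained NTK here (assumption).
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
             for z in X] for x in X]

def memberships(X, centers, width=1.0):
    # Gaussian membership of every sample in every fuzzy subset (assumed form).
    return [[math.exp(-sum((a - c) ** 2 for a, c in zip(x, center))
                      / (2 * width ** 2)) for x in X]
            for center in centers]

def fuzzy_kernels(K, M):
    # One kernel per fuzzy subset r: K_r[i][j] = mu_r(i) * mu_r(j) * K[i][j].
    n = len(K)
    return [[[mu[i] * mu[j] * K[i][j] for j in range(n)] for i in range(n)]
            for mu in M]

def alignment(K, y):
    # Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F).
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(v * v for row in K for v in row))
    return num / (norm_k * n)  # ||yy^T||_F equals n for labels in {-1, +1}

def fuse(kernels, y):
    # Heuristic fusion: weights proportional to non-negative alignment scores.
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total > 0:
        weights = [s / total for s in scores]
    else:
        weights = [1.0 / len(kernels)] * len(kernels)
    n = len(y)
    fused = [[sum(w * K[i][j] for w, K in zip(weights, kernels))
              for j in range(n)] for i in range(n)]
    return weights, fused

# Toy data: four 2-D samples, labels in {-1, +1}, two fuzzy subsets.
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
y = [-1, -1, 1, 1]
centers = [[0.0, 0.0], [1.0, 1.0]]

K = rbf_kernel(X)
M = memberships(X, centers)
weights, fused = fuse(fuzzy_kernels(K, M), y)
print(weights)
```

The fused matrix could then be passed to any classifier that accepts a precomputed kernel. Multiplying the base kernel elementwise by mu_r mu_r^T keeps each fuzzy kernel positive semidefinite (Schur product theorem), which is why this particular weighting was chosen for the sketch.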

Updated: 2024-08-22