Data mining of PubChem bioassay records reveals diverse OXPHOS inhibitory chemotypes as potential therapeutic agents against ovarian cancer
Journal of Cheminformatics (IF 7.1) Pub Date: 2024-10-07, DOI: 10.1186/s13321-024-00906-0
Sejal Sharma, Liping Feng, Nicha Boonpattrawong, Arvinder Kapur, Lisa Barroilhet, Manish S. Patankar, Spencer S. Ericksen

Focused screening on target-prioritized compound sets can be an efficient alternative to high-throughput screening (HTS). For most biomolecular targets, compound prioritization models depend on prior screening data or a target structure. For phenotypic or multi-protein pathway targets, it may not be clear which public assay records provide relevant data. The question also arises as to whether data collected from disparate assays might be usefully consolidated. Here, we report on the development and application of a data mining pipeline to examine these issues. To illustrate, we focus on identifying inhibitors of oxidative phosphorylation (OXPHOS), a druggable metabolic process in epithelial ovarian tumors. The pipeline compiled 8415 available OXPHOS-related bioassays in the PubChem data repository involving 312,093 unique compound records. Application of PubChem assay activity annotations, PAINS (Pan Assay Interference Compounds), and Lipinski-like bioavailability filters yields 1852 putative OXPHOS-active compounds that fall into 464 clusters. These chemotypes are diverse, with relatively high hydrophobicity and molecular weight but lower complexity and drug-likeness. They show a high abundance of bicyclic ring systems and oxygen-containing functional groups including ketones, allylic oxides (alpha/beta unsaturated carbonyls), hydroxyl groups, and ethers. In contrast, amide and primary amine functional groups have a notably lower-than-random prevalence. UMAP representation of the chemical space shows strong divergence in the regions occupied by OXPHOS-inactive and -active compounds. Of the six compounds selected for biological testing, four showed statistically significant inhibition of electron transport in bioenergetics assays. Two of these four compounds, lacidipine and esbiothrin, increased intracellular oxygen radicals (a major hallmark of most OXPHOS inhibitors) and decreased the viability of two ovarian cancer cell lines, ID8 and OVCAR5.
Finally, data from the pipeline were used to train random forest and support vector classifiers that effectively prioritized OXPHOS inhibitory compounds within a held-out test set (ROCAUC 0.962 and 0.927, respectively) and on another set containing 44 documented OXPHOS inhibitors outside of the training set (ROCAUC 0.900 and 0.823). This prototype pipeline is extensible and could be adapted for focused screening on other phenotypic targets for which sufficient public data are available.

Scientific contribution: Here, we describe and apply an assay data mining pipeline to compile, process, filter, and mine public bioassay data. We believe the procedure may be more broadly applied to guide compound selection in early-stage hit finding on novel multi-protein mechanistic or phenotypic targets. To demonstrate the utility of our approach, we apply a data mining strategy to a large set of public assay data to find drug-like molecules that inhibit oxidative phosphorylation (OXPHOS) as candidates for ovarian cancer therapies.
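
The ROCAUC values quoted in the abstract measure how well a classifier ranks active compounds ahead of inactive ones. The sketch below is not the authors' code. It is a minimal pure-Python illustration of the metric, computed as the fraction of (active, inactive) pairs in which the active compound receives the higher score, with ties counted as half. The function name `roc_auc` and the toy scores and labels are assumptions made up for this example and are not data from the paper.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a randomly chosen active (label 1)
    is scored above a randomly chosen inactive (label 0); ties count 0.5."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0      # active ranked above inactive
            elif a == i:
                wins += 0.5      # tie: half credit
    return wins / (len(actives) * len(inactives))


# Hypothetical classifier scores for six compounds (illustrative only).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]   # 1 = OXPHOS-active, 0 = inactive

auc = roc_auc(toy_scores, toy_labels)
print(f"toy ROC AUC = {auc:.3f}")
```

In this toy set, eight of the nine active/inactive pairs are ordered correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect prioritization, which is the scale on which the reported 0.962 and 0.927 should be read. The pairwise loop is quadratic in the number of compounds, so a rank-based implementation would be preferable for sets the size of the one described here.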

Updated: 2024-10-08