Information Systems Frontiers (IF 6.9), Pub Date: 2024-11-22, DOI: 10.1007/s10796-024-10555-1. Authors: Antonios Karvelas, Yannis Foufoulas, Alkis Simitsis, Yannis Ioannidis
A recent trend in data management research investigates whether machine learning techniques could improve or replace core components of traditional database architectures, such as the query optimizer or selectivity and cardinality cost estimators. The preliminary approaches leverage cost-based optimizers and cost models to avoid a cold start as they train and build learning models. In this work, we investigate whether learning could also be beneficial in rule-based optimizers, which drive query execution decisions not via a cost model but via a set of fixed rules and predefined heuristics. Our experimental testbed employs MonetDB, an open-source, column-store analytics data engine, and explores whether a learning model that uses Graph Neural Networks (GNNs) and is trained on a cost-based engine, such as PostgreSQL, could improve the MonetDB optimizer’s decisions. Our initial findings reveal that our approach could significantly improve MonetDB’s query execution plans, especially as query complexity increases when a query involves many join operators.
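To make the contrast between the two optimizer styles concrete, here is a minimal sketch in Python. It is not the authors' method, and it does not reproduce the planning logic of either MonetDB or PostgreSQL. The table names, cardinalities, selectivities, the "join in the order written" rule, and the cost function (the sum of intermediate result sizes over a left-deep plan) are all hypothetical assumptions chosen for illustration.

```python
from itertools import permutations

# Hypothetical statistics (illustrative assumptions, not measured data).
CARD = {"A": 1000, "B": 100, "C": 10}
# Selectivity of each join predicate; pairs without an entry have no
# predicate, so joining them directly is a cross product (selectivity 1.0).
SEL = {frozenset({"A", "B"}): 0.01, frozenset({"B", "C"}): 0.1}


def plan_cost(order):
    """Toy cost model: sum of intermediate result sizes of a left-deep plan."""
    joined = {order[0]}
    rows = CARD[order[0]]
    cost = 0.0
    for table in order[1:]:
        sel = 1.0
        # Apply every predicate that links the new table to the tables joined so far.
        for other in joined:
            sel *= SEL.get(frozenset({other, table}), 1.0)
        rows = rows * CARD[table] * sel
        cost += rows
        joined.add(table)
    return cost


def rule_based_order(tables):
    """Stand-in for a fixed rule: join tables in the order they are written."""
    return tuple(tables)


def cost_based_order(tables):
    """Exhaustive search over left-deep orders, keeping the cheapest one."""
    return min(permutations(tables), key=plan_cost)


if __name__ == "__main__":
    query_tables = ["A", "C", "B"]
    fixed = rule_based_order(query_tables)
    best = cost_based_order(query_tables)
    print("fixed rule:", fixed, plan_cost(fixed))
    print("cost-based:", best, plan_cost(best))
```

With these assumed statistics, the as-written order starts with a cross product of A and C, while the cost-based search defers A until a join predicate applies. A learned model of the kind the abstract describes would aim to suggest such orders to an engine that has no cost model of its own. Exhaustive enumeration is only feasible here because the example has three tables.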
Title (translated from the Chinese): Investigating Learned Join Order Optimization Strategies for Rule-Based Data Engines