Development, deployment and scaling of operating room-ready artificial intelligence for real-time surgical decision support
npj Digital Medicine (IF 12.4) | Pub Date: 2024-09-03 | DOI: 10.1038/s41746-024-01225-2
Sergey Protserov 1,2,3, Jaryd Hunter 1, Haochi Zhang 1, Pouria Mashouri 1, Caterina Masino 4, Michael Brudno 1,2,3, Amin Madani 4,5

Deep learning for computer vision can be leveraged to interpret surgical scenes and provide surgeons with real-time guidance to avoid complications. However, neither the generalizability nor the scalability of computer-vision-based surgical guidance systems has been demonstrated, especially in geographic locations that lack the hardware and infrastructure necessary for real-time inference. We propose a new equipment-agnostic framework for real-time use in operating suites. Using laparoscopic cholecystectomy and semantic segmentation models for predicting safe/dangerous (“Go”/“No-Go”) zones of dissection as an example use case, this study aimed to develop and test the performance of a novel data pipeline linked to a web platform that enables real-time deployment from any edge device. To test this infrastructure and demonstrate its scalability and generalizability, lightweight U-Net and SegFormer models were trained on annotated frames from a large and diverse multicenter dataset from 136 institutions, and then tested on a separate, prospectively collected dataset. A web platform was created to enable real-time inference on any surgical video stream, and performance was tested on and optimized for a range of network speeds. The U-Net and SegFormer models achieved mean Dice scores of 57% and 60%, precision of 45% and 53%, and recall of 82% and 75%, respectively, for predicting the Go zone, and mean Dice scores of 76% and 76%, precision of 68% and 68%, and recall of 92% and 92% for predicting the No-Go zone. After optimizing the client-server interaction over the network, we deliver a prediction stream of at least 60 fps with a maximum round-trip delay of 70 ms at connection speeds above 8 Mbps. Clinical deployment of machine learning models for surgical guidance is feasible and cost-effective using a generalizable, scalable and equipment-agnostic framework that does not depend on high-performance computing hardware or ultra-fast internet connection speeds.
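
For readers unfamiliar with the reported segmentation metrics, the sketch below shows how per-class Dice, precision and recall are typically computed from a predicted binary mask and its ground-truth annotation. This is a minimal illustration, not the authors' evaluation code; the array shapes and random masks are placeholders.

```python
# Minimal sketch: per-class Dice, precision and recall for a binary
# "Go" or "No-Go" segmentation mask, assuming boolean numpy arrays of
# identical shape. Not taken from the paper's codebase.
import numpy as np

def segmentation_metrics(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7):
    """Return (dice, precision, recall) for one predicted binary mask."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()    # pixels correctly marked as the zone
    fp = np.logical_and(pred, ~truth).sum()   # pixels wrongly marked as the zone
    fn = np.logical_and(~pred, truth).sum()   # zone pixels that were missed
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return float(dice), float(precision), float(recall)

if __name__ == "__main__":
    # Hypothetical 480x640 masks standing in for one annotated frame.
    rng = np.random.default_rng(0)
    truth = rng.random((480, 640)) > 0.7
    pred = rng.random((480, 640)) > 0.7
    print(segmentation_metrics(pred, truth))
```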



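The reported 60 fps prediction stream and 70 ms maximum round-trip delay describe the loop between an edge device in the operating room and the remote inference platform. The sketch below illustrates, purely as an assumption, what such a client-side loop could look like: the endpoint URL, message format and use of the websockets library are hypothetical and are not part of the paper's published interface.

```python
# Hypothetical client-side loop: stream encoded frames to a remote
# segmentation endpoint over a WebSocket and time each round trip.
# The URL and protocol are placeholders, not the authors' system.
import asyncio
import time
import websockets  # pip install websockets

INFERENCE_URL = "wss://example-inference-server/segment"  # placeholder endpoint

async def stream_frames(frames, target_fps: int = 60):
    frame_interval = 1.0 / target_fps
    async with websockets.connect(INFERENCE_URL, max_size=None) as ws:
        for frame_bytes in frames:
            sent_at = time.perf_counter()
            await ws.send(frame_bytes)        # encoded video frame
            mask_bytes = await ws.recv()      # encoded Go/No-Go overlay from server
            rtt_ms = (time.perf_counter() - sent_at) * 1000
            print(f"round trip: {rtt_ms:.1f} ms, mask bytes: {len(mask_bytes)}")
            # Pace the loop so the client does not exceed the target frame rate.
            await asyncio.sleep(max(0.0, frame_interval - (time.perf_counter() - sent_at)))

if __name__ == "__main__":
    dummy_frames = [b"\x00" * 50_000 for _ in range(10)]  # ~50 kB stand-in frames
    asyncio.run(stream_frames(dummy_frames))
```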
Updated: 2024-09-04