Prediction of recurrence risk in nasopharyngeal carcinoma based on multimodal fusion

陈彩洪; 唐业欢; 覃茂昌; 林伟龙; 甘瑞静; 宾翔; 黄代政

doi:10.16190/j.cnki.45-1211/r.2026.02.008

Abstract

Objective: To integrate clinical data, pathological whole slide images (WSI), and magnetic resonance imaging (MRI) data into a multimodal machine learning model for assessing postoperative recurrence risk in patients with nasopharyngeal carcinoma. Methods: Clinical data, WSIs, and multi-sequence MRI were retrospectively collected from 168 patients with nasopharyngeal carcinoma. MRI features were extracted by radiomics, and WSI tumor-region features were extracted with the CTransPath+CLAM framework. Unimodal and multimodal models were built with random forests and their performance compared. All models were trained and evaluated with stratified 5-fold cross-validation. The area under the receiver operating characteristic curve (AUC) was the primary performance metric, and clinical net benefit was assessed by decision curve analysis (DCA). Results: The multimodal model integrating clinical data, WSI, and MRI achieved the best predictive performance (AUC=0.794). This was 0.215 higher than the model based on clinical predictors alone (AUC=0.579, P=0.109) and 0.183 higher than the AJCC anatomic staging model (AUC=0.611, P=0.015). The combined clinical and staging model (AUC=0.660) remained significantly below the multimodal model (ΔAUC=0.134, P=0.015). Among the unimodal models, the multimodal model had a numerically higher AUC than the MRI model (AUC=0.769, P>0.05), a difference that was not statistically significant, and significantly outperformed the WSI model (AUC=0.511, P<0.001). DCA showed that the multimodal model yielded the highest net benefit across most of the risk threshold range. Model interpretation indicated that its predictive power came mainly from MRI texture features reflecting tumor heterogeneity. Conclusion: A multimodal machine learning model was constructed and validated. By fusing clinical, WSI, and MRI information, it shows potential for clinical application in predicting recurrence of nasopharyngeal carcinoma.
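The abstract names AUC and decision curve analysis as its evaluation tools but does not show how either is computed. The following is a minimal sketch under the standard textbook definitions: AUC as the probability that a recurrent case receives a higher predicted risk than a non-recurrent one, and net benefit at risk threshold pt as TP/n - FP/n × pt/(1 - pt). The function names and the toy labels and risks are hypothetical illustrations, not the study's code or data.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy for a decision curve: treat every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = recurrence, 0 = no recurrence.
y = [1, 1, 1, 0, 0, 0, 0, 0]
risk = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

if __name__ == "__main__":
    print(f"AUC = {auc(y, risk):.3f}")
    for pt in (0.10, 0.25, 0.50):
        print(
            f"pt={pt:.2f}  model={net_benefit(y, risk, pt):.3f}  "
            f"treat-all={net_benefit_treat_all(y, pt):.3f}  treat-none=0.000"
        )
```

A decision curve plots these net-benefit values over a range of thresholds. A model is useful at a given threshold when its curve lies above both the treat-all and the treat-none (zero) references.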
