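When Evidence Modeling Meets Knowledge Distillation: Towards Reliable Contrast-Enhanced Knowledge Distillation for Non-Contrast Medical Image Segmentation | Literature Express: Medical Imaging Algorithm Paper Sharing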

Title

题目

When evidence modeling meets knowledge distillation: Towards reliable contrast-enhanced knowledge distillation for non-contrast medical image segmentation

当证据建模遇上知识蒸馏:面向非对比医学图像分割的可靠对比增强知识蒸馏

01

Literature Express: Introduction

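### Overview and research background of contrast-enhanced knowledge distillation

Contrast-enhanced knowledge distillation aims to transfer contrast-enhanced knowledge (that is, knowledge extracted from contrast-enhanced medical images) to non-contrast medical images in support of tumor analysis. The approach promises to improve medical diagnosis and to offer new means of tumor analysis on non-contrast medical images (Savage, 2024; Hatsutani et al., 2023). However, existing medical image segmentation relies heavily on contrast-enhanced medical images, which inevitably brings the limitations of injecting contrast agents during clinical examinations (Huynh et al., 2020). For example, in MRI, gadolinium-based contrast agents are widely used to enhance the signal difference between normal and diseased tissue in brain, liver, and kidney imaging (Kinno and Sutter, 2023); in CT, iodinated contrast agents are commonly used to improve the visualization of vessels and tissue, helping to detect and characterize lesions accurately (Zopfs et al., 2021). Contrast-enhanced imaging has become a standard tool for tumor segmentation and diagnosis in clinical practice (Wang et al., 2024; Stollmayer et al., 2025). By making tumor boundaries more visible, contrast agents provide key information that supports accurate segmentation, treatment planning, and surgical assessment (Zhao et al., 2020), which makes contrast-enhanced imaging highly valuable for tumor detection and boundary delineation.

Contrast agent injection, however, is time-consuming, raises health risks, and is costly, which limits its use in routine clinical practice (Zhao et al., 2021). Contrast agents can trigger allergic reactions and renal complications, which are particularly severe in patients with impaired renal function (Marckmann et al., 2006). Non-contrast imaging offers a safer alternative for patients in whom contrast agents are contraindicated, but despite its advantages in safety and accessibility it often lacks the contrast needed for precise segmentation. AI contrast agent generation has therefore attracted growing attention as a way to provide time-saving, safe, and economical tools for clinical diagnosis and treatment (Savage, 2024; Zhao et al., 2020).

### Challenges of contrast-enhanced knowledge distillation

Contrast-enhanced knowledge distillation still faces challenges, which stem mainly from the complex pixel enhancement and additional noise caused by contrast agent injection (Wang et al., 2022b). Specifically, contrast agents often give tumors and surrounding tissue heterogeneous intensity patterns that depend on vascular dynamics and imaging timing, which makes consistent feature representation harder (Bône et al., 2022). In addition, the enhancement process may introduce redundant or misleading signals, leading to variability and potential misalignment when knowledge is transferred to non-contrast images (Bône et al., 2022). As shown in Fig. 1(A), the complex enhancement caused by contrast agent injection tends to blur tumor representations during feature extraction, while the additional noise tends to cause over-confident decisions. These defects make it hard to apply AI-generated contrast-enhanced knowledge to tumor processing on non-contrast medical images, a limitation that is especially critical in surgical planning, tumor monitoring, and care in resource-limited settings, where non-contrast imaging is the safer or more accessible alternative (Zhao et al., 2021).

As shown in Fig. 1(B), although prior studies have tried to transfer contrast-enhanced knowledge to guide non-contrast medical image processing (Hatsutani et al., 2023; Liang et al., 2021; Xiao et al., 2019), these works remain limited and have not explored uncertainty-aware contrast-enhanced knowledge. Moreover, existing work on contrast agent generation relies on visual information alone and ignores the rich semantic features in medical language. For example, Zhang et al. (2021) explored contrast-enhanced knowledge transfer with an uncertainty-filtered self-ensembling method, but they considered only image information, ignored complementary text data, and limited uncertainty modeling to entropy-based pseudo-label filtering, without explicit and interpretable uncertainty quantification.

### Potential of evidence modeling and cross-modal learning

Evidence modeling and cross-modal learning have the potential to overcome the challenges of over-confident decisions and complex pixel enhancement. On the one hand, thanks to the rich semantic information in text (Radford et al., 2021), the rise of cross-modal learning offers a great opportunity to learn rich semantic knowledge from text features in order to understand complex medical images (Shrestha et al., 2023; Zhao et al., 2023). On the other hand, work on improving the reliability of medical image processing (for example, robustness to distribution shift and incorporating uncertainty estimates into decisions) has made notable progress in learning accurately from cross-domain medical image data and ensuring reliable tumor analysis (Gao et al., 2023; Zhao and Li, 2023).

Inspired by these studies, this work aims to enhance tumor representations through cross-modal learning to model complex pixel enhancement, and to raise the confidence of pixel representations through evidence modeling, achieving reliable knowledge distillation, as shown in Fig. 1(C). Specifically, in cross-modal learning, the text features shared by contrast-enhanced and non-contrast images are domain-consistent, so embedding the semantic features extracted from the shared text strengthens the domain consistency of the two image modalities. Meanwhile, evidence modeling yields trustworthy pixel representations by quantifying predictive uncertainty. Finally, evidence from the contrast-enhanced domain is distilled to guide the non-contrast domain, achieving uncertainty-aware, AI-driven contrast-enhanced knowledge transfer.

### Limitations of existing methods

Existing cross-modal learning and knowledge distillation methods still have several limitations for contrast-enhanced knowledge distillation. Current cross-modal learning methods, such as vision–language models (VLMs), are restricted to image-level image-text matching (Radford et al., 2021), which loses spatial information and leaves them without pixel-aware text feature embedding. In addition, knowledge distillation methods rely on logits passed through the softmax function, which are prone to over-confidence and undermine the reliability of contrast-enhanced knowledge learning (Han et al., 2022).

For example, many studies based on medical VLMs have targeted classification (Lin et al., 2023a; You et al., 2023), detection (Liu et al., 2023), and segmentation (Liu et al., 2023; Aleem et al., 2024), but directly transferring existing VLMs, such as Contrastive Language-Image Pre-training (CLIP) (Radford et al., 2021), to medical image segmentation remains limited. The image-text matching in existing VLMs is usually based on globally averaged image features, with text features embedded at the image level through multiplication (Radford et al., 2021) or a fusion module (Liu et al., 2023; Lüddecke and Ecker, 2022). Using globally averaged features loses spatial information, which is essential for representing complex medical images, and leaves a gap for pixel-prediction tasks such as tumor segmentation. As for knowledge transfer via knowledge distillation (Gou et al., 2021), existing medical image processing studies (Gao et al., 2019; Zhai et al., 2023) focus mainly on applying knowledge distillation to various clinical scenarios, but they overlook the reliability of cross-domain knowledge learning during transfer (that is, they do not incorporate uncertainty estimates into decisions).

### Proposed method: EGTA-KD

This study proposes an evidence-guided and tumor-aware knowledge distillation (EGTA-KD) method that uses a reliable AI contrast agent to segment tumors on non-contrast medical images. Specifically, to achieve tumor-aware cross-modal learning by embedding text semantic features, context-aware prompt learning is first introduced to capture contextual information in medical text. Then, as shown in Fig. 1(D), the tumor-aware cross-modal synchronizer (TACMS) is proposed to calculate tumor score maps for matching pixel-level image and text features. To achieve uncertainty-aware AI contrast agent distillation for transferring contrast-enhanced knowledge, the uncertainty-quantified evidence unit (UQEU) parameterizes the probability distribution within the subjective logic framework, gathering reliable evidence of contrast-enhanced knowledge while quantifying predictive uncertainty. Finally, the newly designed dual-level knowledge distillation (DLKD) achieves reliable AI contrast agent generation by distilling cross-modal contrast-enhanced knowledge at the feature level (the tumor score maps) and the response level (the evidence).

EGTA-KD requires paired contrast-enhanced and non-contrast medical images annotated with tumor segmentation labels to enable cross-domain knowledge transfer. Details of the validation datasets, namely the BraTS 2021 brain MRI dataset (1251 subjects), a liver MRI dataset (238 subjects), and a kidney CT dataset (113 subjects), are given in Section 4.1.

### Contributions

The contributions of this study are summarized as follows:

- An evidence-guided knowledge distillation framework, EGTA-KD, is proposed for the first time; it models the evidence of contrast-enhanced knowledge for distillation, providing a reliable tool for tumor segmentation on non-contrast medical images.
- The proposed UQEU uses the Dirichlet distribution and Dempster–Shafer theory to parameterize the probability distribution within the subjective logic framework, quantifying uncertainty information for uncertainty-aware modeling of contrast-enhanced knowledge.
- The TACMS uses customized matrix operations to compute pixel-level tumor score maps, supporting tumor-aware cross-modal learning and filling a gap in image-text fusion for medical image analysis.
- DLKD minimizes tumor score map errors and matches evidence distributions with KL divergence, ensuring that the uncertainty-quantified probabilities of the non-contrast domain align with those of the contrast-enhanced domain and achieving reliable contrast agent distillation.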

Abstract

摘要

Contrast-enhanced knowledge distillation promises to transform medical diagnostics and reveal promising approaches for tumor segmentation on non-contrast medical images. However, existing methods related to contrast-enhanced knowledge distillation still make it hard to distill reliable contrast-enhanced knowledge for tumor segmentation due to the limitations of (1) unable to quantify uncertainty information for reliable contrast-enhanced and non-contrast knowledge modeling, which leads to an over-confidence cross-domain adaptation for transferring contrast-enhanced knowledge; (2) using vision information only ignores rich semantic features in medical language, which make it hard to model complex tumor enhancement feature. In this study, we propose an evidence-guided and tumor-aware knowledge distillation (EGTA-KD) for transferring contrast-enhanced domain knowledge to non-contrast domain knowledge. Specifically, to achieve tumor awareness by embedding semantic features from text, the tumor-aware cross-modal synchronizer (TACMS) is proposed to calculate tumor score maps for matching pixel-wise image and text features. To achieve reliable cross-domain modeling for transferring contrast-enhanced knowledge, the innovative uncertainty-quantified evidence unit (UQEU) parameterizes the probability distribution within subjective logic to gather reliable evidence of contrast-enhanced knowledge while quantifying the uncertainty of prediction. Lastly, newly designed dual-level knowledge distillation (DLKD) minimizes tumor score map errors and matches evidence distribution for uncertainty-aware contrast-enhanced knowledge distillation. Extensive experiments of tumor segmentation on non-contrast medical images are performed using multi-modality medical image datasets (i.e., Brain MRI dataset, Liver MRI dataset, and Kidney CT dataset). Experimental results demonstrate the proposed EGTA-KD outperforms the other compared state-of-the-art methods, revealing its superiority of tumor segmentation on non-contrast medical images via uncertainty-aware contrast-enhanced knowledge distillation.

对比增强知识蒸馏有望变革医学诊断,并为非对比医学图像上的肿瘤分割提供前景广阔的方法。然而,现有对比增强知识蒸馏相关方法仍难以提炼出可靠的对比增强知识用于肿瘤分割,原因在于存在以下局限:(1)无法量化不确定性信息以实现可靠的对比增强与非对比知识建模,导致在迁移对比增强知识时出现过度自信的跨域适配;(2)仅使用视觉信息忽略了医学语言中丰富的语义特征,难以对复杂的肿瘤增强特征进行建模。   在本研究中,我们提出一种证据引导的肿瘤感知知识蒸馏(EGTA-KD)方法,用于将对比增强域知识迁移至非对比域知识。具体而言,为通过嵌入文本语义特征实现肿瘤感知,我们设计了肿瘤感知跨模态同步器(TACMS),用于计算肿瘤分数图,以匹配像素级图像与文本特征。为实现可靠的跨域建模以迁移对比增强知识,创新性的不确定性量化证据单元(UQEU)在主观逻辑框架内对概率分布进行参数化,从而收集对比增强知识的可靠证据,同时量化预测的不确定性。最后,新设计的双层次知识蒸馏(DLKD)通过最小化肿瘤分数图误差和匹配证据分布,实现具有不确定性感知的对比增强知识蒸馏。   我们在多模态医学图像数据集(即脑部MRI数据集、肝脏MRI数据集和肾脏CT数据集)上开展了非对比医学图像肿瘤分割的大量实验。实验结果表明,所提出的EGTA-KD方法性能优于其他对比的最先进方法,彰显了其通过不确定性感知的对比增强知识蒸馏在非对比医学图像肿瘤分割任务中的优越性。

Method

方法

3.1. Overview

EGTA-KD is a comprehensive framework designed for reliable contrast-enhanced knowledge distillation. It transfers domain knowledge from contrast-enhanced to non-contrast medical images using evidence modeling to address predictive uncertainty. This approach incorporates cross-modal interaction mechanisms and employs Dirichlet distributions to represent evidence, facilitating effective knowledge distillation while quantifying uncertainty for improved segmentation. Fig. 2 illustrates the overall workflow of the proposed EGTA-KD framework, showcasing its key modules (tumor-aware cross-modal synchronizer (TACMS), uncertainty-quantified evidence unit (UQEU), and dual-level knowledge distillation (DLKD)) and their interactions:

• TACMS for tumor-aware cross-modal learning: TACMS is a cross-modal (vision–language) fusion module designed for pixel-level alignment of image and text features. It calculates tumor score maps by fusing spatial features from medical images with contextual information from text prompts, enabling tumor-aware feature integration. By leveraging text embeddings through context-aware prompt learning, TACMS facilitates segmentation tasks by embedding text information at the pixel level.

• UQEU for reliable evidence modeling while quantifying uncertainty: UQEU serves as the decision-making layer in EGTA-KD, quantifying uncertainty in pixel-level segmentation tasks. By utilizing Dirichlet distributions to model evidence and Dempster–Shafer theory to assign belief masses, UQEU enables precise quantification of prediction uncertainty. This approach allows the framework to capture both the confidence in accurate predictions and the ambiguity in uncertain regions, enhancing the ability to model non-contrast and contrast-enhanced knowledge.

• DLKD for contrast-enhanced knowledge transfer: DLKD is a novel knowledge distillation strategy designed to transfer knowledge from the teacher network (contrast-enhanced images) to the student network (non-contrast images) by aligning representations at both the feature level and output level. It minimizes tumor score map discrepancies and matches evidence distributions using Kullback–Leibler divergence, ensuring reliable knowledge transfer through spatial and evidence-level alignment between contrast-enhanced and non-contrast domains.

By utilizing these three modules, EGTA-KD is performed via the following three stages:

• Pretraining teacher network (T-Net) with TACMS and UQEU: The T-Net is fed with contrast-enhanced medical images $\mathbb{X}_{CE} \in \mathbb{R}^{h \times w \times N}$ and a learnable medical prompt $[\mathbf{p}, \mathbf{e}_k]$, where $h$, $w$, and $N$ are the height, width, and number of the input medical images. Next, after extracting vision features and encoding the learnable prompts, TACMS calculates tumor score maps by fusing image features $\mathbf{x}_{ce}^{f} \in \mathbb{R}^{H \times W \times C}$ and prompt features $\mathbf{t} \in \mathbb{R}^{K \times C}$, where $H$, $W$, and $C$ are the height, width, and channel number of the feature maps. Then, the resulting tumor score maps and feature maps are concatenated for evidence modeling. After the softplus operation, evidence modeling by UQEU mainly contains two layers, the evidence layer and the Dirichlet distribution layer, for final segmentation and uncertainty estimation. Finally, under supervised learning with ground truth and optimization via backpropagation, the T-Net is trained to learn uncertainty-aware knowledge in the contrast-enhanced domain.

• S-Net training with DLKD: The S-Net is first fed with non-contrast medical images $\mathbb{X}_{NC} \in \mathbb{R}^{h \times w \times N}$ for image feature extraction. Next, as in the T-Net, the proposed TACMS and UQEU are employed for further feature processing and decision making: TACMS calculates tumor score maps from the image features $\mathbf{x}_{nc}^{f} \in \mathbb{R}^{H \times W \times C}$ and the pre-stored prompt features $\mathbf{t}$ from the T-Net. Finally, in addition to ground-truth supervision, the S-Net is trained by contrast-enhanced knowledge distillation from DLKD, which minimizes tumor score map errors and matches evidence distributions between the contrast-enhanced and non-contrast domains.

• S-Net inference: Inference uses the well-trained S-Net and the pre-stored prompt features: it takes non-contrast medical images and the pre-stored prompt features as input and outputs tumor segmentation results with uncertainty estimation. This process achieves evidence-guided and tumor-aware vision–language learning for tumor segmentation on non-contrast medical images, providing a contrast agent-free tool to assist clinical diagnosis.
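The pixel-level fusion that TACMS performs in the stages above can be illustrated with a small NumPy sketch. This digest does not spell out the paper's "customized matrix operations", so the version below simply assumes a cosine-similarity product between L2-normalised image features of shape (H, W, C) and prompt features of shape (K, C); the function name `tumor_score_maps` and the toy sizes are hypothetical.

```python
import numpy as np

def tumor_score_maps(x_f, t, eps=1e-8):
    """Assumed pixel-text matching: (H, W, C) features x (K, C) prompts -> (H, W, K)."""
    # L2-normalise along the channel axis so each score is a cosine similarity.
    x_hat = x_f / (np.linalg.norm(x_f, axis=-1, keepdims=True) + eps)
    t_hat = t / (np.linalg.norm(t, axis=-1, keepdims=True) + eps)
    # One score per pixel and per prompt; the spatial layout is preserved.
    return np.einsum("hwc,kc->hwk", x_hat, t_hat)

rng = np.random.default_rng(0)
H, W, C, K = 4, 5, 8, 3                      # toy sizes, not the paper's
x_f = rng.normal(size=(H, W, C))             # stand-in for image features
t = rng.normal(size=(K, C))                  # stand-in for prompt features
scores = tumor_score_maps(x_f, t)
# The overview says score maps and feature maps are concatenated before UQEU.
fused = np.concatenate([x_f, scores], axis=-1)
print(scores.shape, fused.shape)
```

Unlike image-level matching on a globally averaged feature vector, every pixel keeps its own K scores here, which is the property the overview attributes to TACMS.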

3.1. 概述

EGTA-KD 是一个用于可靠对比增强知识蒸馏的综合框架。它通过证据建模解决预测不确定性,将对比增强医学图像的域知识迁移至非对比医学图像。该方法融合跨模态交互机制,并采用狄利克雷分布表示证据,在量化不确定性以改进分割性能的同时,实现高效的知识蒸馏。图 2 展示了所提出的 EGTA-KD 框架的整体工作流程,包括其核心模块(肿瘤感知跨模态同步器 TACMS、不确定性量化证据单元 UQEU 和双层次知识蒸馏 DLKD)及其相互作用:

- TACMS(肿瘤感知跨模态学习):TACMS 是一个跨模态(视觉-语言)融合模块,用于图像与文本特征的像素级对齐。它通过融合医学图像的空间特征与文本提示的上下文信息计算肿瘤分数图,实现肿瘤感知的特征整合。借助上下文感知提示学习提取文本嵌入,TACMS 将文本信息嵌入像素级,为分割任务提供支持。
- UQEU(可靠证据建模与不确定性量化):UQEU 作为 EGTA-KD 的决策层,量化像素级分割任务中的不确定性。它利用狄利克雷分布建模证据,并通过登普斯特-谢弗理论分配信任质量,实现预测不确定性的精确量化。这种方法使框架既能捕捉准确预测的置信度,又能反映不确定区域的模糊性,提升对非对比和对比增强知识的建模能力。
- DLKD(对比增强知识迁移):DLKD 是一种新型知识蒸馏策略,通过在特征层和输出层对齐表示,将知识从教师网络(对比增强图像)迁移至学生网络(非对比图像)。它最小化肿瘤分数图差异,并利用 Kullback–Leibler 散度匹配证据分布,通过对比增强域与非对比域的空间和证据层面对齐,确保可靠的知识迁移。

通过上述三个模块,EGTA-KD 的执行分为以下三个阶段:

- 带 TACMS 和 UQEU 的教师网络(T-Net)预训练:向 T-Net 输入对比增强医学图像 $\mathbb{X}_{CE} \in \mathbb{R}^{h \times w \times N}$ 和可学习医学提示 $[\mathbf{p}, \mathbf{e}_k]$(其中 $h$、$w$、$N$ 分别为输入医学图像的高度、宽度和数量)。在提取视觉特征并编码可学习提示后,TACMS 通过融合图像特征 $\mathbf{x}_{ce}^{f} \in \mathbb{R}^{H \times W \times C}$ 和提示特征 $\mathbf{t} \in \mathbb{R}^{K \times C}$(其中 $H$、$W$、$C$ 为特征图的高度、宽度和通道数)计算肿瘤分数图。随后,将生成的肿瘤分数图与特征图拼接用于证据建模。经过 softplus 操作后,UQEU 的证据建模主要包含两层:证据层和狄利克雷分布层,用于最终分割和不确定性估计。最后,在真值监督和反向传播优化下,T-Net 被训练以学习对比增强域中具有不确定性感知的知识。
- 带 DLKD 的学生网络(S-Net)训练:首先向 S-Net 输入非对比医学图像 $\mathbb{X}_{NC} \in \mathbb{R}^{h \times w \times N}$ 以提取图像特征。与 T-Net 流程类似,采用 TACMS 和 UQEU 进行进一步特征处理和决策,其中 TACMS 通过输入图像特征 $\mathbf{x}_{nc}^{f} \in \mathbb{R}^{H \times W \times C}$ 和 T-Net 中预存储的提示特征 $\mathbf{t}$ 计算肿瘤分数图。最终,除真值监督外,S-Net 还通过 DLKD 进行对比增强知识蒸馏训练,其中 DLKD 最小化对比增强域与非对比域的肿瘤分数图误差并匹配证据分布。
- S-Net 推理:S-Net 推理通过训练良好的 S-Net 和预存储的提示特征实现非对比域知识学习,输入非对比医学图像和预存储提示特征,输出带不确定性估计的肿瘤分割结果。该过程实现了证据引导的肿瘤感知视觉-语言学习,用于非对比医学图像肿瘤分割,为临床诊断提供无造影剂辅助工具。
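The evidence and Dirichlet layers of UQEU can be sketched with the subjective-logic relations that are standard in evidential deep learning. This digest does not give the paper's equations, so the mapping below (evidence from softplus, alpha = evidence + 1, belief = evidence / S, uncertainty = K / S) is an assumption, and `evidence_head` is a hypothetical name.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps evidence non-negative.
    return np.logaddexp(0.0, z)

def evidence_head(logits):
    """Assumed evidential head: per-pixel logits (..., K) -> Dirichlet quantities."""
    evidence = softplus(logits)                       # e_k >= 0
    alpha = evidence + 1.0                            # Dirichlet parameters
    strength = alpha.sum(axis=-1, keepdims=True)      # S = sum_k alpha_k
    belief = evidence / strength                      # belief mass per class
    uncertainty = logits.shape[-1] / strength[..., 0] # u = K / S
    prob = alpha / strength                           # expected class probability
    return alpha, belief, uncertainty, prob

# Two toy pixels with K = 2 classes: one confident, one ambiguous.
logits = np.array([[[6.0, -6.0], [0.0, 0.0]]])
alpha, belief, uncertainty, prob = evidence_head(logits)
print(prob.argmax(axis=-1), uncertainty)
```

Under these relations the belief masses and the uncertainty mass sum to one at every pixel, so a pixel with little total evidence is reported as uncertain instead of receiving a confident softmax-style probability.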

Conclusion

结论

The proposed EGTA-KD provides a novel knowledge distillation framework with evidence modeling and cross-modal learning for non-contrast medical image segmentation. The innovative TACMS introduces tumor score maps for tumor-aware cross-modal learning, achieving pixel-wise text feature embedding while incorporating sufficient spatial information from images. The DLKD distills contrast-enhanced knowledge from the contrast-enhanced to the non-contrast domains by minimizing tumor score map errors and matching evidence distribution. Additionally, the proposed UQEU quantifies the uncertainty of contrast-enhanced features to gather reliable evidence in the tumor region, marking the first time the reliability of contrast-enhanced knowledge has been considered in AI contrast agent generation. Furthermore, context-aware prompt learning captures contextual information in diverse medical prompts, enhancing the extraction and robustness of medical prompt features. Ablation studies indicated that each component in our EGTA-KD contributes to tumor segmentation on non-contrast medical images. Extensive experimental results demonstrated that our proposed EGTA-KD outperforms the compared state-of-the-art methods in non-contrast medical image segmentation. Therefore, EGTA-KD provides a reliable tool for contrast-enhanced knowledge distillation, having great potential to provide a contrast agent-free method with safe, time-saving, and cost-effective advantages for tumor segmentation.

所提出的EGTA-KD为非对比医学图像分割提供了一种融合证据建模与跨模态学习的新型知识蒸馏框架。创新的肿瘤感知跨模态同步器(TACMS)引入肿瘤分数图用于肿瘤感知跨模态学习,在整合图像丰富空间信息的同时,实现了像素级文本特征嵌入。双层次知识蒸馏(DLKD)通过最小化肿瘤分数图误差和匹配证据分布,将对比增强知识从对比增强域迁移至非对比域。此外,所提出的不确定性量化证据单元(UQEU)对对比增强特征的不确定性进行量化,以在肿瘤区域聚合可靠证据,这也是首次在人工智能造影剂生成领域考虑对比增强知识的可靠性。同时,上下文感知提示学习捕捉不同医学提示中的上下文信息,提升了医学提示特征的提取效果和鲁棒性。消融实验表明,EGTA-KD中的每个组件均对非对比医学图像肿瘤分割性能有贡献。大量实验结果证实,所提出的EGTA-KD在非对比医学图像分割任务中性能优于对比的现有先进方法。因此,EGTA-KD为对比增强知识蒸馏提供了可靠工具,有望成为一种无造影剂的肿瘤分割方法,具备安全、省时、经济高效的优势。

Results

结果

The EGTA-KD is validated in the brain, liver, and kidney tumor segmentation on non-contrast medical images. The visual segmentation results are shown in Fig. 6 and the quantitative analysis results of DSC and HD95 are shown in Tables 2 and 3, respectively. Experimental results indicate that EGTA-KD accurately segments tumors, which achieves an average DSC value of 0.908 for brain tumor segmentation, an average DSC value of 0.905 for liver tumor segmentation, and a DSC value of 0.763 for kidney tumor segmentation. For HD95 evaluation criteria, EGTA-KD achieves an average HD95 value of 6.23 mm for brain tumor segmentation, an average HD95 value of 3.52 mm for liver tumor segmentation, and an HD95 value of 3.05 mm for kidney tumor segmentation.

EGTA-KD在非对比医学图像的脑肿瘤、肝肿瘤和肾肿瘤分割任务中进行了验证。视觉分割结果如图6所示,Dice相似系数(DSC)和95%豪斯多夫距离(HD95)的定量分析结果分别如表2和表3所示。实验结果表明,EGTA-KD能够准确分割肿瘤:在脑肿瘤分割中达到0.908的平均DSC值,在肝肿瘤分割中达到0.905的平均DSC值,在肾肿瘤分割中达到0.763的DSC值。在HD95评价指标上,EGTA-KD在脑肿瘤分割中的平均HD95值为6.23毫米,在肝肿瘤分割中的平均HD95值为3.52毫米,在肾肿瘤分割中的HD95值为3.05毫米。
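For readers unfamiliar with the two metrics reported above, here is a minimal 2D NumPy sketch of DSC and HD95. It is an illustration only, not the evaluation code behind the reported numbers: HD95 conventions differ between toolkits, and this version assumes the 95th percentile of the pooled boundary-to-boundary distances in both directions, a 4-neighbour boundary definition, and non-empty masks. The toy masks are made up.

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 1.0

def boundary(mask):
    """Foreground pixels with at least one background 4-neighbour."""
    mask = mask.astype(bool)
    m = np.pad(mask, 1)
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return mask & ~interior

def hd95(pred, gt, spacing=(1.0, 1.0)):
    """95th percentile of pooled symmetric boundary distances (both masks non-empty)."""
    p = np.argwhere(boundary(pred)) * np.asarray(spacing)
    g = np.argwhere(boundary(gt)) * np.asarray(spacing)
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return np.percentile(np.concatenate([d.min(axis=1), d.min(axis=0)]), 95)

# Toy example: a 4x4 square and the same square shifted by one pixel.
gt = np.zeros((10, 10), dtype=bool)
gt[2:6, 2:6] = True
pred = np.zeros((10, 10), dtype=bool)
pred[2:6, 3:7] = True
dsc_value = dice(pred, gt)
hd95_value = hd95(pred, gt)   # in pixel units here; pass spacing in mm for mm
print(dsc_value, hd95_value)
```

Higher DSC means better overlap, while lower HD95 means the predicted boundary stays closer to the ground-truth boundary, which is why the two metrics are reported side by side.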

Figure

图片

Fig. 1. Overview of the significance of reliable contrast-enhanced knowledge distillation. (A) Challenges of contrast-enhanced knowledge distillation, (B) Limitations of existing knowledge distillation framework for contrast-enhanced knowledge distillation, and (C) Advantages of our proposed EGTA-KD.

图1可靠对比增强知识蒸馏的意义概述。(A)对比增强知识蒸馏面临的挑战;(B)现有知识蒸馏框架在对比增强知识蒸馏中的局限性;(C)我们提出的EGTA-KD方法的优势。 

图片

Fig. 2. The simplified workflow of EGTA-KD presents the relationship between Teacher-Net, Student-Net, and the core modules (TACMS, UQEU, and DLKD).

图2 EGTA-KD的简化工作流程展示了教师网络(Teacher-Net)、学生网络(Student-Net)与核心模块(肿瘤感知跨模态同步器TACMS、不确定性量化证据单元UQEU、双层次知识蒸馏DLKD)之间的关系。

图片

Fig. 3. Overview of our proposed EGTA-KD. It mainly contains three innovative parts for reliable contrast-enhanced knowledge distillation: TACMS for tumor-aware cross-modal learning, UQEU for reliable contrast-enhanced knowledge modeling with uncertainty quantification, and DLKD for reliable contrast agent distillation at both tumor score map and evidence level.

图3 我们提出的EGTA-KD方法概述。该方法主要包含三个创新部分,用于实现可靠的对比增强知识蒸馏:肿瘤感知跨模态同步器(TACMS)用于肿瘤感知跨模态学习;不确定性量化证据单元(UQEU)用于带不确定性量化的可靠对比增强知识建模;双层次知识蒸馏(DLKD)用于在肿瘤分数图和证据层面实现可靠的造影剂蒸馏。

图片

Fig. 4. Algorithm comparison: classic text feature embedding, cross-attention, and proposed TACMS. (A) Classic text feature embedding performs global image-level fusion or direct concatenation, which loses spatial information critical for segmentation. (B) Cross-attention mechanisms align features through query–key–value projections but introduce significant computational overhead. (C) The proposed TACMS generates tumor score maps, preserving rich spatial details and enabling pixel-level text embedding tailored for medical image segmentation tasks.

图4 算法对比:经典文本特征嵌入、交叉注意力机制与所提出的TACMS。(A)经典文本特征嵌入采用全局图像级融合或直接拼接方式,丢失了分割任务关键的空间信息。(B)交叉注意力机制通过查询-键-值投影实现特征对齐,但引入了显著的计算开销。(C)所提出的TACMS生成肿瘤分数图,保留丰富的空间细节,实现专为医学图像分割任务设计的像素级文本嵌入。

图片

Fig. 5. Algorithm comparison between classic knowledge distillation in vision language model and our DLKD. (A) The knowledge distillation in the classic method is performed between independent T-Net and S-Net on the response level. (B) Our DLKD concerns the feature level distillation using tumor score map and the response level distillation using evidence. Moreover, to further improve distillation and reduce computation, we reuse the common text features corresponding to the paired contrast-enhanced medical images and non-contrast medical images.

图5 经典视觉语言模型知识蒸馏与我们提出的DLKD的算法对比。(A)经典方法中的知识蒸馏在独立的教师网络(T-Net)和学生网络(S-Net)之间于响应层面执行。(B)我们的DLKD同时关注基于肿瘤分数图的特征层面蒸馏和基于证据的响应层面蒸馏。此外,为进一步改进蒸馏效果并减少计算量,我们复用了与成对对比增强医学图像和非对比医学图像对应的共同文本特征。
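The two distillation levels named in the Fig. 5 caption can be sketched numerically. The digest states only that DLKD minimizes tumor score map errors and matches evidence distributions with KL divergence, so several choices below are assumptions: mean squared error for the feature-level term, KL(teacher || student) between per-pixel Dirichlet distributions for the response-level term, and a hypothetical weight `lam`. The `digamma` helper is a small series approximation written here to keep the sketch dependency-free.

```python
import math
import numpy as np

gammaln = np.vectorize(math.lgamma, otypes=[float])

def digamma(x):
    """Digamma for x > 0: shift x up to >= 6, then use the asymptotic series."""
    x = np.asarray(x, dtype=float).copy()
    result = np.zeros_like(x)
    for _ in range(6):
        small = x < 6.0
        result -= np.where(small, 1.0 / x, 0.0)   # psi(x) = psi(x + 1) - 1/x
        x = np.where(small, x + 1.0, x)
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + np.log(x) - 0.5 / x - series

def dirichlet_kl(alpha_t, alpha_s):
    """KL(Dir(alpha_t) || Dir(alpha_s)) over the last axis."""
    a0 = alpha_t.sum(axis=-1, keepdims=True)
    b0 = alpha_s.sum(axis=-1, keepdims=True)
    log_norm = ((gammaln(a0) - gammaln(b0))[..., 0]
                - gammaln(alpha_t).sum(axis=-1) + gammaln(alpha_s).sum(axis=-1))
    mean_term = ((alpha_t - alpha_s) * (digamma(alpha_t) - digamma(a0))).sum(axis=-1)
    return log_norm + mean_term

def dlkd_loss(score_t, score_s, alpha_t, alpha_s, lam=1.0):
    """Assumed dual-level loss: score-map MSE plus weighted Dirichlet KL."""
    feature_term = np.mean((score_t - score_s) ** 2)
    response_term = np.mean(dirichlet_kl(alpha_t, alpha_s))
    return feature_term + lam * response_term

# Toy teacher/student outputs for a 2x2 map with K = 2 classes.
rng = np.random.default_rng(0)
score_t = rng.uniform(-1, 1, size=(2, 2, 2))
score_s = rng.uniform(-1, 1, size=(2, 2, 2))
alpha_t = 1.0 + rng.uniform(0, 5, size=(2, 2, 2))
alpha_s = 1.0 + rng.uniform(0, 5, size=(2, 2, 2))
loss = dlkd_loss(score_t, score_s, alpha_t, alpha_s)
print(float(loss))
```

In training, the teacher quantities would be held fixed and only the student would receive gradients; that part depends on the deep learning framework and is left out of this sketch.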

图片

Fig. 6. Visual examples of tumor segmentation on non-contrast medical images. Subjects 1 and 2 show brain tumor segmentation on multi-modality non-contrast medical images (T1, T2, and FLAIR). Subjects 3 and 4 show liver tumor segmentation on non-contrast T1 and T2 images. Subjects 5 and 6 show kidney tumor segmentation on non-contrast CT. The zoomed local patches are the brain tumor region, liver region, and kidney region for these three types of subjects. Our EGTA-KD achieves the best overlap with the ground truth among the compared methods.

图6 非对比医学图像肿瘤分割的视觉示例。   受试者1和受试者2为多模态非对比医学图像(T1、T2和Flair)的脑肿瘤分割结果;受试者3和受试者4为非对比医学图像(T1和T2)的肝肿瘤分割结果;受试者5和受试者6为非对比CT图像的肾肿瘤分割结果。其中,放大的局部区域分别为这三类受试者的脑肿瘤区域、肝脏区域和肾脏区域。在所有对比方法中,我们提出的EGTA-KD与真值的重叠度最高。

图片

Fig. 7. Structures of (A) our proposed EGTA-KD, (B) EGTA-KD without DLKD, (C) EGTA-KD without UQEU, and (D) EGTA-KD without TACMS.

图7不同模型结构对比:(A)我们提出的EGTA-KD完整结构;(B)移除双层次知识蒸馏(DLKD)的EGTA-KD结构;(C)移除不确定性量化证据单元(UQEU)的EGTA-KD结构;(D)移除肿瘤感知跨模态同步器(TACMS)的EGTA-KD结构。 

图片

Fig. 8. Salient maps obtained by our EGTA-KD and by the ablation study frameworks w/o DLKD, w/o UQEU, and w/o TACMS. Salient maps provide intuitive visualization for evaluating the contribution of DLKD, UQEU, and TACMS. They also provide a confidence evaluation of tumor segmentation.

图8 显著性图由我们提出的EGTA-KD及去除DLKD(w∕o DLKD)、去除UQEU(w∕o UQEU)、去除TACMS(w∕o TACMS)的消融实验框架生成。显著性图为评估DLKD、UQEU和TACMS的贡献提供了直观可视化依据,同时也为肿瘤分割的置信度评估提供了参考。

图片

Fig. 9. Segmentation results with uncertainty estimation obtained by the compared method, MC-dropout, and by our UQEU. Our UQEU outperforms MC-dropout for tumor segmentation and uncertainty estimation.

图9 分割结果及不确定性估计由对比方法(MC-dropout)和我们提出的UQEU生成。在肿瘤分割和不确定性估计任务中,我们的UQEU性能优于MC-dropout。

图片

Fig. 10. Performance comparison for ablation study of context-aware prompt learning among three different medical prompt templates. This figure presents a comparative analysis of the DSC and HD95 metrics across eight tumor classes. The top row illustrates results for brain MRI tumors (WT, ET, TC, Avg$_{\text{Brain}}$), while the bottom row shows results for liver and kidney tumors (hemangioma, HCC, Avg$_{\text{Liver}}$, kidney tumor). The left y-axis (blue) represents DSC values, and the right y-axis (olive) represents HD95 values in millimeters. Blue bars indicate DSC values, and olive bars indicate HD95 values, both accompanied by error bars to show standard deviations. Each subplot highlights the highest DSC with a green dashed line (- - -) labeled "Highest DSC" and the lowest HD95 with a red dashed line (- - -) labeled "Lowest HD95", providing a clear comparison of segmentation performance across different methods.

图10 上下文感知提示学习在三种不同医学提示模板间的消融实验性能对比。该图展示了八种肿瘤类别的Dice相似系数(DSC)和95%豪斯多夫距离(HD95)指标对比分析。上排为脑部MRI肿瘤结果(WT、ET、TC、脑肿瘤平均值Avg₈ᵣₐᵢₙ),下排为肝脏和肾脏肿瘤结果(血管瘤、肝细胞癌HCC、肝脏肿瘤平均值Avgₗᵢᵥₑᵣ、肾肿瘤)。左侧纵轴(蓝色)表示DSC值,右侧纵轴(橄榄色)表示HD95值(单位:毫米)。蓝色柱状图( )代表DSC值,橄榄色柱状图( )代表HD95值,两者均配有误差线以展示标准差。每个子图中,绿色虚线(- - -)标注“最高DSC(Highest DSC)”以突出最高DSC值,红色虚线(- - -)标注“最低HD95(Lowest HD95)”以突出最低HD95值,为不同方法的分割性能提供清晰对比。

Table

图片

Table 1. Template-based textual data for cross-modal learning.

表1 用于跨模态学习的模板化文本数据。

图片

Table 2. The quantitative evaluation of DSC with respect to tumor segmentation on non-contrast medical images and the paired t-test results between our EGTA-KD and other methods. The ± numbers represent the standard deviation calculated by 5-fold cross-validation. The p-values indicate statistical significance.

表2   非对比医学图像肿瘤分割的Dice相似系数(DSC)定量评估结果,以及本研究提出的EGTA-KD与其他方法的配对t检验结果。   ±后的数值表示通过5折交叉验证计算得到的标准差,p值表示统计显著性。
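The paired t-test mentioned in the table captions can be illustrated with the standard library alone. The digest does not say whether the pairing is per fold or per subject, so the sketch below assumes fold-wise pairing over 5-fold cross-validation; the scores are made-up placeholders, not values from the paper, and `paired_t_statistic` is a hypothetical helper.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired t-test on matched scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Placeholder fold-wise DSC values for two methods (illustrative only).
method_a = [0.91, 0.90, 0.92, 0.90, 0.91]
method_b = [0.88, 0.89, 0.90, 0.87, 0.89]
t_stat, dof = paired_t_statistic(method_a, method_b)

# Two-sided critical value of Student's t for 4 degrees of freedom at alpha = 0.05.
T_CRIT_DF4 = 2.776
print(round(t_stat, 3), dof, abs(t_stat) > T_CRIT_DF4)
```

The test works on the per-pair differences, so it accounts for the fact that both methods are evaluated on the same folds. Converting the statistic into a p-value needs the t-distribution CDF, which a statistics library would normally supply.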

图片

Table 3. The quantitative evaluation of HD95 with respect to tumor segmentation on non-contrast medical images and the paired t-test results between our EGTA-KD and other methods. The ± numbers represent the standard deviation calculated by 5-fold cross-validation. The p-values indicate statistical significance.

表3   非对比医学图像肿瘤分割的95%豪斯多夫距离(HD95)定量评估结果,以及本研究提出的EGTA-KD与其他方法的配对t检验结果。   ±后的数值表示通过5折交叉验证计算得到的标准差,p值表示统计显著性。 

图片

Table 4. The quantitative evaluation of DSC and the corresponding paired t-test results for the ablation studies in non-contrast medical image segmentation. The ± numbers represent the standard deviation calculated by 5-fold cross-validation. The p-values indicate the statistical significance of the ablation components.

表4   非对比医学图像分割消融实验的Dice相似系数(DSC)定量评估结果及相应的配对t检验结果。   ±后的数值表示通过5折交叉验证计算得到的标准差,p值表示各消融组件的统计显著性。 

图片

Table 5. The quantitative evaluation of HD95 and the corresponding paired t-test results for the ablation studies in non-contrast medical image segmentation. The ± numbers represent the standard deviation calculated by 5-fold cross-validation. The p-values indicate the statistical significance of the ablation components.

表5   非对比医学图像分割消融实验的95%豪斯多夫距离(HD95)定量评估结果及相应的配对t检验结果。   ±后的数值表示通过5折交叉验证计算得到的标准差,p值表示各消融组件的统计显著性。

图片

Table 6. DSC metrics for UCSF and UPENN test datasets under varying training data distributions.

表6   不同训练数据分布下,UCSF和UPENN测试数据集的Dice相似系数(DSC)指标。

图片

Table 7. HD95 (mm) metrics for UCSF and UPENN test datasets under varying training data distributions.

表7   不同训练数据分布下,UCSF和UPENN测试数据集的95%豪斯多夫距离(HD95,单位:毫米)指标。
