[MIA 2025]CLIP in medical imaging: A survey

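Paper link: CLIP in medical imaging: A survey - ScienceDirect

Project page: github.com

The English here is typed entirely by hand! It summarizes and paraphrases the original paper. Spelling and grammar mistakes may be unavoidable; if you find any, corrections in the comments are welcome! This post leans toward personal notes, so read it with caution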

Contents

1. Takeaways

2. Section-by-section close reading

2.1. Abstract

2.2. Introduction

2.3. Background

2.3.1. Contrastive language-image pre-training

2.3.2. Variants of CLIP

2.3.3. Medical image–text dataset

2.4. CLIP in medical image–text pre-training

2.4.1. Challenges of CLIP pre-training

2.4.2. Multi-scale contrast

2.4.3. Data-efficient contrast

2.4.4. Explicit knowledge enhancement

2.4.5. Others

2.4.6. Summary

2.5. CLIP-driven applications

2.5.1. Classification

2.5.2. Dense prediction

2.5.3. Cross-modal tasks

2.5.4. Summary

2.6. Comparative analysis

2.7. Discussions and future directions

2.8. Conclusion

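1. Takeaways

(1) I will probably only record the parts of this paper that stand out. Basic CLIP and medical imaging background are not recorded here; see the original paper. The main reason is that the paper is too long and there is no need to copy all of it

(2) Why does the figure style differ throughout the paper? Did each author draw one figure and then stitch them together?

(3) The paper reads more like a catalogue: it introduces a very large number of different models

2. Section-by-section close reading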

2.1. Abstract

        ①Simply states that CLIP is meaningful for medical imaging and is therefore worth exploring

2.2. Introduction

        ①Limitations: poor performance on out-of-distribution data

        ②The trend of CLIP-related papers (left) and the medical images covered in those papers (right):

        ③How CLIP is used:

2.3. Background

2.3.1. Contrastive language-image pre-training

        ①How CLIP works (if you have not read it, go find the original CLIP paper, which is very clear and easy to follow):

        ②Performance of CLIP in medical field:
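To make item ① above a bit more concrete, here is a minimal sketch of the symmetric contrastive objective that CLIP-style pre-training uses: matched image-text pairs sit on the diagonal of a similarity matrix, and cross-entropy is applied in both the image-to-text and text-to-image directions. The function names and toy vectors are my own for illustration, and the temperature is fixed here for simplicity (assumption: 0.07 as a starting value; in practice it is usually learned).

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j]: cosine similarity of image i and text j, divided by temperature
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))

# Toy batch: two images and two reports with 2-D embeddings (hypothetical values)
aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

With correctly paired embeddings the loss is close to zero, while swapping the reports makes it large, which is exactly the signal that pulls matched pairs together and pushes the rest apart.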

2.3.2. Variants of CLIP

        ①Introduces several variants, but since there are no figures it is hard to remember them or to see at a glance how they differ

2.3.3. Medical image–text dataset

        ①Open medical dataset:

2.4. CLIP in medical image–text pre-training

        ①Representative CLIP-based medical models:

2.4.1. Challenges of CLIP pre-training

        ①Challenges of CLIP in medical image field: 

Modality-dependent: both local and global image/text analysis are needed

Scarce data (wasn't zero-shot generalization supposed to be good already? Why is data scarcity a problem again?)

Needs professional knowledge

2.4.2. Multi-scale contrast

        ①GLoRIA matches words in the text with image sub-regions:

        ②LoVT further assigns different weights to different sentences
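The local matching idea in this subsection can be sketched roughly as follows: each text token attends over the image region features, and the token is then compared with its attention-weighted image context; per-token weights stand in for the idea of weighting text units differently. This is only an illustrative sketch under my own assumptions (function names, temperature, and the weighted-mean aggregation are hypothetical), not the exact formulation of GLoRIA or LoVT.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attended_context(token, regions, temperature=0.1):
    """Attention-weighted average of region features for one text token."""
    attn = softmax([cosine(token, r) / temperature for r in regions])
    dim = len(regions[0])
    return [sum(w * r[d] for w, r in zip(attn, regions)) for d in range(dim)]

def local_alignment(tokens, regions, token_weights=None):
    """Weighted mean of cosine(token, its attended image context)."""
    if token_weights is None:
        token_weights = [1.0] * len(tokens)
    score = 0.0
    for tok, w in zip(tokens, token_weights):
        score += w * cosine(tok, attended_context(tok, regions))
    return score / sum(token_weights)

# Toy features (hypothetical): one region matches the token, the other does not
matched = local_alignment([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
unmatched = local_alignment([[1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]])
weighted = local_alignment([[1.0, 0.0], [0.0, 1.0]],
                           [[1.0, 0.0], [1.0, 0.0]],
                           token_weights=[3.0, 1.0])
```

A global contrast would compare only one pooled image vector with one pooled text vector; the local score above can stay high even when the relevant finding occupies a small part of the image.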

2.4.3. Data-efficient contrast

        ①Blindly pushing all negative pairs apart might reduce the similarity between cases of similar diseases:

        ②Add description or shuffle sentences

        ③Using medical image video
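One way to read item ① above: if two studies in a batch share the same disease, treating them as hard negatives is a false negative. A common remedy is to replace the one-hot targets with soft targets built from label overlap. The sketch below is a simplified illustration under my own assumptions (multi-hot disease labels are available for every sample; the function names are hypothetical), not the exact loss of any particular paper.

```python
import math

def log_softmax(row):
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def soft_targets(labels):
    """Row-normalised overlap between multi-hot disease label vectors."""
    targets = []
    for i, a in enumerate(labels):
        overlap = [sum(x * y for x, y in zip(a, b)) for b in labels]
        total = sum(overlap)
        if total == 0:
            # No findings at all: fall back to the usual one-hot target
            targets.append([1.0 if j == i else 0.0 for j in range(len(labels))])
        else:
            targets.append([o / total for o in overlap])
    return targets

def soft_contrastive_loss(logits, targets):
    """Cross-entropy between softmax(logits) rows and soft target rows."""
    loss = 0.0
    for row, tgt in zip(logits, targets):
        logp = log_softmax(row)
        loss -= sum(t * lp for t, lp in zip(tgt, logp))
    return loss / len(logits)

# Samples 0 and 1 share a disease, sample 2 has a different one (toy labels)
labels = [[1, 0], [1, 0], [0, 1]]
targets = soft_targets(labels)
pair_kept_close = [[5.0, 5.0, 0.0], [5.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
pair_pushed_apart = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
loss_close = soft_contrastive_loss(pair_kept_close, targets)
loss_apart = soft_contrastive_loss(pair_pushed_apart, targets)
```

With soft targets, pushing the two same-disease samples apart is penalised, whereas a plain one-hot contrastive loss would reward it.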

2.4.4. Explicit knowledge enhancement

        ①Combined with a graph or knowledge graph (KG):

2.4.5. Others

        ~

2.4.6. Summary

        ~

2.5. CLIP-driven applications

2.5.1. Classification

        ①CLIP based models on image classification:

(1)Zero-shot classification

        ①Diagnosis example (wow, so it can be done this way, as a binary classification):

        ②How Xplainer works (wow, that is impressive; so this is how people use CLIP these days):
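The binary diagnosis trick in item ① can be sketched like this: encode one positive prompt and one negative prompt for a finding, compare both with the image embedding, and take a softmax over the two similarities. The embeddings below are toy vectors standing in for the outputs of a CLIP-like image and text encoder, and the prompt wording in the comments is a hypothetical example, not taken from the paper.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def zero_shot_binary(image_emb, pos_text_emb, neg_text_emb, temperature=0.07):
    """Return P(finding present) from a softmax over two prompt similarities."""
    s_pos = cosine(image_emb, pos_text_emb) / temperature
    s_neg = cosine(image_emb, neg_text_emb) / temperature
    m = max(s_pos, s_neg)
    e_pos, e_neg = math.exp(s_pos - m), math.exp(s_neg - m)
    return e_pos / (e_pos + e_neg)

# Assumed stand-ins for encoder outputs of prompts such as
# "pneumonia" (positive) and "no pneumonia" (negative)
pos_prompt = [1.0, 0.0]
neg_prompt = [0.0, 1.0]
p_present = zero_shot_binary([1.0, 0.2], pos_prompt, neg_prompt)
p_unsure = zero_shot_binary([1.0, 1.0], pos_prompt, neg_prompt)
```

No classifier head is trained here; the decision comes entirely from how close the image lies to each prompt in the shared embedding space.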

(2)Context optimization

        ①Example of context optimization:

There is hardly any explanation here, so it does not help the reader get started quickly, haha
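Since the notes above give little detail, here is a toy sketch of the general idea behind context optimization as I understand it: the encoders and class-name embeddings stay frozen, and only a small learnable context vector placed in front of each class name is tuned on a few labelled samples. Everything in the block is an assumption made for illustration: `mean_pool` is a stand-in for a frozen text encoder, the vectors are toy values, and finite-difference gradients replace the backpropagation a real implementation would use.

```python
import math

def mean_pool(tokens):
    """Stand-in for a frozen text encoder: average the token embeddings."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def loss(context, class_tokens, samples, temperature=0.1):
    """Cross-entropy of labelled image embeddings against prompts [context, class]."""
    prompts = [mean_pool([context, c]) for c in class_tokens]
    total = 0.0
    for image_emb, label in samples:
        logits = [cosine(image_emb, p) / temperature for p in prompts]
        m = max(logits)
        lse = m + math.log(sum(math.exp(x - m) for x in logits))
        total += lse - logits[label]
    return total / len(samples)

def tune_context(context, class_tokens, samples, lr=0.5, steps=50, eps=1e-4):
    """Update only the context vector; class tokens and encoder stay fixed."""
    context = list(context)
    for _ in range(steps):
        base = loss(context, class_tokens, samples)
        grad = []
        for d in range(len(context)):
            bumped = list(context)
            bumped[d] += eps
            grad.append((loss(bumped, class_tokens, samples) - base) / eps)
        context = [c - lr * g for c, g in zip(context, grad)]
    return context

class_tokens = [[1.0, 0.0], [0.0, 1.0]]            # frozen class-name embeddings
samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]       # few-shot (image_emb, label)
initial_context = [2.0, 2.0]                       # a poorly chosen starting context
tuned_context = tune_context(initial_context, class_tokens, samples)
loss_before = loss(initial_context, class_tokens, samples)
loss_after = loss(tuned_context, class_tokens, samples)
```

The point of the design is that the number of trainable parameters is tiny compared with fine-tuning the whole model, which suits the few-shot settings common in medical imaging.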

2.5.2. Dense prediction

        ①Methods:

(1)Detection

        ①Lists relevant models

(2)2D medical image segmentation

        ①Fine-tune CLIP on 2D medical image datasets

(3)3D medical image segmentation

        ①Examples:

(4)Others

2.5.3. Cross-modal tasks

        ①Representative models:

(1)Generation

        ①Automatically generate medical report or medical image

(2)Medical visual question answering

        ①Example (this architecture looks rather odd):

(3)Image–text retrieval

        ①Current models focus on global image feature

        ②X-TRA:
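Retrieval with global features, as in item ① above, boils down to a nearest-neighbour search in the shared embedding space: embed the query image, compare it with every candidate report embedding, and return the top-k. The sketch below uses toy vectors in place of real encoder outputs, and the function name is hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_emb, candidate_embs, k=2):
    """Return indices of the k candidates most similar to the query."""
    ranked = sorted(range(len(candidate_embs)),
                    key=lambda i: cosine(query_emb, candidate_embs[i]),
                    reverse=True)
    return ranked[:k]

# Toy report embeddings (hypothetical); the query is an image embedding
report_embs = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1]]
top2 = retrieve([1.0, 0.0], report_embs, k=2)
```

Because only one pooled vector per image is compared, small local findings can be drowned out, which is the limitation of global-feature retrieval that the notes point to.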

2.5.4. Summary

        ~

2.6. Comparative analysis

        ①How multimodal large language models (MLLMs) differ from CLIP:

        ②Performance of CLIP on different image sets:

2.7. Discussions and future directions

        ①Inter-disease similarity:

        ②Challenges: inconsistency between pre-training and application, incomprehensive evaluation of refined pre-training, challenges of volumetric imaging, limited scope of refined CLIP pre-training, debiasing in CLIP Models, enhancing adversarial robustness of CLIP, exploring the potential of metadata, incorporation of high-order correlations, beyond image–text alignment

2.8. Conclusion

        ~
