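Paper: Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors
Code: GitHub - MedARC-AI/fMRI-reconstruction-NSD: fMRI-to-image reconstruction on the NSD dataset.
The English here is typed entirely by hand! It summarizes and paraphrases the original paper. Unavoidable spelling and grammar mistakes may appear; if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with caution.
Contents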
2.3.1. High-Level (Semantic) Pipeline
2.3.2. Low-Level (Perceptual) Pipeline
2.4.2. fMRI-to-Image Reconstruction
1. Thoughts
(1) Hmm, this is a conceptually simple paper that can be skimmed quickly.
2. Section-by-Section Close Reading
2.1. Abstract
①MindEye maps fMRI signals to CLIP image space
2.2. Introduction
①Example reconstructed images:
2.3. MindEye
①Framework:
2.3.1. High-Level (Semantic) Pipeline
①Components of MLP backbone: a linear layer, 4 residual blocks and a linear layer
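To make the layer ordering concrete, here is a minimal sketch of a backbone with one input linear layer, 4 residual blocks and one output linear layer. It is a simplification and not the authors' code: the toy sizes, the ReLU activation, the single-layer residual blocks and every function name are my own assumptions, and pure Python lists are used only so that it runs without dependencies.

```python
import random

def init_linear(n_in, n_out, rng):
    """Random weights (list of rows) and zero biases for one dense layer."""
    scale = (1.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(x, layer):
    """Apply y = W x + b to a single input vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_backbone(n_voxels, hidden, n_out, n_blocks=4, seed=0):
    """Linear layer -> n_blocks residual blocks -> linear layer (toy sizes)."""
    rng = random.Random(seed)
    return {
        "in": init_linear(n_voxels, hidden, rng),
        "blocks": [init_linear(hidden, hidden, rng) for _ in range(n_blocks)],
        "out": init_linear(hidden, n_out, rng),
    }

def backbone_forward(voxels, params):
    h = relu(linear(voxels, params["in"]))
    for block in params["blocks"]:
        update = relu(linear(h, block))
        h = [a + b for a, b in zip(h, update)]  # skip connection
    return linear(h, params["out"])

# Toy usage: 10 "voxels" mapped to a 6-dimensional embedding.
params = init_backbone(n_voxels=10, hidden=8, n_out=6)
embedding = backbone_forward([0.1] * 10, params)
```

The skip connection keeps the hidden width fixed across the 4 blocks, so only the first and last linear layers change the dimensionality.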
②Loss: MSE and bidirectional CLIP loss
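③They changed the CLIP loss to MixCo. Voxels are first mixed with mixup:
$$x_{\text{mix}_{i,k_i}} = \lambda_i \cdot x_i + (1-\lambda_i) \cdot x_{k_i}, \quad p_i^* = f(x_{\text{mix}_{i,k_i}}), \quad p_i = f(x_i), \quad t_i = \text{CLIP}_{\text{Image}}(y_i)$$
where $\lambda_i$ is sampled from a Beta distribution with $\alpha = \beta = 0.15$, $x_i$ and $y_i$ are the $i$-th fMRI sample and image, $k_i$ is the mixing index of sample $i$, and $f$ is the MLP with its projector. The bidirectional loss is
$$\mathcal{L}_{\text{BiMixCo}} = -\sum_{i=1}^{N}\left[\lambda_i \cdot \log\frac{\exp(p_i^* \cdot t_i/\tau)}{\sum_{m=1}^{N}\exp(p_i^* \cdot t_m/\tau)} + (1-\lambda_i) \cdot \log\frac{\exp(p_i^* \cdot t_{k_i}/\tau)}{\sum_{m=1}^{N}\exp(p_i^* \cdot t_m/\tau)}\right] - \sum_{j=1}^{N}\left[\lambda_j \cdot \log\frac{\exp(p_j^* \cdot t_j/\tau)}{\sum_{m=1}^{N}\exp(p_m^* \cdot t_j/\tau)} + \sum_{\{l \mid k_l = j\}}(1-\lambda_l) \cdot \log\frac{\exp(p_l^* \cdot t_j/\tau)}{\sum_{m=1}^{N}\exp(p_m^* \cdot t_j/\tau)}\right],$$
where $\tau$ is a temperature hyperparameter and $N$ denotes the batch size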
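The mixup step and the bidirectional MixCo objective can be sketched as follows. This is my own reading of the loss, not the authors' implementation: the function names, the temperature default `tau=0.1` and the toy embeddings are assumptions, and the embeddings passed in are assumed to be L2-normalized already.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    top = max(scores)
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_norm for s in scores]

def mixup_voxels(voxels, rng, beta_param=0.15):
    """Mix each sample with a random partner; lambda ~ Beta(beta_param, beta_param)."""
    n = len(voxels)
    lams = [rng.betavariate(beta_param, beta_param) for _ in range(n)]
    partners = list(range(n))
    rng.shuffle(partners)  # partners[i] plays the role of the mixing index k_i
    mixed = [
        [lam * a + (1.0 - lam) * b for a, b in zip(voxels[i], voxels[partners[i]])]
        for i, lam in enumerate(lams)
    ]
    return mixed, lams, partners

def bimixco_loss(p_mix, t, lams, partners, tau=0.1):
    """Bidirectional contrastive loss with mixup-weighted targets (summed over the batch)."""
    n = len(p_mix)
    logits = [[dot(p_mix[i], t[j]) / tau for j in range(n)] for i in range(n)]
    # Brain -> image: normalize over images for each mixed brain embedding.
    brain_to_image = [log_softmax(row) for row in logits]
    # Image -> brain: normalize over brain embeddings for each image.
    image_to_brain = [log_softmax([logits[i][j] for i in range(n)]) for j in range(n)]
    loss = 0.0
    for i in range(n):
        k, lam = partners[i], lams[i]
        loss -= lam * brain_to_image[i][i] + (1.0 - lam) * brain_to_image[i][k]
        loss -= lam * image_to_brain[i][i] + (1.0 - lam) * image_to_brain[k][i]
    return loss

# Toy usage with three one-hot "samples".
rng = random.Random(0)
voxels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mixed, lams, partners = mixup_voxels(voxels, rng)
aligned_loss = bimixco_loss(voxels, voxels, [1.0] * 3, [0, 1, 2], tau=0.01)
half_mixed_loss = bimixco_loss(voxels, voxels, [0.5] * 3, [1, 2, 0])
```

With all $\lambda_i = 1$ the function reduces to the ordinary bidirectional CLIP loss, which is why `aligned_loss` is essentially zero for perfectly matched embeddings.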
④After one-third of training, they stop using mixup and switch from the hard contrastive loss to a soft contrastive loss, which gives better overall performance (the hard loss with mixup raises retrieval ↑↑ but lowers reconstruction ↓, so the schedule aims to balance the two)
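⑤Soft contrastive loss:
$$\mathcal{L}_{\text{SoftCLIP}} = -\sum_{i=1}^{N}\sum_{j=1}^{N}\left[\frac{\exp(t_i \cdot t_j/\tau)}{\sum_{m=1}^{N}\exp(t_i \cdot t_m/\tau)} \cdot \log\frac{\exp(p_i \cdot t_j/\tau)}{\sum_{m=1}^{N}\exp(p_i \cdot t_m/\tau)}\right]$$
where $p_i$ is the predicted embedding of the unmixed fMRI sample $i$ and $t_i$ is the CLIP image embedding of its image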
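A soft contrastive loss of this kind can be read as a cross-entropy against soft targets, where the targets come from image-to-image similarities instead of one-hot labels. The sketch below follows that reading; the function names and the default `tau=0.1` are my own assumptions, and the inputs are assumed to be L2-normalized.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    top = max(scores)
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_norm for s in scores]

def soft_clip_loss(p, t, tau=0.1):
    """Cross-entropy between image-image soft targets and brain-image predictions."""
    n = len(p)
    loss = 0.0
    for i in range(n):
        # Soft targets: how similar image i is to every image in the batch.
        target_logp = log_softmax([dot(t[i], t[j]) / tau for j in range(n)])
        # Predictions: how similar brain embedding i is to every image.
        pred_logp = log_softmax([dot(p[i], t[j]) / tau for j in range(n)])
        loss -= sum(math.exp(tl) * pl for tl, pl in zip(target_logp, pred_logp))
    return loss

# Toy usage: matched embeddings should score better than swapped ones.
images = [[1.0, 0.0], [0.0, 1.0]]
matched = soft_clip_loss(images, images)
swapped = soft_clip_loss([images[1], images[0]], images)
```

Because the targets are a full distribution, near-duplicate images in a batch are no longer pushed apart as hard negatives, which is the intuition behind moving away from the hard loss.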
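⑥Total loss:
$$\mathcal{L} = \mathcal{L}_{\text{BiMixCo}|\text{SoftCLIP}} + \alpha \cdot \mathcal{L}_{\text{prior}}$$
where $\mathcal{L}_{\text{prior}}$ comes from the DALL-E 2 diffusion prior and $\alpha$ is its weight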
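The schedule from ④ and the combined objective can be put together in a few lines. This is only a sketch of the bookkeeping: the function names and phase labels are hypothetical, the loss values are plain numbers supplied by the caller, and the weight `alpha` is a required argument because no value is assumed here.

```python
def training_phase(step, total_steps):
    """Return the active contrastive objective: BiMixCo first, SoftCLIP after one-third."""
    switch_step = total_steps // 3
    return "bimixco_with_mixup" if step < switch_step else "softclip_no_mixup"

def total_loss(step, total_steps, bimixco, softclip, prior, alpha):
    """Contrastive term for the current phase plus the weighted diffusion prior term."""
    if training_phase(step, total_steps) == "bimixco_with_mixup":
        contrastive = bimixco
    else:
        contrastive = softclip
    return contrastive + alpha * prior

# Toy usage with made-up loss values and an arbitrary weight.
early = total_loss(step=0, total_steps=300, bimixco=2.0, softclip=1.0, prior=1.0, alpha=0.3)
late = total_loss(step=100, total_steps=300, bimixco=2.0, softclip=1.0, prior=1.0, alpha=0.3)
```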
⑦Experiment settings: 240 epochs with a batch size of 32
2.3.2. Low-Level (Perceptual) Pipeline
①Employ img2img
2.4. Results
①Dataset: the Natural Scenes Dataset (NSD)
②Subjects: 4
③Training/testing: 24980/982, where test responses are averaged across the 3 repetitions
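Averaging the repeated test presentations amounts to grouping trials by image and taking the element-wise mean of their voxel vectors. A minimal sketch, assuming each trial is stored as an `(image_id, voxel_vector)` pair (this layout and the function name are hypothetical, not the NSD file format):

```python
from collections import defaultdict

def average_repeats(trials):
    """Average voxel vectors over all trials that show the same image."""
    groups = defaultdict(list)
    for image_id, voxels in trials:
        groups[image_id].append(voxels)
    return {
        image_id: [sum(column) / len(column) for column in zip(*vectors)]
        for image_id, vectors in groups.items()
    }

# Toy usage: image "a" is shown three times, image "b" once.
trials = [("a", [1.0, 2.0]), ("a", [3.0, 4.0]), ("a", [5.0, 6.0]), ("b", [0.0, 0.0])]
averaged = average_repeats(trials)
```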
2.4.1. Image/Brain Retrieval
①Image retrieval performance:
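(For retrieval among a small number of samples, as in the first row, MindEye can pinpoint the exact image the subject actually viewed. It also scales to the much larger LAION-5B dataset, retrieving the most similar images among five billion.)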
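Image retrieval of this kind boils down to a nearest-neighbor search in the shared embedding space. The sketch below uses cosine similarity and top-1 accuracy; the function names and toy vectors are my own, and the exact evaluation protocol in the paper may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def retrieve(query, candidates):
    """Index of the candidate embedding most similar to the query."""
    return max(range(len(candidates)), key=lambda j: cosine(query, candidates[j]))

def top1_accuracy(brain_embs, image_embs):
    """Fraction of brain embeddings whose nearest image is the one actually viewed."""
    hits = sum(retrieve(b, image_embs) == i for i, b in enumerate(brain_embs))
    return hits / len(brain_embs)

# Toy usage: brain embedding i should retrieve image i.
image_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
brain_embs = [[0.9, 0.1], [0.2, 0.8], [-0.7, 0.3]]
accuracy = top1_accuracy(brain_embs, image_embs)
```

Brain retrieval is the same search with the roles swapped: the image embedding is the query and the brain embeddings are the candidates.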
②Performance:
2.4.2. fMRI-to-Image Reconstruction
①Reconstruction performance:
2.4.3. Ablations
(1)Architectural Improvements
①MLP size ablation:
(2)Training Strategies (Losses and Data Augmentations)
①Loss ablation:
(3)Reconstruction Strategies
①Effects of diffusion prior and MLP projector on reconstruction and retrieval metrics:
2.5. Related Work
①I will not paraphrase this section; there is a bit too much of it
2.6. Conclusion
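①Why put so much analysis of the current state of the field in the conclusion? It would be better moved to the appendix, with some experiments placed here instead. Things like information leakage and data security
②They set up a discussion community: MedARC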
3. Reference
@inproceedings{NEURIPS2023_4ddab70b,
author = {Scotti, Paul and Banerjee, Atmadeep and Goode, Jimmie and Shabalin, Stepan and Nguyen, Alex and Cohen, Ethan and Dempster, Aidan and Verlinde, Nathalie and Yundler, Elad and Weisberg, David and Norman, Kenneth and Abraham, Tanishq},
booktitle = {Advances in Neural Information Processing Systems},
editor = {A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine},
pages = {24705--24728},
publisher = {Curran Associates, Inc.},
title = {Reconstructing the Mind\textquotesingle s Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors},
url = {https://proceedings.neurips.cc/paper_files/paper/2023/file/4ddab70bf41ffe5d423840644d3357f4-Paper-Conference.pdf},
volume = {36},
year = {2023}
}