Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation

Yu, Qihang; Xie, Lingxi; Wang, Yan; Zhou, Yuyin; Fishman, Elliot K.; Yuille, Alan L.

arXiv:1709.04518v4 (cs)

[Submitted on 13 Sep 2017 (v1), last revised 8 Apr 2018 (this version, v4)]

Abstract: We aim to segment small organs (e.g., the pancreas) from abdominal CT scans. Because the target often occupies a relatively small region of the input image, deep neural networks are easily confused by the complex and variable background. To alleviate this, researchers proposed a coarse-to-fine approach, which uses the prediction from the first (coarse) stage to indicate a smaller input region for the second (fine) stage. Despite its effectiveness, this algorithm treats the two stages individually, so it does not optimize a global energy function, which limits its ability to incorporate multi-stage visual cues. The missing contextual information leads to unsatisfactory convergence over iterations, and the fine stage sometimes produces even lower segmentation accuracy than the coarse stage.
This paper presents a Recurrent Saliency Transformation Network. The key innovation is a saliency transformation module, which repeatedly converts the segmentation probability map from the previous iteration into spatial weights and applies these weights to the current iteration. This brings two benefits. In training, it allows joint optimization over the deep networks dealing with different input scales. In testing, it propagates multi-stage visual information across iterations to improve segmentation accuracy. Experiments on the NIH pancreas segmentation dataset demonstrate state-of-the-art accuracy, outperforming the previous best by an average of over 2%. Much higher accuracies are also reported on several small organs in a larger dataset that we collected ourselves. In addition, our approach enjoys better convergence properties, making it more efficient and reliable in practice.
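
The recurrence described in the abstract can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation: the abstract does not specify the form of the saliency transformation, so the affine map in `saliency_weights`, the elementwise multiplication in `apply_weights`, and the `toy_segmenter` stand-in for a trained network are all assumptions made for illustration, as are every function and parameter name.

```python
from typing import Callable, List

Image = List[List[float]]  # a 2D slice as nested lists (illustrative only)


def saliency_weights(prob: Image, scale: float = 1.0, offset: float = 0.0) -> Image:
    """Turn a probability map into spatial weights.

    Assumption: a fixed affine map stands in for the saliency
    transformation module, whose actual form the abstract does not give.
    """
    return [[scale * p + offset for p in row] for row in prob]


def apply_weights(image: Image, weights: Image) -> Image:
    """Apply the spatial weights to the input by elementwise multiplication."""
    return [[x * w for x, w in zip(img_row, w_row)]
            for img_row, w_row in zip(image, weights)]


def iterate_segmentation(image: Image,
                         segment: Callable[[Image], Image],
                         num_iters: int = 3) -> List[Image]:
    """Run one coarse pass, then num_iters weighted passes.

    Each pass re-weights the original image with the previous
    iteration's probability map before segmenting again.
    """
    prob = segment(image)          # coarse pass on the unweighted image
    history = [prob]
    for _ in range(num_iters):
        weighted = apply_weights(image, saliency_weights(prob))
        prob = segment(weighted)   # pass guided by the previous prediction
        history.append(prob)
    return history


def toy_segmenter(image: Image) -> Image:
    """Hypothetical stand-in for a trained network: normalize by the peak value."""
    peak = max(max(row) for row in image) or 1.0
    return [[x / peak for x in row] for row in image]


if __name__ == "__main__":
    scan = [[0.2, 1.0],
            [0.1, 0.8]]
    for step, prob in enumerate(iterate_segmentation(scan, toy_segmenter, 2)):
        print(step, prob)
```

In this toy setting the background responses shrink from one iteration to the next while the strongest response is preserved, which mirrors the intended effect of feeding the previous prediction back as spatial weights. Joint training of the stages, which the abstract names as the other benefit, is outside the scope of this sketch.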

Comments:	Accepted to CVPR 2018 (10 pages, 6 figures)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1709.04518 [cs.CV]
	(or arXiv:1709.04518v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1709.04518

Submission history

From: Lingxi Xie
[v1] Wed, 13 Sep 2017 19:54:56 UTC (1,206 KB)
[v2] Sun, 17 Sep 2017 13:25:58 UTC (1,206 KB)
[v3] Sat, 18 Nov 2017 21:12:22 UTC (1,463 KB)
[v4] Sun, 8 Apr 2018 01:14:34 UTC (1,452 KB)
