SCAN: Sliding Convolutional Attention Network for Scene Text Recognition

Wu, Yi-Chao; Yin, Fei; Zhang, Xu-Yao; Liu, Li; Liu, Cheng-Lin

arXiv:1806.00578 (cs)

[Submitted on 2 Jun 2018]

Abstract: Scene text recognition has drawn great attention in the computer vision and artificial intelligence communities because of its challenges and wide applications. State-of-the-art models based on recurrent neural networks (RNNs) map an input sequence to a variable-length output sequence, but they are usually applied in a black-box manner and lack transparency for further improvement, and maintaining the entire history of past hidden states prevents parallel computation within a sequence. In this paper, we investigate the intrinsic characteristics of text recognition and, inspired by human cognition mechanisms in reading text, propose a scene text recognition method with a sliding convolutional attention network (SCAN). Similar to eye movement during reading, the process of SCAN can be viewed as an alternation between saccades and visual fixations. Compared with previous recurrent models, computations over all elements of SCAN can be fully parallelized during training. Experimental results on several challenging benchmarks, including the IIIT5k, SVT, and ICDAR 2003/2013 datasets, demonstrate the superiority of SCAN over state-of-the-art methods in terms of both model interpretability and performance.
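
The abstract gives no implementation details, so the following is only a rough sketch of the general idea of sliding a window across a text-line image. The function name, window width, and stride are hypothetical and are not taken from the paper. It illustrates why such windows lend themselves to parallel processing: each window depends only on the input image, not on the result for any earlier window.

```python
def sliding_windows(image, width, stride):
    """Cut a text-line image into overlapping vertical strips.

    `image` is a list of pixel rows of equal length. The result is a list
    of windows, each a list of rows of length `width`. The width and stride
    are illustrative parameters, not values from the SCAN paper.
    """
    if not image or width <= 0 or stride <= 0:
        raise ValueError("image must be non-empty; width and stride must be positive")
    total_width = len(image[0])
    # Left edge of every window that fits entirely inside the image.
    starts = range(0, total_width - width + 1, stride)
    # Each window is built from the image alone, so all of them could be
    # extracted (and later processed) independently of one another.
    return [[row[s:s + width] for row in image] for s in starts]


# A toy 2 x 6 "image" whose pixel values are just column indices.
toy_image = [[0, 1, 2, 3, 4, 5],
             [0, 1, 2, 3, 4, 5]]
toy_windows = sliding_windows(toy_image, width=4, stride=2)
```

In a recurrent model, by contrast, the state at each position depends on the state at the previous position, which is the sequential dependency the abstract says prevents parallel computation.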

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1806.00578 [cs.CV]
	(or arXiv:1806.00578v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1806.00578

Submission history

From: Cheng-Lin Liu
[v1] Sat, 2 Jun 2018 03:28:43 UTC (866 KB)
