IMP lab. publication database: Detail of Publication

Text Language	English
Authors	Rina Buoy, Masakazu Iwamura, Sovila Srun, Koichi Kise
Title	Towards Reduced-Complexity Scene Text Recognition (RCSTR) Through A Novel Salient Feature Selection
Journal	International Journal on Document Analysis and Recognition (IJDAR)
Vol.	27
Pages	pp.289-302
Number of Pages	14 pages
Publisher	Springer
Address	Berlin, Germany
Reviewed or not	Reviewed
Month & Year	May 2024
Abstract	The integration of an attention mechanism has played a crucial role in many recent scene text recognition (STR) methods. It enables the capture of spatial feature dependencies (known as self-attention) and the identification of relevant features while predicting a character (known as cross-attention). However, computations and memory requirements in the self-attention and cross-attention layers increase quadratically and linearly with the feature map size, respectively, leading to a computational bottleneck in low-resource environments. But, is it necessary to attend to the entire feature maps? On the other hand, text in a natural scene is continuous and oriented in a specific direction, and it does not occupy the entire image. Therefore, utilizing only a small salient subset of features in text regions is sufficient for accurately predicting characters. Based on this salient feature selection, we propose a reduced-complexity scene text recognition framework that significantly reduces model complexities and memory requirements in the self-attention and cross-attention layers. We validate the proposed framework by employing a convolutional STR architecture with both connectionist temporal classification and transformer decoders. Through the model complexity and performance analyses on public benchmark datasets, we demonstrate that the proposed method can substantially reduce model complexities while still maintaining reasonably robust recognition accuracy.
DOI	10.1007/s10032-024-00474-x

Entry for BibTeX

@Article{Buoy2024,
  author =	{Rina Buoy and Masakazu Iwamura and Sovila Srun and Koichi Kise},
  title =	{Towards Reduced-Complexity Scene Text Recognition (RCSTR) Through A Novel Salient Feature Selection},
  journal =	{International Journal on Document Analysis and Recognition (IJDAR)},
  year =	2024,
  month =	may,
  volume =	{27},
  pages =	{289--302},
  numpages =	{14},
  DOI =		{10.1007/s10032-024-00474-x},
  publisher =	{Springer},
  address =	{Berlin, Germany}
}
