This work focuses on fine-grained object classification using scene text recognized in natural images. While state-of-the-art methods rely on visual cues only, this paper is the first to propose combining textual and visual cues. Another novelty is the textual cue extraction. Unlike state-of-the-art text detection methods, we focus on the background rather than on the text regions. Once text regions are detected, they are processed by two text recognition methods: the ABBYY commercial OCR engine and a state-of-the-art character recognition algorithm. Then, to encode the textual cues, bigrams and trigrams are formed from the recognized characters, subject to the proposed spatial pairwise constraints. Finally, the extracted visual and textual cues are combined for fine-grained classification.
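The textual cue encoding step can be illustrated with a minimal sketch. The paper's actual spatial pairwise constraints are not given here, so the rule in `is_valid_pair` (left-to-right order, a bounded horizontal gap, a bounded vertical offset, and similar character heights) and its thresholds are assumptions chosen for illustration. The `Char` type and the function names are hypothetical as well.

```python
from collections import Counter
from typing import NamedTuple


class Char(NamedTuple):
    """A recognized character with its center position and height (hypothetical type)."""
    label: str
    x: float
    y: float
    h: float


def is_valid_pair(a: Char, b: Char, max_gap: float = 1.5,
                  max_dy: float = 0.5, max_scale: float = 1.5) -> bool:
    """Assumed spatial constraint: b lies just to the right of a, roughly on
    the same line, and has a similar height. Thresholds are illustrative."""
    mean_h = (a.h + b.h) / 2
    dx = b.x - a.x
    if dx <= 0 or dx > max_gap * mean_h:
        return False
    if abs(b.y - a.y) > max_dy * mean_h:
        return False
    return max(a.h, b.h) / min(a.h, b.h) <= max_scale


def encode_ngrams(chars: list[Char]) -> Counter:
    """Build a bag of bigrams and trigrams from spatially compatible characters."""
    bag = Counter()
    pairs = [(a, b) for a in chars for b in chars
             if a is not b and is_valid_pair(a, b)]
    for a, b in pairs:
        bag[a.label + b.label] += 1
    # A trigram extends a valid pair (a, b) with any c that forms a valid pair (b, c).
    for a, b in pairs:
        for c in chars:
            if c is not a and c is not b and is_valid_pair(b, c):
                bag[a.label + b.label + c.label] += 1
    return bag
```

Because the bag is built from pairwise relations rather than from whole recognized words, a single missed or misread character removes only the n-grams that contain it, and the remaining n-grams can still serve as a textual cue.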
Learned Text Concept, SVM.
In this paper, we address the problem of fine-grained object classification by combining textual and visual cues. In particular, we focus on classifying Buildings into sub-classes such as Cafe, Tavern and Diner. Textual cues are used for this task because text adds semantics beyond what visual cues provide. Consider three example images that are to be classified by their semantics. In this case, visual cues are insufficient or even misleading, as the first two images have similar scene appearances. Textual cues make it possible to recognize that the two (right) images belong to the same category, since they contain the same brand name, Starbucks. Therefore, we propose to use both textual and visual cues. Further, we propose a method for textual cue extraction. The success of the proposed fine-grained classification method (a fusion of the visual and text modalities) depends heavily on the completeness of the extracted textual cues. Therefore, a robust character localization method and a textual cue encoding method are proposed.
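How the two modalities might be fused can be sketched as follows. The text above does not state the fusion scheme, so this example assumes a simple late fusion: each modality yields a score per class, and the two scores are combined as a weighted sum. The function names, the `weight` parameter and the scores in the usage comment are hypothetical and chosen only for illustration.

```python
def fuse_scores(visual: dict[str, float], textual: dict[str, float],
                weight: float = 0.5) -> dict[str, float]:
    """Weighted late fusion of per-class scores (an assumed scheme).

    `weight` is the share given to the visual scores; a class missing
    from one modality contributes a score of 0.0 for that modality.
    """
    classes = visual.keys() | textual.keys()
    return {c: weight * visual.get(c, 0.0) + (1 - weight) * textual.get(c, 0.0)
            for c in classes}


def classify(visual: dict[str, float], textual: dict[str, float],
             weight: float = 0.5) -> str:
    """Return the class with the highest fused score."""
    fused = fuse_scores(visual, textual, weight)
    return max(fused, key=fused.get)


# Made-up scores for illustration only (not results from the paper):
# the visual classifier alone slightly prefers "Tavern", while the
# text-based classifier strongly prefers "Cafe".
visual_scores = {"Cafe": 0.40, "Tavern": 0.45, "Diner": 0.15}
textual_scores = {"Cafe": 0.90, "Tavern": 0.05, "Diner": 0.05}
predicted = classify(visual_scores, textual_scores)
```

With equal weights, a confident textual score can override a visually ambiguous scene, which matches the motivation given above. In practice the weight would need to be tuned on held-out data.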
BLOCK DIAGRAM