A probabilistic topic model using deep visual word representation for simultaneous image classification and annotation
Publication date: Available online 8 January 2019
Source: Journal of Visual Communication and Image Representation
Author(s): Seyed Navid Mohammadi Foumani, Ahmad Nickabadi
Research has shown that examining an image holistically yields a better understanding of it than running separate processes, each devoted to a single task such as annotation, classification, or segmentation. Over the past decades, there have been several efforts at simultaneous image classification and annotation using probabilistic or neural-network-based topic models. Despite their relative success, most of these models suffer from poor visual word representation and from an imbalance between the number of visual and annotation words in the training data. This paper proposes a novel model for simultaneous image classification and annotation based on SupDocNADE, a neural-network-based topic model for image classification and annotation. The proposed model, named wSupDocNADE, addresses these shortcomings by using a new coding scheme and introducing a weighting mechanism into the SupDocNADE model. In the coding step, several patches extracted from the input image are first fed to a deep convolutional neural network, and the feature vectors obtained from this network are encoded using LLC coding. The resulting codes are then aggregated into a final descriptor through sum pooling. To overcome the imbalance between visual and annotation words, a weighting factor is assigned to each visual or annotation word. The weights of the visual words are set based on their frequencies obtained from the pooling step, and the weights of the annotation words are learned from the training data. Experimental results on three benchmark datasets show that the proposed model outperforms state-of-the-art models in both image classification and annotation.
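The coding step described in the abstract (patch features, LLC coding, sum pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "LLC" refers to locality-constrained linear coding in its common k-nearest-neighbour approximation, it takes the patch features and the codebook as given arrays rather than computing them with a CNN, and the function names and the parameters `k` and `reg` are hypothetical choices made for illustration.

```python
import numpy as np


def llc_encode(features, codebook, k=5, reg=1e-4):
    """Approximate LLC: code each feature over its k nearest codebook entries.

    features: (n_patches, dim) array, e.g. CNN activations per patch (assumed given).
    codebook: (n_words, dim) array of visual words (assumed learned beforehand).
    Returns an (n_patches, n_words) array whose rows each sum to 1.
    """
    n_patches, n_words = features.shape[0], codebook.shape[0]
    codes = np.zeros((n_patches, n_words))
    # Squared Euclidean distance from every patch feature to every visual word.
    dist2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dist2, axis=1)[:, :k]
    for i in range(n_patches):
        idx = nearest[i]
        # Shift the k selected bases so the feature sits at the origin.
        shifted = codebook[idx] - features[i]
        cov = shifted @ shifted.T
        # Regularise so the small linear system stays well conditioned.
        cov += reg * max(np.trace(cov), 1e-12) * np.eye(k)
        w = np.linalg.solve(cov, np.ones(k))
        # Enforce the sum-to-one constraint of LLC.
        codes[i, idx] = w / w.sum()
    return codes


def sum_pool(codes):
    """Aggregate per-patch codes into one image-level descriptor."""
    return codes.sum(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch_features = rng.normal(size=(8, 16))   # stand-in for CNN patch features
    visual_words = rng.normal(size=(32, 16))    # stand-in for a learned codebook
    descriptor = sum_pool(llc_encode(patch_features, visual_words))
    print(descriptor.shape)
```

Each entry of the pooled descriptor acts as a soft count for one visual word, which is the kind of frequency the abstract says the visual word weights are based on; how those weights are actually computed is not specified in the abstract.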