Ontological Optimization for Latent Semantic Indexing of Arabic Corpus

Aya M. Al-Zoghby, Khaled Shaalan

Publication date: 2018

Source: Procedia Computer Science, Volume 142

Abstract

Dimensionality reduction is a critical problem in information retrieval. High dimensionality directly affects search performance in terms of Recall and Precision. Dimensionality reduction also enables the search to be semantically based rather than lexically based, because the dimensions are defined in terms of semantic concepts instead of traditional terms or keywords. Latent Semantic Indexing (LSI) is a mathematical extension of the classical Vector Space Model (VSM). LSI discovers the latent semantics in the search space by extracting concepts from the original terms in the space. LSI relies on the Singular Value Decomposition (SVD) to reduce the term space to a lower-dimensional LSI space. In this paper, we propose a methodology for further optimizing LSI dimension reduction via two reduction levels. The first reduction level is based on an ontological conceptualization process: the Universal Wordnet ontology (UWN) is used to develop an ontology-based concept space in place of the term space. At the second reduction level, the SVD is applied to the extracted concept space to obtain an optimal LSI conceptualization. The experimental results indicate an improvement in the search results in terms of both Precision and Recall, as the proposed methodology addresses the Synonymy and Polysemy problems effectively.
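
The two reduction levels described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The UWN lookup is replaced by a small hand-written dictionary (`TERM_TO_CONCEPT`, hypothetical), the tokens are English rather than Arabic for readability, raw counts are used instead of any particular term weighting, and all function names are illustrative. Level 1 collapses terms into concepts; level 2 applies a rank-k truncated SVD with NumPy and ranks documents by cosine similarity to a query in the reduced space.

```python
import numpy as np

# Hypothetical stand-in for a UWN lookup: surface term -> ontology concept.
TERM_TO_CONCEPT = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "doctor": "physician", "physician": "physician",
    "hospital": "hospital",
}

def concept_vector(text, index, term_to_concept):
    """Count concept occurrences in a text; unmapped terms are ignored."""
    vec = np.zeros(len(index))
    for term in text.lower().split():
        concept = term_to_concept.get(term)
        if concept is not None:
            vec[index[concept]] += 1.0
    return vec

def concept_document_matrix(docs, term_to_concept):
    """Level 1: build a concept-by-document count matrix."""
    concepts = sorted(set(term_to_concept.values()))
    index = {c: i for i, c in enumerate(concepts)}
    columns = [concept_vector(d, index, term_to_concept) for d in docs]
    return concepts, index, np.column_stack(columns)

def lsi_reduce(matrix, k):
    """Level 2: rank-k truncated SVD of the concept space."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    u_k = u[:, :k]
    # Document coordinates: columns of S_k V_k^T, stored one row per document.
    doc_vectors = (s[:k, None] * vt[:k, :]).T
    return u_k, doc_vectors

def rank_documents(query, index, term_to_concept, u_k, doc_vectors):
    """Project the query with U_k^T and score documents by cosine similarity."""
    q_k = u_k.T @ concept_vector(query, index, term_to_concept)
    norms = np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_k)
    return doc_vectors @ q_k / np.maximum(norms, 1e-12)

docs = ["car truck", "automobile", "doctor hospital", "physician hospital"]
concepts, index, matrix = concept_document_matrix(docs, TERM_TO_CONCEPT)
u_k, doc_vectors = lsi_reduce(matrix, k=2)
scores = rank_documents("automobile", index, TERM_TO_CONCEPT, u_k, doc_vectors)
print(np.round(scores, 3))
```

In this toy corpus the query "automobile" scores the first document highly even though that document contains only "car" and "truck", which is the synonymy effect the concept space is meant to capture. Handling polysemy would additionally require sense disambiguation during the term-to-concept mapping, which this sketch does not attempt.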
