3 years ago

# Closing in on Time and Space Optimal Construction of Compressed Indexes.

Dominik Kempa

Fast and space-efficient construction of compressed indexes such as compressed suffix array (CSA) and compressed suffix tree (CST) has been a major open problem until recently, when Belazzougui [STOC 2014] described an algorithm able to build both of these data structures in $O(n)$ (randomized; later improved by the same author to deterministic) time and $O(n/\log_{\sigma}n)$ words of space, where $n$ is the length of the string and $\sigma$ is the alphabet size. Shortly after, Munro et al. [SODA 2017] described another deterministic construction using the same time and space based on different techniques. It has remained an elusive open problem since then whether these bounds are optimal or, assuming non-wasteful text encoding, the construction achieving $O(n / \log_{\sigma}n)$ time and space is possible. In this paper we provide a first algorithm that can achieve these bounds. We show a deterministic algorithm that constructs CSA and CST using $O(n / \log_{\sigma} n + r \log^{11} n)$ time and $O(n / \log_{\sigma} n + r \log^{10} n)$ working space, where $r$ is the number of runs in the Burrows-Wheeler transform of the input text. As one of the applications of our techniques we show how to compute the LZ77 parsing in $O(n/\log_{\sigma}n + r\log^{11}n+z\log^{10}n)$ time and $O(n/\log_{\sigma}n + r\log^{9}n)$ space, which is optimal for highly repetitive strings.

Publisher URL: http://arxiv.org/abs/1712.04886

DOI: arXiv:1712.04886v3

You might also like
Discover & Discuss Important Research

Keeping up-to-date with research can feel impossible, with papers being published faster than you'll ever be able to read them. That's where Researcher comes in: we're simplifying discovery and making important discussions happen. With over 19,000 sources, including peer-reviewed journals, preprints, blogs, universities, podcasts and Live events across 10 research areas, you'll never miss what's important to you. It's like social media, but better. Oh, and we should mention - it's free.

Researcher displays publicly available abstracts and doesn’t host any full article content. If the content is open access, we will direct clicks from the abstracts to the publisher website and display the PDF copy on our platform. Clicks to view the full text will be directed to the publisher website, where only users with subscriptions or access through their institution are able to view the full article.