Please use this identifier to cite or link to this item: http://repositorio.unicamp.br/jspui/handle/REPOSIP/328017
Type: Artigo
Title: An Improved Algorithm For The All-pairs Suffix-prefix Problem
Author: Tustumi
William H. A.; Gog
Simon; Telles
Guilherme P.; Louza
Felipe A.
Abstract: Finding all longest suffix-prefix matches for a collection of strings is known as the all pairs suffix-prefix match problem and its main application is de novo genome assembly. This problem is well studied in stringology and has been solved optimally in 1992 by Gusfield etal.[8] using suffix trees. In 2010, Ohlebusch and Gog[13] proposed an alternative solution based on enhanced suffix arrays which has also optimal time complexity but is faster in practice. In this article, we present another optimal algorithm based on enhanced suffix arrays which further improves the practical running time. Our new solution solves the problem locally for each string, scanning the enhanced suffix array backwards to avoid the processing of suffixes that are no suffix-prefix matching candidates. In an empirical evaluation we observed that the new algorithm is over two times faster and more space-efficient than the method proposed by Ohlebusch and Gog. When compared to Readjoiner[5], a good practical solution, our algorithm is faster for small overlap lengths, in pace with theoretical optimality. (C) 2016 Elsevier B.V. All rights reserved.
Subject: Suffix-prefix Matching
Suffix Array
Lcp Array
Editor: Elsevier Science BV
Amsterdam
Rights: fechado
Identifier DOI: 10.1016/j.jda.2016.04.002
Address: http://www.sciencedirect.com/science/article/pii/S1570866716300053
Date Issue: 2016
Appears in Collections:Unicamp - Artigos e Outros Documentos

Files in This Item:
File SizeFormat 
000379021500004.pdf509.7 kBAdobe PDFView/Open


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.