Please use this identifier to cite or link to this item:
|Title:||An Improved Algorithm For The All-pairs Suffix-prefix Problem|
William H. A.; Gog
Guilherme P.; Louza
|Abstract:||Finding all longest suffix-prefix matches for a collection of strings is known as the all pairs suffix-prefix match problem and its main application is de novo genome assembly. This problem is well studied in stringology and has been solved optimally in 1992 by Gusfield etal. using suffix trees. In 2010, Ohlebusch and Gog proposed an alternative solution based on enhanced suffix arrays which has also optimal time complexity but is faster in practice. In this article, we present another optimal algorithm based on enhanced suffix arrays which further improves the practical running time. Our new solution solves the problem locally for each string, scanning the enhanced suffix array backwards to avoid the processing of suffixes that are no suffix-prefix matching candidates. In an empirical evaluation we observed that the new algorithm is over two times faster and more space-efficient than the method proposed by Ohlebusch and Gog. When compared to Readjoiner, a good practical solution, our algorithm is faster for small overlap lengths, in pace with theoretical optimality. (C) 2016 Elsevier B.V. All rights reserved.|
|Editor:||Elsevier Science BV|
|Appears in Collections:||Unicamp - Artigos e Outros Documentos|
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.