Skip to main content Accessibility help
×
Hostname: page-component-cd9895bd7-8ctnn Total loading time: 0 Render date: 2024-12-23T05:30:17.430Z Has data issue: false hasContentIssue false

References

Published online by Cambridge University Press:  28 September 2023

Veli Mäkinen
Affiliation:
University of Helsinki
Djamal Belazzougui
Affiliation:
Centre de Recherche sur l’Information Scientifique et Technique (CERIST), Algiers
Fabio Cunial
Affiliation:
Broad Institute, Massachusetts
Alexandru I. Tomescu
Affiliation:
University of Helsinki
Get access

Summary

Image of the first page of this content. For PDF version, please use the ‘Save PDF’ preceeding this image.'
Type
Chapter
Information
Genome-Scale Algorithm Design
Bioinformatics in the Era of High-Throughput Sequencing
, pp. 414 - 438
Publisher: Cambridge University Press
Print publication year: 2023

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Abouelhoda, Mohamed Ibrahim. A chaining algorithm for mapping cDNA sequences to multiple genomic sequences. In 14th International Symposium on String Processing and Information Retrieval, volume 4726 of Lecture Notes in Computer Science, pages 113. Springer, 2007.Google Scholar
Adelson Velskii, Georgy and Landis, Evgenii M.. An algorithm for the organization of information. In Proceedings of the USSR Academy of Sciences, volume 146, pages 263266, 1962. (Russian) English translation by Myron J. Ricci in Soviet Mathematics – Doklady, 3:1259–1263, 1962.Google Scholar
Ahmed, Omar, Rossi, Massimiliano, Kovaka, Sam, et al. Pan-genomic matching statistics for targeted nanopore sequencing. iScience, 24(6):102696, 2021.CrossRefGoogle ScholarPubMed
Aho, Alfred V. and Corasick, Margaret J.. Efficient string matching: An aid to bibliographic search. Communications of the ACM, 18(6):333340, 1975.CrossRefGoogle Scholar
Ahuja, Ravindra K., Goldberg, Andrew V., Orlin, James B., and Tarjan, Robert E.. Finding minimum-cost flows by double scaling. Mathematical Programming, 53:243266, 1992.CrossRefGoogle Scholar
Ahuja, Ravindra K., Magnanti, Thomas L., and Orlin, James B.. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, 1993.Google Scholar
Alanko, Jarno, Cunial, Fabio, Belazzougui, Djamal, and Mäkinen, Veli. A framework for space-efficient read clustering in metagenomic samples. BMC Bioinformatics, 18(3): 4960, 2017.CrossRefGoogle ScholarPubMed
Alanko, Jarno, D’Agostino, Giovanna, Policriti, Alberto, and Prezza, Nicola. Wheeler languages. Information and Computation, 281:104820, 2021.CrossRefGoogle Scholar
Alipanahi, Bahar, Kuhnle, Alan, and Boucher, Christina. Recoloring the colored de Bruijn graph. In International Symposium on String Processing and Information Retrieval, pages 111. Springer, 2018.Google Scholar
Almodaresi, Fatemeh, Pandey, Prashant, Ferdman, Michael, Johnson, Rob, and Patro, Rob. An efficient, scalable, and exact representation of high-dimensional color information enabled using de Bruijn graph search. Journal of Computational Biology, 27(4):485499, 2020.CrossRefGoogle Scholar
Alneberg, Johannes, Bjarnason, Brynjar Smári, de Bruijn, Ino, et al. Binning metagenomic contigs by coverage and composition. Nature Methods, 11(11):11441146, 2014.CrossRefGoogle ScholarPubMed
Alstrup, Stephen, Husfeldt, Thore, and Rauhe, Theis. Marked ancestor problems. In Foundations of Computer Science, 1998. Proceedings. 39th Annual Symposium on, pages 534543. IEEE, 1998.Google Scholar
Apostolico, Alberto. Maximal words in sequence comparisons based on subword composition. In Algorithms and Applications, pages 3444. Springer, 2010.CrossRefGoogle Scholar
Apostolico, Alberto and Bejerano, Gill. Optimal amnesic probabilistic automata or how to learn and classify proteins in linear time and space. Journal of Computational Biology, 7(3–4): 381393, 2000.CrossRefGoogle ScholarPubMed
Apostolico, Alberto and Denas, Olgert. Fast algorithms for computing sequence distances by exhaustive substring composition. Algorithms for Molecular Biology, 3:13, 2008.CrossRefGoogle ScholarPubMed
Arge, Lars, Fischer, Johannes, Sanders, Peter, and Sitchinava, Nodari. On (dynamic) range minimum queries in external memory. In Algorithms and Data Structures, pages 3748. Springer, 2013.CrossRefGoogle Scholar
Arlazarov, Vladimir, Dinic, E., Kronrod, M., and Faradzev, Igor. On economic construction of the transitive closure of a directed graph. Soviet Mathematics Doklady, 11:12091210, 1975.Google Scholar
Baaijens, Jasmijn A., Bonizzoni, Paola, Boucher, Christina, et al. Computational graph pangenomics: A tutorial on data structures and their applications. Natural Computing: An International Journal, 21(1):81108, 2022.CrossRefGoogle ScholarPubMed
Babenko, Maxim A., Gawrychowski, Pawel, Kociumaka, Tomasz, and Starikovskaya, Tatiana. Wavelet trees meet suffix trees. In Indyk, Piotr, editor, Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4–6, 2015, pages 572591. SIAM, 2015.Google Scholar
Backurs, Arturs and Indyk, Piotr. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM Journal on Computing, 47(3):10871097, 2018.CrossRefGoogle Scholar
Baeza-Yates, Ricardo A. and Ribeiro-Neto, Berthier A.. Modern Information Retrieval – The Concepts and Technology behind Search, 2nd edition. Pearson Education Ltd., 2011.Google Scholar
Baier, Uwe, Beller, Timo, and Ohlebusch, Enno. Graphical pan-genome analysis with compressed suffix trees and the Burrows–Wheeler transform. Bioinformatics, 32(4):497504, 2016.CrossRefGoogle ScholarPubMed
Baker, Brenda S. On finding duplication in strings and software. Technical report, AT&T Bell Laboratories, New Jersey, 1993.Google Scholar
Bang-Jensen, Jørgen and Gutin, Gregory Z.. Digraphs: Theory, Algorithms and Applications, 2nd edition. Springer Monographs in Mathematics. Springer, 2008.Google Scholar
Bankevich, Anton, Bzikadze, Andrey V., Kolmogorov, Mikhail, Antipov, Dmitry, and Pevzner, Pavel A.. Multiplex de Bruijn graphs enable genome assembly from long, high-fidelity reads. Nature Biotechnology, 40(7):10751081, 2022.CrossRefGoogle Scholar
Bao, Ergude, Jiang, Tao, and Girke, Thomas. BRANCH: Boosting RNA-Seq assemblies with partial or related genomic sequences. Bioinformatics, 29(10):12501259, 2013.CrossRefGoogle ScholarPubMed
Baran, Yael and Halperin, Eran. Joint analysis of multiple metagenomic samples. PLoS Computational Biology, 8(2):e1002373, 2012.CrossRefGoogle ScholarPubMed
Baum, Leonard E. An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes. In Inequalities III: Proceedings of the Third Symposium on Inequalities, pages 18. Academic Press, 1972.Google Scholar
Bayer, Rudolf. Symmetric binary b-trees: Data structure and maintenance algorithms. Acta Informatica, 1:290306, 1972.CrossRefGoogle Scholar
Beerenwinkel, Niko, Beretta, Stefano, Bonizzoni, Paola, Dondi, Riccardo, and Pirola, Yuri. Covering pairs in directed acyclic graphs. In Language and Automata Theory and Applications, volume 8370 of Lecture Notes in Computer Science, pages 126137, Springer, 2014.Google Scholar
Behnam, Ehsan, Waterman, Michael S., and Smith, Andrew D.. A geometric interpretation for local alignment-free sequence comparison. Journal of Computational Biology, 20(7):471485, 2013.CrossRefGoogle ScholarPubMed
Bejerano, Gill, Pheasant, Michael, Makunin, Igor, et al. Ultraconserved elements in the human genome. Science, 304(5675):13211325, 2004.CrossRefGoogle ScholarPubMed
Belazzougui, Djamal. Linear time construction of compressed text indices in compact space. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31–June 03, 2014, pages 148193, 2014.Google Scholar
Belazzougui, Djamal and Cunial, Fabio. A framework for space-efficient string kernels. Algorithmica, 79(3):857883, 2017.CrossRefGoogle Scholar
Belazzougui, Djamal and Puglisi, Simon J.. Range predecessor and Lempel–Ziv parsing. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pages 20532071. SIAM, 2016.CrossRefGoogle Scholar
Belazzougui, Djamal and Venturini, Rossano. Compressed static functions with applications. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 229240. SIAM, 2013.Google Scholar
Belazzougui, Djamal and Zhang, Qin. Edit distance: Sketching, streaming, and document exchange. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 5160, 2016.CrossRefGoogle Scholar
Belazzougui, Djamal, Boldi, Paolo, Pagh, Rasmus, and Vigna, Sebastiano. Monotone minimal perfect hashing: Searching a sorted table with O(1) accesses. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 785794. SIAM, 2009.Google Scholar
Belazzougui, Djamal, Cunial, Fabio, Kärkkäinen, Juha, and Mäkinen, Veli. Versatile succinct representations of the bidirectional Burrows–Wheeler transform. In 21st Annual European Symposium on Algorithms (ESA 2013), volume 8125 of Lecture Notes in Computer Science, pages 133144. Springer, 2013.Google Scholar
Belazzougui, Djamal, Cunial, Fabio, and Denas, Olgert. Fast matching statistics in small space. In Proceedings of the 17th International Symposium on Experimental Algorithms (SEA 2018), LIPIcs 103, Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 17:117:14, 2018.Google Scholar
Belazzougui, Djamal, Cunial, Fabio, Kärkkäinen, Juha, and Mäkinen, Veli. Linear-time string indexing and analysis in small space. ACM Transactions on Algorithms, 16(2):17:117:54, 2020.CrossRefGoogle Scholar
Beller, Timo and Ohlebusch, Enno. A representation of a compressed de Bruijn graph for pan-genome analysis that enables search. Algorithms for Molecular Biology, 11(1):117, 2016.Google Scholar
Beller, Timo, Berger, Katharina, and Ohlebusch, Enno. Space-efficient computation of maximal and supermaximal repeats in genome sequences. In 19th International Symposium on String Processing and Information Retrieval (SPIRE 2012), volume 7608 of Lecture Notes in Computer Science, pages 99110. Springer, 2012.Google Scholar
Beller, Timo, Gog, Simon, Ohlebusch, Enno, and Schnattinger, Thomas. Computing the longest common prefix array based on the Burrows–Wheeler transform. Journal of Discrete Algorithms, 18:2231, 2013.CrossRefGoogle Scholar
Bellman, Richard. On a routing problem. Quarterly of Applied Mathematics 16:8790, 1958.CrossRefGoogle Scholar
Bender, Michael A. and Farach-Colton, Martin. The LCA problem revisited. In 4th Latin American Symposium on Theoretical Informatics (LATIN 2000), volume 1776 of Lecture Notes in Computer Science, pages 8894. Springer, 2000.Google Scholar
Bermond, Jean-Claude, Homobono, Nathalie, and Peyrat, Claudine. Connectivity of Kautz networks. Discrete Mathematics, 114(1–3):5162, 1993.CrossRefGoogle Scholar
Bernard, Elsa, Jacob, Laurent, Mairal, Julien, and Vert, Jean-Philippe. Efficient RNA isoform identification and quantification from RNA-Seq data with network flows. Bioinformatics, 30(17):24472455, 2014.CrossRefGoogle ScholarPubMed
Bernstein, Aaron, Nanongkai, Danupon, and Wulff-Nilsen, Christian. Negative-weight single-source shortest paths in near-linear time. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31–November 3, 2022, pages 600611. IEEE, 2022.Google Scholar
Blazewicz, Jacek and Kasprzak, Marta. Complexity of DNA sequencing by hybridization. Theoretical Computer Science, 290(3):14591473, 2003.CrossRefGoogle Scholar
Bloom, Burton H. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422426, 1970.CrossRefGoogle Scholar
Blumer, Anselm, Blumer, Janet, Haussler, David, Ehrenfeucht, Andrzej, Chen, Mu-Tian, and Seiferas, Joel. The smallest automaton recognizing the subwords of a text. Theoretical Computer Science, 40:3155, 1985.CrossRefGoogle Scholar
Blumer, Anselm, Blumer, Janet, Haussler, David, McConnell, Ross, and Ehrenfeucht, Andrzej. Complete inverted files for efficient text retrieval and analysis. Journal of the ACM, 34(3): 578595, 1987.CrossRefGoogle Scholar
Boucher, Christina, Gagie, Travis, Kuhnle, Alan, Langmead, Ben, Manzini, Giovanni, and Mun, Taher. Prefix-free parsing for building big BWTs. Algorithms for Molecular Biology, 14(1): 13:113:15, 2019.CrossRefGoogle ScholarPubMed
Bowe, Alexander, Onodera, Taku, Sadakane, Kunihiko, and Shibuya, Tetsuo. Succinct de Bruijn graphs. In Algorithms in Bioinformatics, pages 225235. Springer, 2012.CrossRefGoogle Scholar
Brockhurst, Michael A., Harrison, Ellie, Hall, James P.J., Richards, Thomas, McNally, Alan, and MacLean, Craig. The ecology and evolution of pangenomes. Current Biology, 29(20): R1094R1103, 2019.CrossRefGoogle ScholarPubMed
Brodal, Gerth Stølting, Davoodi, Pooya, and Rao, S. Srinivasa. Path minima queries in dynamic weighted trees. In Algorithms and Data Structures, pages 290301. Springer, 2011.CrossRefGoogle Scholar
Broder, Andrei Z. On the resemblance and containment of documents. In Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171), pages 2129, 1997.Google Scholar
Büchler, Thomas and Ohlebusch, Enno. An improved encoding of genetic variation in a Burrows–Wheeler transform. Bioinformatics, 36(5):14131419, 2020.CrossRefGoogle Scholar
Burrows, Michael and Wheeler, David. A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation, 1994.Google Scholar
Cáceres, Manuel, Cairo, Massimo, Mumey, Brendan, Rizzi, Romeo, and Tomescu, Alexandru I.. Sparsifying, shrinking and splicing for minimum path cover in parameterized linear time. In Naor, Joseph (Seffi) and Buchbinder, Niv, editors, Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, Virtual Conference / Alexandria, VA, USA, January 9–12, 2022, pages 359376. SIAM, 2022.Google Scholar
Cameron, Daniel L., Schröder, Jan, Penington, Jocelyn Sietsma, et al. GRIDSS: Sensitive and specific genomic rearrangement detection using positional de Bruijn graph assembly. Genome Research, 27(12):20502060, 2017.CrossRefGoogle ScholarPubMed
Cancedda, Nicola, Gaussier, Eric, Goutte, Cyril, and Renders, Jean Michel. Word sequence kernels. The Journal of Machine Learning Research, 3:10591082, 2003.Google Scholar
Carter, J. Lawrence and Wegman, Mark N.. Universal classes of hash functions. In Proceedings of the Ninth Annual ACM Symposium on Theory of Computing, pages 106112, 1977.CrossRefGoogle Scholar
Chairungsee, Supaporn and Crochemore, Maxime. Using minimal absent words to build phylogeny. Theoretical Computer Science, 450:109116, 2012.CrossRefGoogle Scholar
Chaisson, Mark, Pevzner, Pavel A., and Tang, Haixu. Fragment assembly with short reads. Bioinformatics, 20(13):20672074, 2004.CrossRefGoogle ScholarPubMed
Chan, Ho-Leung, Lam, Tak-Wah, Sung, Wing-Kin, Tam, Siu-Lung, and Wong, Swee-Seong. A linear size index for approximate pattern matching. In Proceedings of the Annual Symposium on Combinatorial Pattern Matching, LNCS 4009, pages 4959. Springer-Verlag, 2006.CrossRefGoogle Scholar
Chan, Raymond H., Chan, Tony H., Yeung, Hau Man, and Wang, Roger Wei. Composition vector method based on maximum entropy principle for sequence comparison. Computational Biology and Bioinformatics, IEEE/ACM Transactions on, 9(1):7987, 2012.CrossRefGoogle ScholarPubMed
Charikar, Moses, Lehman, Eric, Liu, Ding, et al. Approximating the smallest grammar: Kolmogorov complexity in natural models. In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, pages 792801. ACM, 2002.CrossRefGoogle Scholar
Chazelle, B. A functional approach to data structures and its use in multidimensional searching. SIAM Journal on Computing, 17(3):427462, 1988.CrossRefGoogle Scholar
Chen, Li, Kyng, Rasmus, Liu, Yang P., Peng, Richard, Gutenberg, Maximilian Probst, and Sachdeva, Sushant. Maximum flow and minimum-cost flow in almost-linear time. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31–November 3, 2022, pages 612623. IEEE, 2022.Google Scholar
Chen, Sai, Krusche, Peter, Dolzhenko, Egor, et al. Paragraph: A graph-based structural variant genotyper for short-read sequence data. Genome Biology, 20(1):113, 2019.CrossRefGoogle ScholarPubMed
Chikhi, Rayan and Rizk, Guillaume. Space-efficient and exact de Bruijn graph representation based on a Bloom filter. Algorithms for Molecular Biology, 8:22, 2013.CrossRefGoogle Scholar
Chikhi, Rayan, Limasset, Antoine, Jackman, Shaun, Simpson, Jared T., and Medvedev, Paul. On the representation of de Bruijn graphs. Journal of Computational Biology, 22(5):336352, 2015.CrossRefGoogle ScholarPubMed
Chikhi, Rayan, Limasset, Antoine, and Medvedev, Paul. Compacting de Bruijn graphs from sequencing data quickly and in low memory. Bioinformatics, 32(12):i201i208, 2016.CrossRefGoogle Scholar
Cilibrasi, Rudi and Vitányi, Paul M. B.. Clustering by compression. IEEE Transactions on Information Theory, 51(4):15231545, 2005.CrossRefGoogle Scholar
Cilibrasi, Rudi, Iersel, Leo, Kelk, Steven, and Tromp, John. On the complexity of several haplotyping problems. In Casadio, Rita and Myers, Gene, editors, Algorithms in Bioinformatics, volume 3692 of Lecture Notes in Computer Science, pages 128139. Springer, 2005.Google Scholar
Clark, D. Compact PAT trees. PhD thesis, University of Waterloo, Canada, 1996.Google Scholar
Claude, Francisco and Navarro, Gonzalo. The wavelet matrix. In International Symposium on String Processing and Information Retrieval, pages 167179. Springer, 2012.CrossRefGoogle Scholar
Claude, Francisco, Navarro, Gonzalo, and Ordónez, Alberto. The wavelet matrix: An efficient wavelet tree for large alphabets. Information Systems, 47:1532, 2015.Google Scholar
Cole, Richard, Gottlieb, Lee-Ad, and Lewenstein, Moshe. Dictionary matching and indexing with errors. In Proceedings of the Symposium on Theory of Computing, pages 91100, 2004.CrossRefGoogle Scholar
Coleman, J. Robert, Papamichail, Dimitris, Skiena, Steven, Futcher, Bruce, Wimmer, Eckard, and Mueller, Steffen. Virus attenuation by genome-scale changes in codon pair bias. Science, 320 (5884):17841787, 2008.CrossRefGoogle ScholarPubMed
The Computational Pan-Genomics Consortium. Computational pan-genomics: Status, promises and challenges. Briefings in Bioinformatics, 19(1):118135, 2016.Google Scholar
Coombe, Lauren, Nikolić, Vladimir, Chu, Justin, Birol, Inanc, and Warren, René L.. ntJoin: Fast and lightweight assembly-guided scaffolding using minimizer graphs. Bioinformatics, 36(12):38853887, 2020.Google Scholar
Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction to Algorithms, 4th edition. MIT Press, 2022.Google Scholar
Cotumaccio, Nicola and Prezza, Nicola. On indexing and compressing finite automata. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 25852599. SIAM, 2021.CrossRefGoogle Scholar
Cristianini, Nello and Shawe-Taylor, John. An Introduction to Support Vector Machines. Cambridge University Press, 2000.Google Scholar
Crochemore, Maxime and Vérin, Renaud. Direct construction of compact directed acyclic word graphs. In Proceedings of the 8th Annual Symposium on Combinatorial Pattern Matching (CPM), LNCS v. 1264, pages 116129, 1997a.CrossRefGoogle Scholar
Crochemore, Maxime and Vérin, Renaud. On compact directed acyclic word graphs. In Structures in Logic and Computer Science, pages 192211. Springer, 1997b.CrossRefGoogle Scholar
Crochemore, Maxime and Rytter, Wojciech. Jewels of Stringology. World Scientific, 2002.CrossRefGoogle Scholar
Crochemore, Maxime, Mignosi, Filippo, and Restivo, Antonio. Automata and forbidden words. Information Processing Letters, 67(3):111117, 1998.CrossRefGoogle Scholar
Crochemore, Maxime, Mignosi, Filippo, Restivo, Antonio, and Salemi, Sergio. Data compression using antidictionaries. Proceedings of the IEEE, 88(11):17561768, 2000.Google Scholar
Crochemore, Maxime, Landau, Gad, and Ziv-Ukelson, Michal. A sub-quadratic sequence alignment algorithm for unrestricted cost matrices. In Proceedings of the 13th ACM-SIAM Symposium on Discrete Algorithms (SODA’2002), pages 679688, 2002.Google Scholar
Cunial, Fabio, Alanko, Jarno, and Belazzougui, Djamal. A framework for space-efficient variable-order Markov models. Bioinformatics, 35(22): 46074616, 2019.CrossRefGoogle ScholarPubMed
Cunial, Fabio, Denas, Olgert, and Belazzougui, Djamal. Fast and compact matching statistics analytics. Bioinformatics, 38(7):18381845, 2022.CrossRefGoogle ScholarPubMed
Davoodi, Pooya. Data structures: Range queries and space efficiency. PhD thesis, Department of Computer Science, Aarhus University, 2011.Google Scholar
Dayarian, Adel, Michael, Todd P., and Sengupta, Anirvan M.. Sopra: Scaffolding algorithm for paired reads via statistical optimization. BMC Bioinformatics, 11:345, 2010.CrossRefGoogle ScholarPubMed
de Berg, Mark, van Kreveld, Marc, Overmars, Mark, and Schwarzkopf, Otfried. Computational Geometry – Algorithms and Applications. Springer-Verlag, 2000.CrossRefGoogle Scholar
Delcher, Arthur L., Kasif, Simon, Fleischmann, Robert D., Peterson, Jeremy, White, Owen, and Salzberg, Steven L.. Alignment of whole genomes. Nucleic Acids Research, 27(11):23692376, 1999.CrossRefGoogle ScholarPubMed
Delorme, Charles and Tillich, Jean-Pierre. The spectrum of de Bruijn and Kautz graphs. European Journal of Combinatorics, 19(3):307319, 1998.CrossRefGoogle Scholar
Deorowicz, Sebastian and Grabowski, Szymon. Robust relative compression of genomes with random access. Bioinformatics, 27(21):29792986, 2011.CrossRefGoogle ScholarPubMed
Dias, Fernando H. C., Williams, Lucia, Mumey, Brendan, and Tomescu, Alexandru I.. Fast, flexible, and exact minimum flow decompositions via ILP. In Pe’er, Itsik, editor, Research in Computational Molecular Biology – 26th Annual International Conference, RECOMB 2022, San Diego, CA, USA, May 22–25, 2022, Proceedings, volume 13278 of Lecture Notes in Computer Science, pages 230245. Springer, 2022.Google Scholar
Díaz-Domínguez, Diego. An index for sequencing reads based on the colored de Bruijn graph. In International Symposium on String Processing and Information Retrieval, pages 304321. Springer, 2019.CrossRefGoogle Scholar
Dietz, Paul F. Optimal algorithms for list indexing and subset rank. In Algorithms and Data Structures (WADS 89), pages 3946. Springer, 1989.CrossRefGoogle Scholar
Dilthey, Alexander, Cox, Charles, Iqbal, Zamin, Nelson, Matthew R., and McVean, Gil. Improved genome inference in the MHC using a population reference graph. Nature Genetics, 47(6): 682688, 2015.CrossRefGoogle ScholarPubMed
Dilworth, R. P. A decomposition theorem for partially ordered sets. The Annals of Mathematics, 51(1), 1950.CrossRefGoogle Scholar
Do, Huy Hoang, Jansson, Jesper, Sadakane, Kunihiko, and Sung, Wing-Kin. Fast relative Lempel–Ziv self-index for similar sequences. In Joint International Conference on Frontiers in Algorithmics and Algorithmic Aspects in Information and Management (FAW-AAIM), volume 7285 of Lecture Notes in Computer Science, pages 291302. Springer, 2012.Google Scholar
Donmez, Nilgun and Brudno, Michael. Scarpa: Scaffolding reads with practical algorithms. Bioinformatics, 29(4):428434, 2013.Google Scholar
Drineas, Petros, Mahoney, Michael W., Muthukrishnan, S., and Sarlós, Tamás. Faster least squares approximation. Numerische Mathematik, 117(2):219249, 2011.CrossRefGoogle Scholar
Durbin, Richard. Efficient haplotype matching and storage using the positional Burrows– Wheeler transform (PBWT). Bioinformatics, 30(9):12661272, 2014.CrossRefGoogle ScholarPubMed
Durbin, Richard, Eddy, Sean R., Krogh, Anders, and Mitchison, Graeme. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.Google Scholar
Ebler, Jana, Ebert, Peter, Clarke, Wayne E., et al. Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nature Genetics, 54(4):518525, 2022.Google Scholar
Eddy, Sean R. Accelerated profile HMM searches. PLoS Computational Biology, 7(10), 2011.CrossRefGoogle ScholarPubMed
Edmonds, Jack and Karp, Richard M.. Theoretical improvements in algorithmic efficiency for network flow problems. Journal of the Association for Computing Machinery, 19:248264, 1972.CrossRefGoogle Scholar
Edwards, Robert A., Olson, Robert, Disz, Terry, et al. Real time metagenomics: Using k-mers to annotate metagenomes. Bioinformatics, 28(24):33163317, 2012.CrossRefGoogle ScholarPubMed
Eggertsson, Hannes P., Kristmundsdottir, Snaedis, Beyter, Doruk. GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs. Nature Communications, 10(1):5402, 2019.CrossRefGoogle ScholarPubMed
Eizenga, Jordan M., Novak, Adam M., Sibbesen, Jonas A., et al. Pangenome graphs. Annual Review of Genomics and Human Genetics, 21(1):139162, 2020.CrossRefGoogle ScholarPubMed
Ekim, Barış, Berger, Bonnie, and Chikhi, Rayan. Minimizer-space de Bruijn graphs: Whole-genome assembly of long reads in minutes on a personal computer. Cell Systems, 12(10): 958968, 2021.CrossRefGoogle Scholar
Elias, Peter. Universal codeword sets and representations of the integers. IEEE Transactions on Information Theory, 21(2):194203, 1975.CrossRefGoogle Scholar
Ellinghaus, David, Kurtz, Stefan, and Willhoeft, Ute. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics, 9:18, 02 2008.CrossRefGoogle Scholar
Eppstein, David, Galil, Zvi, Giancarlo, Raffaele, and Italiano, Giuseppe F.. Sparse dynamic programming i: Linear cost functions. Journal of the ACM, 39(3):519545, 1992a.CrossRefGoogle Scholar
Eppstein, David, Galil, Zvi, Giancarlo, Raffaele, and Italiano, Giuseppe F.. Sparse dynamic programming ii: Convex and concave cost functions. Journal of the ACM, 39(3):546567, 1992b.CrossRefGoogle Scholar
Equi, Massimo, Norri, Tuukka, Alanko, Jarno, Cazaux, Bastien, Tomescu, Alexandru I., and Mäkinen, Veli. Algorithms and complexity on indexing founder graphs. Algorithmica, 2022. doi: 10.1007/s00453-022-01007-w.CrossRefGoogle Scholar
Farach, Martin. Optimal suffix tree construction with large alphabets. In Proceedings of the 38th IEEE Symposium on Foundations of Computer Science (FOCS), pages 137143, 1997.Google Scholar
Farach, Martin, Noordewier, Michiel, Savari, Serap, Shepp, Larry, Wyner, Abraham, and Ziv, Jacob. On the entropy of DNA: Algorithms and measurements based on memory and rapid convergence. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 4857. Society for Industrial and Applied Mathematics, 1995.Google Scholar
Feng, Jianxing, Li, Wei, and Jiang, Tao. Inference of isoforms from short sequence reads. Journal of Computational Biology, 18(3):305321, 2011.Google Scholar
Ferrada, Héctor, Gagie, Travis, Hirvola, Tommi, and Puglisi, Simon J.. Hybrid indexes for repetitive datasets. Philosophical Transactions of the Royal Society A, 2014. In press.CrossRefGoogle Scholar
Ferragina, Paolo, Pearls of Algorithm Engineering. Cambridge University Press, 2023.CrossRefGoogle Scholar
Ferragina, Paolo and Manzini, Giovanni. Opportunistic data structures with applications. In Proceedings of the 41st IEEE Symposium on Foundations of Computer Science (FOCS), pages 390398, 2000.Google Scholar
Ferragina, Paolo and Manzini, Giovanni. Indexing compressed texts. Journal of the ACM, 52(4): 552581, 2005.CrossRefGoogle Scholar
Ferragina, Paolo, Luccio, Fabrizio, Manzini, Giovanni, and Muthukrishnan, S.. Compressing and indexing labeled trees, with applications. Journal of the ACM, 57(1), 2009.CrossRefGoogle Scholar
Ferragina, Paolo, Nitto, Igor, and Venturini, Rossano. On the bit-complexity of Lempel-Ziv compression. SIAM Journal on Computing (SICOMP), 42:15211541, 2013.CrossRefGoogle Scholar
Fischer, Johannes and Heun, Volker. Space-efficient preprocessing schemes for range minimum queries on static arrays. SIAM Journal on Computing (SICOMP), 40(2):465492, 2011.CrossRefGoogle Scholar
Fischer, Johannes, Köppl, Dominik, Sadakane, Kunihiko, et al. Lempel–Ziv factorization powered by space efficient suffix trees. Algorithmica, 80(7):20482081, 2018.CrossRefGoogle Scholar
Fischer, Johannes, Mäkinen, Veli, and Välimäki, Niki. Space-efficient string mining under frequency constraints. In Proceedings of the Eighth IEEE International Conference on Data Mining (ICDM 2008), pages 193202. IEEE Computer Society, 2008.CrossRefGoogle Scholar
Ford, Lester R. Network flow theory, Technical Report Paper P-923, The RAND Corporation, 1956.Google Scholar
Ford, Lester R. and Fulkerson, Delbert R.. Maximal flow through a network. Canadian Journal of Mathematics, 8:399404, 1956.CrossRefGoogle Scholar
Fredman, Michael L. On computing the length of longest increasing subsequences. Discrete Mathematics, 11(1):2935, 1975.CrossRefGoogle Scholar
Fredman, Michael L. and Willard, Dan E.. Trans-dichotomous algorithms for minimum spanning trees and shortest paths. Journal of Computer and System Sciences, 48(3):533551, 1994.CrossRefGoogle Scholar
Fredman, Michael L., Komlós, János, and Szemerédi, Endre. Storing a sparse table with 0 (1) worst case access time. Journal of the ACM (JACM), 31(3):538544, 1984.CrossRefGoogle Scholar
Fulkerson, Delbert R. Note on Dilworth’s decomposition theorem for partially ordered sets. Proceedings of the American Mathematical Society, 7(4):701702, 1956.Google Scholar
Gabow, Harold N. Data structures for weighted matching and nearest common ancestors with linking. In Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’90, pages 434443, Philadelphia, PA, USA, 1990. Society for Industrial and Applied Mathematics.Google Scholar
Gabow, Harold N. and Tarjan, Robert Endre. Faster scaling algorithms for network problems. SIAM Journal on Computing, 18(5):10131036, 1989.CrossRefGoogle Scholar
Gabow, Harold N., Bentley, Jon Louis, and Tarjan, Robert Endre. Scaling and related techniques for geometry problems. In Proceedings of the 16th Annual ACM Symposium on Theory of Computing (STOC 1984), pages 135143. ACM, 1984.Google Scholar
Gagie, Travis, Gawrychowski, Pawel, Kärkkäinen, Juha, Nekrich, Yakov, and Puglisi, Simon J.. A faster grammar-based self-index. In 6th International Conference on Language and Automata Theory and Applications (LATA 2012), volume 7183 of Lecture Notes in Computer Science, pages 240251. Springer, 2012.Google Scholar
Gagie, Travis, Manzini, Giovanni, and Sirén, Jouni. Wheeler graphs: A framework for BWT-based data structures. Theoretical Computer Science, 698:6778, 2017.CrossRefGoogle ScholarPubMed
Gagie, Travis, Navarro, Gonzalo, and Prezza, Nicola. Fully functional suffix trees and optimal text searching in BWT-runs bounded space. Journal of the ACM, 67(1):2:12:54, 2020.CrossRefGoogle Scholar
Gallé, Matthias. Searching for compact hierarchical structures in DNA by means of the Smallest Grammar Problem. PhD thesis, Université Rennes 1, 2011.Google Scholar
Ganesh, Arun and Sy, Aaron. Near-linear time edit distance for indel channels. In Kingsford, Carl and Pisanti, Nadia, editors, 20th International Workshop on Algorithms in Bioinformatics, WABI 2020, September 7–9, 2020, Pisa, Italy (Virtual Conference), volume 172 of LIPIcs, pages 17:117:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020.Google Scholar
Gao, Song, Sung, Wing-Kin, and Nagarajan, Niranjan. Opera: Reconstructing optimal genomic scaffolds with high-throughput paired-end sequences. Journal of Computational Biology, 18(11):16811691, 2011.CrossRefGoogle ScholarPubMed
Garcia, Sara P. and Pinho, Armando J.. Minimal absent words in four human genome assemblies. PLoS One, 6(12):e29344, 2011.CrossRefGoogle ScholarPubMed
Garcia, Sara P., Pinho, Armando J., Rodrigues, Joao M.O.S., Bastos, Carlos A.C., and Ferreira, Paulo J.S.G.. Minimal absent words in prokaryotic and eukaryotic genomes. PLoS One, 6(1):e16065, 2011.Google Scholar
Garey, Michael R. and Johnson, David S.. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., 1979.Google Scholar
Garrison, Erik, Sirn, Jouni, Novak, Adam, et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nature Biotechnology, 36, 2018. doi: 10.1038/nbt.4227.CrossRefGoogle ScholarPubMed
Ghaffaari, Ali and Marschall, Tobias. Fully-sensitive seed finding in sequence graphs using a hybrid index. Bioinformatics, 35(14):i81i89, 2019.CrossRefGoogle ScholarPubMed
Giancarlo, Raffaele, Rombo, Simona E., and Utro, Filippo. Compressive biological sequence analysis and archival in the era of high-throughput sequencing technologies. Briefings in Bioinformatics, 15(3):390406, 2014.CrossRefGoogle ScholarPubMed
Gibney, Daniel and Thankachan, Sharma V.. On the hardness and inapproximability of recognizing Wheeler graphs. In Bender, Michael A., Svensson, Ola, and Herman, Grzegorz, editors, Proceedings of the ESA, volume 144 of LIPIcs, pages 51:151:16. Schloss Dagstuhl, 2019.Google Scholar
Gibney, Daniel, Thankachan, Sharma V., and Aluru, Srinivas. Feasibility of flow decomposition with subpath constraints in linear time. In Boucher, Christina and Rahmann, Sven, editors, 22nd International Workshop on Algorithms in Bioinformatics, WABI 2022, September 5–7, 2022, Potsdam, Germany, volume 242 of LIPIcs, pages 17:117:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022.Google Scholar
Göke, Jonathan, Schulz, Marcel H., Lasserre, Julia, and Vingron, Martin. Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts. Bioinformatics, 28(5):656663, 2012.Google Scholar
Goldberg, Andrew V. and Tarjan, Robert Endre. Solving minimum-cost flow problems by successive approximation. In Proceedings of the 19th Annual ACM Symposium on Theory of Computing, 1987, New York, USA, pages 718. ACM, 1987.Google Scholar
Goldberg, Andrew V. and Tarjan, Robert Endre. Finding minimum-cost circulations by canceling negative cycles. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing, May 2–4, 1988, Chicago, Illinois, USA, pages 388397. ACM, 1988.Google Scholar
Goldberg, Andrew V. and Tarjan, Robert Endre. Finding minimum-cost circulations by canceling negative cycles. Journal of the ACM, 36(4):873886, 1989.CrossRefGoogle Scholar
Goldberg, Andrew V. and Tarjan, Robert Endre. Finding minimum-cost circulations by successive approximation. Mathematics of Operations Research, 15:430466, 1990.CrossRefGoogle Scholar
Gonnet, Gaston, Baeza-Yates, Ricardo, and Snider, Tim. Information Retrieval: Data Structures and Algorithms, chapter 3: New indices for text: Pat trees and Pat arrays, pages 6682. Prentice Hall, 1992.Google Scholar
Gori, Fabio, Folino, Gianluigi, Jetten, Mike S. M., and Marchiori, Elena. MTR: Taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks. Bioinformatics, 27(2):196203, 2011.CrossRefGoogle ScholarPubMed
Gotoh, Osamu. An improved algorithm for matching biological sequences. Journal of Molecular Biology, 162(3):705708, 1982.CrossRefGoogle ScholarPubMed
Grice, J. Alicia, Hughey, Richard, and Speck, Don. Reduced space sequence alignment. Bioinformatics, 13(1):4553, 1997.CrossRefGoogle ScholarPubMed
Grigorescu, Elena, Azer, Erfan Sadeqi, and Zhou, Samson. Longest alignment with edits in data streams. In 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 405412, 2017.CrossRefGoogle Scholar
Grossi, Roberto and Vitter, Jeffrey. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. In Proceedings of the 32nd ACM Symposium on Theory of Computing (STOC), pages 397406, 2000.CrossRefGoogle Scholar
Grossi, Roberto and Vitter, Jeffrey. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM Journal on Computing, 35(2):378407, 2006.CrossRefGoogle Scholar
Grossi, Roberto, Gupta, Ankur, and Vitter, Jeffrey. High-order entropy-compressed text indexes. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 841850, 2003.Google Scholar
Grytten, Ivar, Rand, Knut Dagestad, and Sandve, Geir Kjetil. KAGE: Fast alignment-free graph-based genotyping of SNPs and short indels. Genome Biology, 23(1):115, 2022.CrossRefGoogle ScholarPubMed
Guisewite, G. M. and Pardalos, Panos M.. Minimum concave-cost network flow problems: Applications, complexity, and algorithms. Annals of Operations Research, 25(1):7599, 1990.CrossRefGoogle Scholar
Guo, Hongzhe, Fu, Yilei, Gao, Yan, Li, Junyi, Wang, Yadong, and Liu, Bo. deGSM: Memory scalable construction of large scale de Bruijn graph. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 18(6):21572166, 2021.CrossRefGoogle Scholar
Gusfield, Dan. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.CrossRefGoogle Scholar
Gusfield, Dan. Integer Linear Programming in Computational and Systems Biology: an Entry-Level Text and Course. Cambridge University Press, 2019.CrossRefGoogle Scholar
Gusfield, Dan, Landau, Gad M., and Schieber, Baruch. An efficient algorithm for the all pairs suffix-prefix problem. Information Processing Letters, 41(4):181185, 1992.Google Scholar
Hampikian, Gregory and Andersen, Tim. Absent sequences: Nullomers and primes. In Pacific Symposium on Biocomputing, volume 12, pages 355366, 2007.Google Scholar
Haque, M. Monzoorul, Ghosh, Tarini Shankar, Komanduri, Dinakar, and Mande, Sharmila S.. SOrt-ITEMS: Sequence orthology based approach for improved taxonomic estimation of metagenomic sequences. Bioinformatics, 25(14):17221730, 2009.CrossRefGoogle Scholar
Harel, Dov and Tarjan, Robert Endre. Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing (SICOMP), 13(2):338355, 1984.CrossRefGoogle Scholar
Hartman, Tzvika, Hassidim, Avinatan, Kaplan, Haim, Raz, Danny, and Segalov, Michal. How to split a flow? In Greenberg, Albert G. and Sohraby, Kazem, editors, INFOCOM, pages 828836. IEEE, 2012.Google Scholar
Haubold, Bernhard, Pierstorff, Nora, Möller, Friedrich, and Wiehe, Thomas. Genome comparison without alignment using shortest unique substrings. BMC Bioinformatics, 6(1):123, 2005.CrossRefGoogle ScholarPubMed
Haussler, David. Convolution kernels on discrete structures. Technical report, UC Santa Cruz, 1999.Google Scholar
Herold, Julia, Kurtz, Stefan, and Giegerich, Robert. Efficient computation of absent words in genomic sequences. BMC Bioinformatics, 9(1):167, 2008.CrossRefGoogle ScholarPubMed
Hickey, Glenn, Heller, David, Monlong, Jean, et al. Genotyping structural variants in pangenome graphs using the vg toolkit. Genome Biology, 21(1):117, 2020.CrossRefGoogle ScholarPubMed
Hirschberg, Daniel S. A linear space algorithm for computing maximal common subsequences. Communications of the ACM, 18(6):341343, 1975.CrossRefGoogle Scholar
Hoare, C. A. R. Quicksort. The Computer Journal, 5(1):1015, 1962.CrossRefGoogle Scholar
Holley, Guillaume and Melsted, Páll. Bifrost: Highly parallel construction and indexing of colored and compacted de Bruijn graphs. Genome Biology, 21(1):249, 2020.CrossRefGoogle ScholarPubMed
Hon, Wing-Kai and Sadakane, Kunihiko. Space-economical algorithms for finding maximal unique matches. In Proceedings of the 13th Annual Symposium on Combinatorial Pattern Matching (CPM’02), volume 2373 of Lecture Notes in Computer Science, pages 144152. Springer-Verlag, 2002.Google Scholar
Hon, Wing-Kai, Sadakane, Kunihiko, and Sung, Wing-Kin. Breaking a time-and-space barrier in constructing full-text indices. In Proceedings of the 44th IEEE Symposium on Foundations of Computer Science (FOCS), pages 251260, 2003.Google Scholar
Hoobin, Christopher, Puglisi, Simon J., and Zobel, Justin. Relative Lempel–Ziv factorization for efficient storage and retrieval of web collections. Proceedings of the VLDB Endowment, 5(3):265273, 2011.CrossRefGoogle Scholar
Hopcroft, John E. and Karp, Richard M.. An n5/2 algorithm for maximum matchings in bipartite graphs. SIAM Journal on Computing (SICOMP), 2(4):225231, 1973.CrossRefGoogle Scholar
Hormozdiari, Fereydoun, Alkan, Can, Eichler, Evan E., and Sahinalp, S. Cenk. Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes. Genome Research, 19(7):12701278, 2009.CrossRefGoogle ScholarPubMed
Hu, T. C. Minimum-cost flows in convex-cost networks. Naval Research Logistics Quarterly, 13(1):19, 1966.CrossRefGoogle Scholar
Huang, Katherine, Brady, Arthur, Mahurkar, Anup, White, Owen, Gevers, Dirk, Huttenhower, Curtis, and Segata, Nicola. Metaref: A pan-genomic database for comparative and community microbial genomics. Nucleic Acids Research, page gkt1078, 2013a.Google Scholar
Huang, Lin, Popic, Victoria, and Batzoglou, Serafim. Short read alignment with populations of genomes. Bioinformatics, 29(13):361370, 2013b.CrossRefGoogle ScholarPubMed
Hui, Lucas Chi Kwong. Color set size problem with application to string matching. In Proceedings of the Annual Symposium on Combinatorial Pattern Matching (CPM), volume 644 of LNCS, pages 230243. Springer, 1992.Google Scholar
Hunt, James W., and Szymanski, Thomas G.. A fast algorithm for computing longest common subsequences. Communications of the ACM, 20(5):350353, 1977.Google Scholar
Hunt, Martin, Letcher, Brice, Malone, Kerri M., et al. Minos: Variant adjudication and joint genotyping of cohorts of bacterial genomes. Genome Biology, 23(1):123, 2022.CrossRefGoogle ScholarPubMed
Huson, Daniel H., Reinert, Knut, and Myers, Eugene W.. The greedy path-merging algorithm for sequence assembly. In RECOMB, pages 157163, 2001.CrossRefGoogle Scholar
Huson, Daniel H., Mitra, Suparna, Ruscheweyh, Hans-Joachim, Weber, Nico, and Schuster, Stephan C.. Integrative analysis of environmental sequences using megan4. Genome Research, 21(9):15521560, 2011.CrossRefGoogle ScholarPubMed
Hyyrö, Heikki. Explaining and extending the bit-parallel approximate string matching algorithm of Myers. Technical report, Department of Computer and Information Sciences, University of Tampere, Finland, 2001. Technical report A2001-10.Google Scholar
Ikeno, Nobuichi. A limit of crosspoint number. IRE Transactions on Circuit Theory, 6(5):187196, 1959.CrossRefGoogle Scholar
Iqbal, Zamin, Caccamo, Mario, Turner, Isaac, Flicek, Paul, and McVean, Gil. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nature Genetics, 44(2):226232, 2012.CrossRefGoogle ScholarPubMed
Isenbarger, Thomas Anthony, Carr, Christopher E., Johnson, Sarah Stewart, et al. The most conserved genome segments for life detection on earth and other planets. Origins of Life and Evolution of Biospheres, 38:517533, 2008.CrossRefGoogle ScholarPubMed
Ivanov, Pesho, Bichsel, Benjamin, and Vechev, Martin. Fast and optimal sequence-to-graph alignment guided by seeds. In Pe’er, Itsik, editor, Research in Computational Molecular Biology, pages 306325. Springer, 2022.CrossRefGoogle Scholar
Jacobson, Guy. Space-efficient static trees and graphs. In Proceedings of the 30th IEEE Symposium on Foundations of Computer Science (FOCS), pages 549554, 1989.CrossRefGoogle Scholar
Jain, Chirag, Tavakoli, Neda, and Aluru, Srinivas. A variant selection framework for genome graphs. Bioinformatics, 37(Supplement 1):i460i467, 2021.CrossRefGoogle ScholarPubMed
Jain, Chirag, Gibney, Daniel, and Thankachan, Sharma V.. Colinear chaining with overlaps and gap costs. In Pe’er, Itsik, editor, Research in Computational Molecular Biology, pages 246262. Springer, 2022.Google Scholar
Jänicke, Stefan, Büchler, Marco, and Scheuermann, Gerik. Improving the layout for text variant graphs. In VisLR Workshop at LREC Conference, pages 4148, 2014.Google Scholar
Jenkins, Paul, Lyngsø, Rune B., and Hein, Jotun. How many transcripts does it take to reconstruct the splice graph? In Algorithms in Bioinformatics, 6th International Workshop, WABI 2006, Zurich, Switzerland, September 11–13, 2006, Proceedings, pages 103114, 2006.CrossRefGoogle Scholar
Jiang, Bai, Song, Kai, Ren, Jie, Deng, Minghua, Sun, Fengzhu, and Zhang, Xuegong. Comparison of metagenomic samples using sequence signatures. BMC Genomics, 13(1):730, 2012.Google Scholar
Kärkkäinen, Juha and Na, Joong Chae. Faster filters for approximate string matching. In Proceedings of the 9th Workshop on Algorithm Engineering and Experiments (ALENEX07), pages 8490. SIAM, 2007.Google Scholar
Kärkkäinen, Juha, Sanders, Peter, and Burkhardt, Stefan. Linear work suffix array construction. Journal of the ACM, 53(6):918936, 2006.Google Scholar
Karp, Richard M. and Rabin, Michael O.. Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, 31(2):249260, 1987.CrossRefGoogle Scholar
Kasai, Toru, Lee, Gunho, Arimura, Hiroki, Arikawa, Setsuo, and Park, Kunsoo. Linear-time longest-common-prefix computation in suffix arrays and its applications. In Proceedings of the Annual Symposium on Combinatorial Pattern Matching (CPM), volume 2089 of LNCS, pages 181192. Springer, 2001.Google Scholar
Kautz, W. Bounds on directed (d,k) graphs. Theory of Cellular Logic Networks and Machines, Stanford Research Institute, pages 2028, 1968.Google Scholar
Kececioglu, John. Exact and approximation algorithms for DNA sequence reconstruction. PhD thesis, The University of Arizona, USA, 1991.Google Scholar
Kececioglu, John D. and Myers, Eugene W.. Combinatorial algorithms for DNA sequence assembly. Algorithmica, 13:751, 1993.CrossRefGoogle Scholar
Khan, Jamshed, Kokot, Marek, Deorowicz, Sebastian, and Patro, Rob. Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2. Genome Biology, 23(1):132, 2022.CrossRefGoogle Scholar
Kim, Dong Kyue, Sim, Jeong Seop, Park, Heejin, and Park, Kunsoo. Constructing suffix arrays in linear time. Journal of Discrete Algorithms, 3(2–4):126142, 2005.CrossRefGoogle Scholar
Kim, Daehwan, Paggi, Joseph M., Park, Chanhee, Bennett, Christopher, and Salzberg, Steven L.. Graph-based genome alignment and genotyping with hisat2 and hisat-genotype. Nature Biotechnology, 37:907915, 2019.CrossRefGoogle ScholarPubMed
Kloster, Kyle, Kuinke, Philipp, O’Brien, Michael P., et al. A practical FPT algorithm for flow decomposition and transcript assembly. In Pagh, Rasmus and Venkatasubramanian, Suresh, editors, Proceedings of the Twentieth Workshop on Algorithm Engineering and Experiments, ALENEX 2018, New Orleans, LA, USA, January 7–8, 2018, pages 7586. SIAM, 2018.Google Scholar
Knuth, Donald E., Morris, James H., and Pratt, Vaughan R.. Fast pattern matching in strings. SIAM Journal of Computing, 6(2):323350, 1977.CrossRefGoogle Scholar
Ko, Pang and Aluru, Srinivas. Space efficient linear time construction of suffix arrays. Journal of Discrete Algorithms, 3(2–4):143156, 2005.CrossRefGoogle Scholar
Koch, Philipp, Platzer, Matthias, and Downie, Bryan R.. RepARK–de novo creation of repeat libraries from whole-genome NGS reads. Nucleic Acids Research, 42(9):e80e80, 03 2014.CrossRefGoogle ScholarPubMed
Kovaka, Sam, Fan, Yunfan, Ni, Bohan, Timp, Winston, and Schatz, Michael C.. Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED. Nature Biotechnology, 39(4):431441, 2021.CrossRefGoogle ScholarPubMed
Krogh, Anders, Brown, Michael, Mian, I. Saira, Sjölander, Kimmen, and Haussler, David. Hidden Markov models in computational biology: Applications to protein modeling. Journal of Molecular Biology, 235(5):15011531, 1994.CrossRefGoogle ScholarPubMed
Kuksa, Pavel P. and Pavlovic, Vladimir. Efficient evaluation of large sequence kernels. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 759767. ACM, 2012.Google Scholar
Kuksa, Pavel P., Huang, Pai-Hsi, and Pavlovic, Vladimir. Scalable algorithms for string kernels with inexact matching. In NIPS, pages 881888, 2008.Google Scholar
Kulekci, M. Oguzhan, Vitter, Jeffrey Scott, and Xu, Bojian. Efficient maximal repeat finding using the Burrows–Wheeler transform and wavelet tree. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 9(2):421429, 2012.Google Scholar
Kurtz, Stefan, Choudhuri, Jomuna V., Ohlebusch, Enno, Schleiermacher, Chris, Stoye, Jens, and Giegerich, Robert. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Research, 29(22):46334642, 11 2001.CrossRefGoogle ScholarPubMed
Kuruppu, Shanika, Puglisi, Simon J., and Zobel, Justin. Optimized relative Lempel–Ziv compression of genomes. In Proceedings of the Thirty-Fourth Australasian Computer Science Conference-Volume 113, pages 9198. Australian Computer Society, Inc., 2011a.Google Scholar
Kuruppu, Shanika, Puglisi, Simon J., and Zobel, Justin. Reference sequence construction for relative compression of genomes. In Grossi, Roberto, Sebastiani, Fabrizio, and Silvestri, Fabrizio, editors, String Processing and Information Retrieval, pages 420425. Springer, 2011b.CrossRefGoogle Scholar
Lam, Tak Wah, Sung, Wing-Kin, Tam, Siu-Lung, Wong, Chi-Kwong, and Yiu, Siu-Ming. Compressed indexing and local alignment of DNA. Bioinformatics, 24(6):791797, 2008.Google Scholar
Landau, Gad M. and Vishkin, Uzi. Fast string matching with k differences. Journal of Computer and System Sciences, 37:6378, 1988.CrossRefGoogle Scholar
Landau, Gad M. and Vishkin, Uzi. Fast parallel and serial approximate string matching. Journal of Algorithms, 10(2):157169, 1989.CrossRefGoogle Scholar
Langmead, Ben, Trapnell, Cole, Pop, Maihai, and Salzberg, Steven L.. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 10(3): R25, 2009.Google Scholar
Lecompte, Lolita, Peterlongo, Pierre, Lavenier, Dominique, and Lemaitre, Claire. SVJedi: Genotyping structural variations with long reads. Bioinformatics, 36(17):45684575, 2020.CrossRefGoogle ScholarPubMed
Lee, Christopher, Grasso, Catherine, and Sharlow, Mark F.. Multiple sequence alignment using partial order graphs. Bioinformatics, 18(3):45264, 2002.CrossRefGoogle ScholarPubMed
Lehman, Eric and Shelat, Abhi. Approximation algorithms for grammar-based compression. In Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 205212. Society for Industrial and Applied Mathematics, 2002.Google Scholar
Leslie, Christina and Kuang, Rui. Fast kernels for inexact string matching. In Learning Theory and Kernel Machines, pages 114128. Springer, 2003.CrossRefGoogle Scholar
Letcher, Brice, Hunt, Martin, and Iqbal, Zamin. Gramtools enables multiscale variation analysis with genome graphs. Genome Biology, 22(1):127, 2021.CrossRefGoogle ScholarPubMed
Levenshtein, Vladimir I. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10:707, 1966.Google Scholar
Li, Gaoyang, Jiang, Tao, Li, Junyi, and Wang, Yadong. PanSVR: Pan-genome augmented short read realignment for sensitive detection of structural variations. Frontiers in Genetics, Volume 12, 2021.Google ScholarPubMed
Li, Heng. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics, 34(18): 30943100, 2018.CrossRefGoogle ScholarPubMed
Li, Heng and Durbin, Richard. Fast and accurate short read alignment with BurrowsWheeler transform. Bioinformatics, 25(14):17541760, 2009.CrossRefGoogle ScholarPubMed
Li, Heng, Feng, Xiaowen, and Chu, Chong. The design and construction of reference pangenome graphs with minigraph. Genome Biology, 21(1):119, 2020.Google Scholar
Li, Jingyi Jessica, Jiang, Ci-Ren, Brown, James B., Huang, Haiyan, and Bickel, Peter J.. Sparse linear modeling of next-generation mRNA sequencing (RNA-Seq) data for isoform discovery and abundance estimation. Proceedings National Academy of Sciences, 108(50): 1986719872, 2011.CrossRefGoogle ScholarPubMed
Li, Ming and Vitányi, Paul M. B.. An Introduction to Kolmogorov Complexity and Its Applications. Springer, 2009.Google Scholar
Li, Ming, Chen, Xin, Li, Xin, Ma, Bin, and Vitányi, Paul M. B.. The similarity metric. IEEE Transactions on Information Theory, 50(12):32503264, 2004.Google Scholar
Li, Ruiqiang, Yu, Chang, Li, Yingrui, et al. Soap2. Bioinformatics, 25(15):19661967, 2009.CrossRefGoogle ScholarPubMed
Li, Wei, Feng, Jianxing, and Jiang, Tao. IsoLasso: A LASSO regression approach to RNA-Seq based transcriptome assembly. Journal of Computational Biology, 18(11):1693707, 2011.CrossRefGoogle ScholarPubMed
Lifshits, Yury, Mozes, Shay, Weimann, Oren, and Ziv-Ukelson, Michal. Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica, 54(3):379399, 2009.CrossRefGoogle Scholar
Lin, Jie, Adjeroh, Donald, and Jiang, Bing-Hua. Probabilistic suffix array: Efficient modeling and prediction of protein families. Bioinformatics, 28(10):13141323, 04 2012a.CrossRefGoogle ScholarPubMed
Lin, Yen-Yi, Dao, Phuong, Hach, Faraz, et al. CLIIQ: Accurate comparative detection and quantification of expressed isoforms in a population. In Raphael, Benjamin J. and Tang, Jijun, editors, WABI, volume 7534 of Lecture Notes in Computer Science, pages 178189. Springer, 2012b.Google Scholar
Lindsay, James, Salooti, Hamed, Zelikovsky, Alex, and MȈandoiu, Ion. Scalable genome scaffolding using integer linear programming. In Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB ’12, New York, NY, USA, pages 377383. ACM, 2012.CrossRefGoogle Scholar
Liu, Bo, Gibbons, Theodore, Ghodsi, Mohammad, Treangen, Todd, and Pop, Mihai. Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences. BMC Genomics, 12(Suppl 2):S4, 2011.CrossRefGoogle ScholarPubMed
Lo, Christine, Kim, Sangwoo, Zakov, Shay, and Bafna, Vineet. Evaluating genome architecture of a complex region via generalized bipartite matching. BMC Bioinformatics, 14(Suppl 5): S13, 2013.Google Scholar
Lodhi, Huma, Saunders, Craig, Shawe-Taylor, John, Cristianini, Nello, and Watkins, Chris. Text classification using string kernels. The Journal of Machine Learning Research, 2:419444, 2002.Google Scholar
Löytynoja, Ari, Vilella, Albert J., and Goldman, Nick. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics, 28(13):16841691, 2012.CrossRefGoogle ScholarPubMed
Maciuca, Sorina, del Ojo Elias, Carlos, McVean, Gil, and Iqbal, Zamin. A natural encoding of genetic variation in a Burrows–Wheeler transform to enable mapping and genome inference. In International Workshop on Algorithms in Bioinformatics, pages 222233. Springer, 2016.CrossRefGoogle Scholar
Maier, David. The complexity of some problems on subsequences and supersequences. Journal of the ACM, 25(2):322336, 1978.Google Scholar
Maillet, Nicolas, Lemaitre, Claire, Chikhi, Rayan, Lavenier, Dominique, and Peterlongo, Pierre. Compareads: comparing huge metagenomic experiments. BMC Bioinformatics, 13(Suppl 19):S10, 2012.CrossRefGoogle ScholarPubMed
Mäkinen, Veli and Navarro, Gonzalo. Succinct suffix arrays based on run-length encoding. Nordic Journal of Computing, 12(1):4066, 2005.Google Scholar
Mäkinen, Veli and Navarro, Gonzalo. Rank and select revisited and extended. Theoretical Computer Science, 387(3):332347, 2007.CrossRefGoogle Scholar
Mäkinen, Veli and Norri, Tuukka. Applying the positional Burrows–Wheeler transform to all-pairs Hamming distance. Information Processing Letters, 146:1719, 2019.CrossRefGoogle Scholar
Mäkinen, Veli and Sahlin, Kristoffer. Chaining with overlaps revisited. In Gørtz, Inge Li and Weimann, Oren, editors, 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020, June 17-19, 2020, Copenhagen, Denmark, volume 161 of LIPIcs, pages 25:125:12. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020.Google Scholar
Mäkinen, Veli, Navarro, Gonzalo, and Ukkonen, Esko. Algorithms for transposition invariant string matching. In Proceedings of the 20th International Symposium on Theoretical Aspects of Computer Science (STACS’03), volume 2607 of LNCS, pages 191202. Springer-Verlag, 2003.Google Scholar
Mäkinen, Veli, Navarro, Gonzalo, Síren, Jouni, and Välimäki, Niko. Storage and retrieval of individual genomes. In Proceedings of the 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), volume 5541 of LNBI, pages 121137. Springer-Verlag, 2009.Google Scholar
Mäkinen, Veli, Navarro, Gonzalo, Sirén, Jouni, and Välimäki, Niko. Storage and retrieval of highly repetitive sequence collections. Journal of Computational Biology, 17(3):281308, 2010.CrossRefGoogle ScholarPubMed
Mäkinen, Veli, Salmela, Leena, and Ylinen, Johannes. Normalized N50 assembly metric using gap-restricted co-linear chaining. BMC Bioinformatics, 13:255, 2012.CrossRefGoogle ScholarPubMed
Mäkinen, Veli, Tomescu, Alexandru I., Kuosmanen, Anna, Paavilainen, Topi, Gagie, Travis, and Chikhi, Rayan. Sparse dynamic programming on DAGs with small width. ACM Transactions on Algorithms, 15(2):29:129:21, 2019.CrossRefGoogle Scholar
Mäklin, Tommi, Kallonen, Teemu, Alanko, Jarno, et al. Bacterial genomic epidemiology with mixed samples. Microbial Genomics, 7(11), 2021.CrossRefGoogle ScholarPubMed
Manber, Udi and Myers, Gene. Suffix arrays: A new method for on-line string searches. SIAM Journal on Computing, 22(5):935948, 1993.CrossRefGoogle Scholar
Marco-Sola, Santiago, Moure, Juan Carlos, Moreto, Miquel, and Espinosa, Antonio. Fast gap-affine pairwise alignment using the wavefront algorithm. Bioinformatics, 37(4):456463, 2020.CrossRefGoogle Scholar
Marcus, Shoshana, Lee, Hayan, and Schatz, Michael C.. SplitMEM: A graphical algorithm for pan-genome analysis with suffix skips. Bioinformatics, 30(24):34763483, 2014.CrossRefGoogle ScholarPubMed
Marschall, Tobias, Costa, Ivan G., Canzar, Stefan, et al. Clever: Clique-enumerating variant finder. Bioinformatics, 28(22):28752882, 2012.CrossRefGoogle ScholarPubMed
Masek, William and Paterson, Michael. A faster algorithm for computing string edit distances. Journal of Computer and System Sciences, 20(1):1831, 1980.CrossRefGoogle Scholar
Mathé, Catherine, Sagot, Marie-France, Schiex, Thomas, and Rouzé, Pierre. Current methods of gene prediction, their strengths and weaknesses. Nucleic Acids Research, 30(19):41034117, 2002.CrossRefGoogle ScholarPubMed
McCreight, Edward M. A space-economical suffix tree construction algorithm. Journal of the ACM, 23(2):262272, 1976.CrossRefGoogle Scholar
Medvedev, Paul, Georgiou, Konstantinos, Myers, Gene, and Brudno, Michael. Computability of models for sequence assembly. In Workshop on Algorithms in Bioinformatics (WABI 2007), pages 289301, 2007.Google Scholar
Medvedev, Paul, Fiume, Marc, Dzamba, Misko, Smith, Tim, and Brudno, Michael. Detecting copy number variation with mated short reads. Genome Research, 20(11):16131622, 2010.Google Scholar
Minkin, Ilia, Pham, Son, and Medvedev, Paul. TwoPaCo: An efficient algorithm to build the compacted de Bruijn graph from many complete genomes. Bioinformatics, 33(24):40244032, 2017.CrossRefGoogle Scholar
Minoux, Michel. Solving integer minimum cost flows with separable convex cost objective polynomially. In Gallo, Giorgio and Sandi, Claudio, editors, Netflow at Pisa, volume 26 of Mathematical Programming Studies, pages 237239. Springer, 1986.Google Scholar
Mitzenmacher, Michael and Upfal, Eli. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis. Cambridge University Press, 2017.Google Scholar
Moore, Edward. The shortest path through a maze, in Aiken, Howard, ed., Proceedings of an International Symposium on the Theory of Switching, 2–5 April 1957, Part II, pages 285292. Harvard University Press, 1959.Google Scholar
Morris, James H. and Pratt, Vaughan R.. A linear pattern-matching algorithm. Technical Report 40, University of California, Berkeley, 1970.Google Scholar
Muggli, Martin D., Bowe, Alexander, Noyes, Noelle R., et al. Succinct colored de Bruijn graphs. Bioinformatics, 33(20):31813187, 2017.CrossRefGoogle ScholarPubMed
Munro, I. Tables. In Proceedings of the 16th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), LNCS v. 1180, pages 3742, 1996.CrossRefGoogle Scholar
Myers, Eugene W. The fragment assembly string graph. Bioinformatics, 21(suppl 2):ii79ii85, 2005.Google Scholar
Myers, Gene. A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM, 46(3):395415, 1999.Google Scholar
Na, Joong Chae, Kim, Hyunjoon, Park, Heejin, et al. FM-index of alignment: A compressed index for similar strings. Theoretical Computer Science, 638:159170, 2016.CrossRefGoogle Scholar
Na, Joong Chae, Kim, Hyunjoon, Min, Seunghwan, et al. Fm-index of alignment with gaps. Theoretical Computer Science, 710:148157, 2018.CrossRefGoogle Scholar
Nagarajan, Niranjan and Pop, Mihai. Parametric complexity of sequence assembly: Theory and applications to next generation sequencing. Journal of Computational Biology, 16(7):897908, 2009.CrossRefGoogle Scholar
Navarro, Gonzalo. Compact Data Structures: A Practical Approach. Cambridge University Press, 2016.CrossRefGoogle Scholar
Navarro, Gonzalo. Improved approximate pattern matching on hypertext. Theoretical Computer Science, 237(1-2):455463, 2000.Google Scholar
Navarro, Gonzalo. A guided tour to approximate string matching. ACM Computing Surveys, 33(1):3188, 2001.Google Scholar
Navarro, Gonzalo. Wavelet trees for all. In Kärkkäinen, Juha and Stoye, Jens, editors, Combinatorial Pattern Matching, volume 7354 of Lecture Notes in Computer Science, pages 226. Springer, 2012.Google Scholar
Navarro, Gonzalo and Mäkinen, Veli. Compressed full-text indexes. ACM Computing Surveys, 39(1):Article 2, 2007.CrossRefGoogle Scholar
Navarro, Gonzalo and Raffinot, Mathieu. Flexible Pattern Matching in Strings – Practical On-Line Search Algorithms for Texts and Biological Sequences. Cambridge University Press, 2002.CrossRefGoogle Scholar
Needleman, Saul B. and Wunsch, Christian D.. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3):443453, 1970.Google Scholar
Nevill-Manning, Craig G. Inferring sequential structure. PhD thesis, Citeseer, 1996.Google Scholar
Nishimoto, Takaaki, Kanda, Shunsuke, and Tabei, Yasuo. An optimal-time RLBWT construction in BWT-runs bounded space. In Bojanczyk, Mikolaj, Merelli, Emanuela, and Woodruff, David P., editors, 49th International Colloquium on Automata, Languages, and Programming, ICALP 2022, July 4–8, 2022, Paris, France, volume 229 of LIPIcs, pages 99:199:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022.Google Scholar
Norri, Tuukka, Cazaux, Bastien, Kosolobov, Dmitry, and Mäkinen, Veli. Linear time minimum segmentation enables scalable founder reconstruction. Algorithms for Molecular Biology, 14(1):12:112:15, 2019.CrossRefGoogle ScholarPubMed
Norri, Tuukka, Cazaux, Bastien, Dönges, Saska, Valenzuela, Daniel, and Mäkinen, Veli. Founder reconstruction enables scalable and seamless pangenomic analysis. Bioinformatics, 37(24): 46114619, 2021.CrossRefGoogle ScholarPubMed
Ntafos, Simeon C. and Hakimi, S. Louis. On path cover problems in digraphs and applications to program testing. IEEE Transactions on Software Engineering, 5(5):520529, 1979.CrossRefGoogle Scholar
Nykänen, Matti and Ukkonen, Esko. The exact path length problem. Journal of Algorithms, 42(1):4152, 2002.Google Scholar
Ohlebusch, Enno, Gog, Simon, and Kügel, Adrian. Computing matching statistics and maximal exact matches on compressed full-text indexes. In SPIRE, pages 347358, 2010.CrossRefGoogle Scholar
Okanohara, Daisuke and Tsujii, Jun’ichi. Text categorization with all substring features. In Proceedings of SDM 2009, pages 838846, 04 2009.Google Scholar
Onodera, Taku, Sadakane, Kunihiko, and Shibuya, Tetsuo. Detecting superbubbles in assembly graphs. In WABI, pages 338348, 2013.CrossRefGoogle Scholar
Orlin, James B. A faster strongly polynominal minimum cost flow algorithm. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing, May 2–4, 1988, Chicago, Illinois, USA, pages 377387, 1988.Google Scholar
Orlin, James B. A faster strongly polynomial minimum cost flow algorithm. Operations Research, 41:338350, 1993.CrossRefGoogle Scholar
Pabinger, Stephan, Dander, Andreas, Fischer, Maria, et al. A survey of tools for variant analysis of next-generation genome sequencing data. Briefings in Bioinformatics, 2013.CrossRefGoogle Scholar
Pagh, Anna, Pagh, Rasmus, and Rao, S. Srinivasa. An optimal Bloom filter replacement. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 823829, 2005.Google Scholar
Pandey, Prashant, Gao, Yinjie, and Kingsford, Carl. VariantStore: An index for large-scale genomic variant search. Genome Biology, 22(1):125, 2021.CrossRefGoogle ScholarPubMed
Parida, Laxmi. Pattern Discovery in Bioinformatics: Theory and Algorithms. Chapman & Hall/CRC, 2007.CrossRefGoogle Scholar
Paten, Benedict, Novak, Adam M., Eizenga, Jordan M., and Garrison, Erik. Genome graphs and the evolution of genome inference. Genome Research, 27(5):665676, 2017.Google Scholar
Patterson, Murray, Marschall, Tobias, Pisanti, Nadia, et al. WhatsHap: Haplotype assembly for future-generation sequencing reads. In 18th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2014), volume 8394 of Lecture Notes in Computer Science, pages 237249. Springer, 2014.Google Scholar
Payne, Alexander, Holmes, Nadine, Clarke, Thomas, et al. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nature Biotechnology, 39(4):442450, 2021.Google Scholar
Pell, Jason, Hintze, Arend, Canino-Koning, Rosangela, Howe, Adina, Tiedje, James M., and Brown, C. Titus. Scaling metagenome sequence assembly with probabilistic de Bruijn graphs. Proceedings of the National Academy of Sciences, 109(33):1327213277, 2012.CrossRefGoogle ScholarPubMed
Pellicer, Jaume, Fay, Michael F., and Leitch, Ilia J.. The largest eukaryotic genome of them all? Botanical Journal of the Linnean Society, 164(1):1015, 2010.CrossRefGoogle Scholar
Pevzner, Pavel A. ℓ-tuple DNA sequencing: Computer analysis. Journal of Biomolecular Structure and Dynamics, 7(1):6373, 1989.CrossRefGoogle Scholar
Pevzner, Pavel A., Tang, Haixu, and Waterman, Michael S.. A new approach to fragment assembly in DNA sequencing. In RECOMB, pages 256267, 2001.Google Scholar
Pijls, Wim and Potharst, Rob. Another note on Dilworth’s decomposition theorem. Journal of Discrete Mathematics, 2013:692645, 2013.CrossRefGoogle Scholar
Policriti, Alberto and Prezza, Nicola. Hashing and Indexing: Succinct data structures and smoothed analysis. In ISAAC 2014 – 25th International Symposium on Algorithms and Computation, 2014.CrossRefGoogle Scholar
Policriti, Alberto and Prezza, Nicola. Fast randomized approximate string matching with succinct hash data structures. BMC Bioinformatics, 16 Suppl 9:S4, 2015.CrossRefGoogle ScholarPubMed
Policriti, Alberto, Tomescu, Alexandru I., and Vezzi, Francesco. A randomized numerical aligner (rNA). Journal of Computer and System Sciences, 78(6):18681882, 2012.Google Scholar
Prezza, Nicola, Pisanti, Nadia, Sciortino, Marinella, and Rosone, Giovanna. Detecting mutations by eBWT. In Parida, Laxmi and Ukkonen, Esko, editors, 18th International Workshop on Algorithms in Bioinformatics, WABI 2018, August 20-22, 2018, Helsinki, Finland, volume 113 of LIPIcs, pages 3:13:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018.Google Scholar
Pritt, Jacob, Chen, Nae-Chyun, and Langmead, Ben. FORGe: Prioritizing variants for graph genomes. Genome Biology, 19(1):116, 2018.CrossRefGoogle ScholarPubMed
Puglisi, Simon J., Smyth, William F., and Turpin, Andrew. A taxonomy of suffix array construction algorithms. ACM Computing Surveys, 39(2), 2007.CrossRefGoogle Scholar
Qi, Ji, Wang, Bin, and Hao, Bai-Iin. Whole proteome prokaryote phylogeny without sequence alignment: A K-string composition approach. Journal of Molecular Evolution, 58(1):111, 2004.Google Scholar
Qin, Zhaohui S., Yu, Jianjun, Shen, Jincheng, et al. HPeak: An HMM-based algorithm for defining read-enriched regions in ChIP-Seq data. BMC Bioinformatics, 11:369, 2010.CrossRefGoogle ScholarPubMed
Qiu, Yutong and Kingsford, Carl. Constructing small genome graphs via string compression. Bioinformatics, 37(Supplement 1):i205i213, 2021.Google Scholar
Rabin, Michael O. and Scott, Dana. Finite automata and their decision problems. IBM Journal of Research and Development, 3(2):114125, 1959.Google Scholar
Raffinot, Mathieu. On maximal repeats in strings. Information Processing Letters, 80(3):165169, 2001.CrossRefGoogle Scholar
Räihä, Kari-Jouko and Ukkonen, Esko. The shortest common supersequence problem over binary alphabet is NP-complete. Theoretical Computer Science, 16:187198, 1981.CrossRefGoogle Scholar
Rakocevic, Goran, Semenyuk, Vladimir, Lee, Wan-Ping, et al. Fast and accurate genomic analyses using genome graphs. Nature Genetics, 51(2):354362, 2019.Google Scholar
Rautiainen, Mikko and Marschall, Tobias. Aligning sequences to general graphs in O(V + mE) time. bioRxiv, 2017. doi: 10.1101/216127.Google Scholar
Rautiainen, Mikko and Marschall, Tobias. GraphAligner: Rapid and versatile sequence-to-graph alignment. Genome Biology, 21(1):128, 2020.CrossRefGoogle ScholarPubMed
Rautiainen, Mikko and Marschall, Tobias. MBG: Minimizer-based sparse de Bruijn graph construction. Bioinformatics, 37(16):24762478, 2021.CrossRefGoogle ScholarPubMed
Reinert, Gesine, Chew, David, Sun, Fengzhu, and Waterman, Michael S.. Alignment-free sequence comparison (i): Statistics and power. Journal of Computational Biology, 16(12): 16151634, 2009.Google Scholar
Ren, Jie, Song, Kai, Sun, Fengzhu, Deng, Minghua, and Reinert, Gesine. Multiple alignment-free sequence comparison. Bioinformatics, 29(21):26902698, 2013.CrossRefGoogle ScholarPubMed
Rho, Mina, Choi, Jeong-Hyeon, Kim, Sun, Lynch, Michael, and Tang, Haixu. De novo identiffication of LTR retrotransposons in eukaryotic genomes. BMC Genomics, 8:90, 2007.CrossRefGoogle Scholar
Rieck, Konrad and Laskov, Pavel. Linear-time computation of similarity measures for sequential data. Journal of Machine Learning Research, 9:2348, 2008.Google Scholar
Rieck, Konrad, Laskov, Pavel, and Sonnenburg, Sören. Computation of similarity measures for sequential data using generalized suffix trees. In Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference. The MIT Press, 2007.Google Scholar
Rizzi, Romeo, Tomescu, Alexandru I., and Mäkinen, Veli. On the complexity of minimum path cover with subpath constraints for multi-assembly. BMC Bioinformatics, 15(9):S5, 2014.CrossRefGoogle ScholarPubMed
Roberts, Michael, Hayes, Wayne, Hunt, Brian R., Mount, Stephen M., and Yorke, James A.. Reducing storage requirements for biological sequence comparison. Bioinformatics, 20(18): 33633369, 2004.Google Scholar
Rødland, Einar Andreas. Compact representation of k-mer de Bruijn graphs for genome read assembly. BMC Bioinformatics, 14(1):313, 2013.Google Scholar
Ronen, Roy, Boucher, Christina, Chitsaz, Hamidreza, and Pevzner, Pavel. SEQuel: Improving the accuracy of genome assemblies. Bioinformatics, 28(12):i188i196, 2012.CrossRefGoogle ScholarPubMed
Rousu, Juho, Shawe-Taylor, John, and Jaakkola, Tommi. Efficient computation of gapped substring kernels on large alphabets. Journal of Machine Learning Research, 6(9), 2005.Google Scholar
Ruan, Jue and Li, Heng. Fast and accurate long-read assembly with wtdbg2. Nature Methods, 17(2):155158, 2020.CrossRefGoogle ScholarPubMed
Rytter, Wojciech. Application of Lempel–Ziv factorization to the approximation of grammar-based compression. Theoretical Computer Science, 302(1):211222, 2003.CrossRefGoogle Scholar
Sacomoto, Gustavo. Efficient algorithms for de novo assembly of alternative splicing events from RNA-seq data. PhD thesis, The University Claude Bernard Lyon 1, Lyon, France, 2014.Google Scholar
Sadakane, Kunihiko. Compressed text databases with efficient query algorithms based on the compressed suffix array. In Proceedings of the 11th International Symposium on Algorithms and Computation (ISAAC), LNCS v. 1969, pages 410421, 2000.Google Scholar
Sahlin, Kristoffer, Street, Nathaniel, Lundeberg, Joakim, and Arvestad, Lars. Improved gap size estimation for scaffolding algorithms. Bioinformatics, 28(17):22152222, 2012.CrossRefGoogle ScholarPubMed
Sakharkar, Meena K., Chow, Vincent T., and Kangueane, Pandjassarame. Distributions of exons and introns in the human genome. In Silico Biol, 4(4):387393, 2004.Google Scholar
Salikhov, Kamil, Sacomoto, Gustavo, and Kucherov, Gregory. Using cascading Bloom filters to improve the memory usage for de Brujin graphs. In Algorithms in Bioinformatics, pages 364376. Springer, 2013.CrossRefGoogle Scholar
Salmela, Leena and Schröder, Jan. Correcting errors in short reads by multiple alignments. Bioinformatics, 27(11):14551461, 2011.CrossRefGoogle ScholarPubMed
Salmela, Leena, Mäkinen, Veli, Välimäki, Niko, Ylinen, Johannes, and Ukkonen, Esko. Fast scaffolding with small independent mixed integer programs. Bioinformatics, 27(23):32593265, 2011.Google Scholar
Salmela, Leena, Sahlin, Kristoffer, Mäkinen, Veli, and Tomescu, Alexandru I.. Gap filling as exact path length problem. Journal of Computational Biology, 23(5):347361, 2016.CrossRefGoogle ScholarPubMed
Sankoff, David. Minimal mutation trees of sequences. SIAM Journal on Applied Mathematics, 28:3542, 1975.CrossRefGoogle Scholar
Schleimer, Saul, Wilkerson, Daniel S., and Aiken, Alex. Winnowing: Local algorithms for document fingerprinting. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, SIGMOD ’03, page 7685, New York, NY, USA, 2003. Association for Computing Machinery.Google Scholar
Schnattinger, Thomas, Ohlebusch, Enno, and Gog, Simon. Bidirectional search in a string with wavelet trees. In 21st Annual Symposium on Combinatorial Pattern Matching (CPM 2010), volume 6129 of Lecture Notes in Computer Science, pages 4050. Springer, 2010.Google Scholar
Schneeberger, Korbinian, Hagmann, Jörg, Ossowski, Stephan, et al. Simultaneous alignment of short reads against multiple genomes. Genome Biology, 10:R98, 2009.CrossRefGoogle ScholarPubMed
Schröder, Jan, Schrder, Heiko, Puglisi, Simon J., Sinha, Ranjan, and Schmidt, Bertil. SHREC: A short-read error correction method. Bioinformatics, 25(17):21572163, 2009.Google Scholar
Schrijver, Alexander. Combinatorial Optimization. Polyhedra and Efficiency. Vol. A, volume 24 of Algorithms and Combinatorics. Springer-Verlag, 2003. Paths, flows, matchings, Chapters 1–38.Google Scholar
Schulz, Marcel H., Weese, David, Rausch, Tobias, Döring, Andreas, Reinert, Knut, and Vingron, Martin. Fast and adaptive variable order Markov chain construction. In Crandall, Keith A. and Lagergren, Jens, editors, Algorithms in Bioinformatics, pages 306317. Springer, 2008.CrossRefGoogle Scholar
Schulz, Tizian, Wittler, Roland, and Stoye, Jens. Sequence-based pangenomic core detection. iScience, 25(6):104413, 2022.CrossRefGoogle ScholarPubMed
Segata, Nicola, Waldron, Levi, Ballarini, Annalisa, Narasimhan, Vagheesh, Jousson, Olivier, and Huttenhower, Curtis. Metagenomic microbial community profiling using unique clade-specific marker genes. Nature Methods, 9(8):811814, 2012.CrossRefGoogle ScholarPubMed
Seth, Sohan, Välimäki, Niko, Kaski, Samuel, and Honkela, Antti. Exploration and retrieval of whole-metagenome sequencing samples. Bioinformatics, 30(17):24712479, 2014.Google Scholar
Shafin, Kishwar, Pesout, Trevor, Lorig-Roach, Ryan, et al. Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes. Nature Biotechnology, 38(9):10441053, 2020.CrossRefGoogle ScholarPubMed
Sharma, Vineet K., Kumar, Naveen, Prakash, Tulika, and Taylor, Todd D.. Fast and accurate taxonomic assignments of metagenomic sequences using MetaBin. PLoS One, 7(4):e34030, 2012.CrossRefGoogle ScholarPubMed
Shawe-Taylor, John and Cristianini, Nello. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.CrossRefGoogle Scholar
Sherman, Rachel M. and Salzberg, Steven L.. Pan-genomics in the human genome era. Nature Reviews Genetics, 21(4):243254, 2020.CrossRefGoogle ScholarPubMed
Shimbel, A. Structure in communication nets. In Proceedings of the Symposium on Information Networks (New York, 1954), pages 199203. Polytechnic Press of the Polytechnic Institute of Brooklyn, 1955.Google Scholar
Sibbesen, Jonas Andreas, Maretty, Lasse, and Krogh, Anders. Accurate genotyping across variant classes and lengths using variant graphs. Nature Genetics, 50(7):10541059, 2018.Google Scholar
Simpson, Jared T. and Durbin, Richard. Efficient construction of an assembly string graph using the FM-index. Bioinformatics, 26(12):367373, 2010.Google Scholar
Sims, Gregory E., Jun, Se-Ran, Wu, Guohong A., and Kim, Sung-Hou. Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proceedings of the National Academy of Sciences, 106(8):26772682, 2009.CrossRefGoogle ScholarPubMed
Sirén, Jouni, Välimäki, Niko, and Mäkinen, Veli. Indexing finite language representation of population genotypes. In 11th International Workshop on Algorithms in Bioinformatics (WABI 2011), volume 6833 of Lecture Notes in Computer Science, pages 270281. Springer, 2011.Google Scholar
Sirén, Jouni, Välimäki, Niko, and Mäkinen, Veli. Indexing graphs for path queries with applications in genome research. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 11(2):375388, 2014.CrossRefGoogle ScholarPubMed
Sirén, Jouni, Garrison, Erik, Novak, Adam M., Paten, Benedict, and Durbin, Richard. Haplotype-aware graph indexes. Bioinformatics, 36(2):400407, 2020.Google Scholar
Sirén, Jouni, Monlong, Jean, Chang, Xian, et al. Pangenomics enables genotyping of known structural variants in 5202 diverse genomes. Science (New York, N.Y.), 374:abg8871, 2021.CrossRefGoogle ScholarPubMed
Smith, T. F. and Waterman, M. S.. Identification of common molecular subsequences. Journal of Molecular Biology, 147(1):195197, 1981.CrossRefGoogle ScholarPubMed
Smola, Alex J. and Vishwanathan, S. V. N.. Fast kernels for string and tree matching. In Becker, S., Thrun, S., and Obermayer, K., editors, Advances in Neural Information Processing Systems 15, pages 585592. MIT Press, 2003.Google Scholar
Sobih, Ahmed, Tomescu, Alexandru I., and Mäkinen, Veli. Metaflow: Metagenomic profiling based on whole-genome coverage analysis with min-cost flows. In Singh, Mona, editor, Research in Computational Molecular Biology – 20th Annual Conference, RECOMB 2016, Santa Monica, CA, USA, April 17–21, 2016, Proceedings, volume 9649 of Lecture Notes in Computer Science, pages 111121. Springer, 2016.Google Scholar
Song, Kai, Ren, Jie, Reinert, Gesine, Deng, Minghua, Waterman, Michael S., and Sun, Fengzhu. New developments of alignment-free sequence comparison: Measures, statistics and next-generation sequencing. Briefings in Bioinformatics, page bbt067, 2013a.Google Scholar
Song, Kai, Ren, Jie, Zhai, Zhiyuan, Liu, Xuemei, Deng, Minghua, and Sun, Fengzhu. Alignment-free sequence comparison based on next-generation sequencing reads. Journal of Computational Biology, 20(2):6479, 2013b.CrossRefGoogle ScholarPubMed
Song, Li and Florea, Liliana. CLASS: Constrained transcript assembly of RNA-seq reads. BMC Bioinformatics, 14(S-5):S14, 2013. Proceedings of RECOMB-seq: Third Annual Recomb Satellite Workshop on Massively Parallel Sequencing Beijing, China. 11–12 April 2013.Google Scholar
Spang, Rainer, Rehmsmeier, Marc, and Stoye, Jens. A novel approach to remote homology detection: Jumping alignments. Journal of Computational Biology, 9(5):747760, 2002.CrossRefGoogle ScholarPubMed
Su, Chien-Hao, Wang, Tse-Yi, Hsu, Ming-Tsung, et al. The impact of normalization and phylogenetic information on estimating the distance for metagenomes. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 9(2):619628, 2012.CrossRefGoogle ScholarPubMed
Su, Xiaoquan, Wang, Xuetao, Xu, Jian, and Ning, Kang. GPU-Meta-Storms: Computing the similarities among massive microbial communities using GPU. In Systems Biology (ISB), 2013 7th International Conference on, pages 6974. IEEE, 2013.Google Scholar
Tanaseichuk, Olga, Borneman, James, and Jiang, Tao. Separating metagenomic short reads into genomes via clustering. In WABI, pages 298313, 2011.Google Scholar
Tardos, É. A strongly polynomial algorithm to solve combinatorial linear programs. Combinatorica, 5:247255, 1985.CrossRefGoogle Scholar
Tavakoli, Neda, Gibney, Daniel, and Aluru, Srinivas. Haplotype-aware variant selection for genome graphs. In Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pages 19, 2022.Google Scholar
Tempel, Sebastien, Rousseau, Christine, Tahi, Fariza, and Nicolas, Jacques. ModuleOrganizer: Detecting modules in families of transposable elements. BMC Bioinformatics, 11:474, 2010.CrossRefGoogle ScholarPubMed
Teo, Choon Hui and Vishwanathan, SVN. Fast and space efficient string kernels using suffix arrays. In Proceedings of the 23rd International Conference on Machine Learning, pages 929936. ACM, 2006.CrossRefGoogle Scholar
Tettelin, Hervé, Masignani, Vega, Cieslewicz, Michael J., et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”. Proceedings of the National Academy of Sciences, 102(39):1395013955, 2005.CrossRefGoogle ScholarPubMed
Tomescu, Alexandru I., Kuosmanen, Anna, Rizzi, Romeo, and Mäkinen, Veli. A novel combinatorial method for estimating transcript expression with RNA-Seq: Bounding the number of paths. In Darling, Aaron E. and Stoye, Jens, editors, Proceedings of the 13th International Workshop on Algorithms in Bioinformatics (WABI), LNCS 8126, pages 8598, Springer, 2013a.CrossRefGoogle Scholar
Tomescu, Alexandru I., Kuosmanen, Anna, Rizzi, Romeo, and Mäkinen, Veli. A novel min-cost flow method for estimating transcript expression with RNA-Seq. BMC Bioinformatics, 14 (S-5):S15, 2013b. Proceedings of RECOMB-seq 2013.CrossRefGoogle ScholarPubMed
Trapnell, Cole, Pachter, Lior, and Salzberg, Steven L.. TopHat: Discovering splice junctions with RNA-seq. Bioinformatics, 25(9):11051111, 2009.CrossRefGoogle ScholarPubMed
Trapnell, Cole, Williams, Brian A., Pertea, Geo, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 28(5):511515, 2010. doi: 10.1038/nbt.1621.Google Scholar
Turner, Isaac, Garimella, Kiran V., Iqbal, Zamin, and McVean, Gil. Integrating long-range connectivity information into de Bruijn graphs. Bioinformatics, 34(15):25562565, 2018.CrossRefGoogle ScholarPubMed
Tutte, William T. A short proof of the factor theorem for finite graphs. Canadian Journal of Mathematics, 6(1954):347352, 1954.CrossRefGoogle Scholar
Ukkonen, Esko. Algorithms for approximate string matching. Information and Control, 64(1–3):100118, 1985.CrossRefGoogle Scholar
Ukkonen, Esko. On-line construction of suffix trees. Algorithmica, 14(3):249260, 1995.Google Scholar
Ulitsky, Igor, Burstein, David, Tuller, Tamir, and Chor, Benny. The average common substring approach to phylogenomic reconstruction. Journal of Computational Biology, 13(2):336350, 2006.Google Scholar
Uricaru, Raluca, Rizk, Guillaume, Lacroix, Vincent, et al. Reference-free detection of isolated SNPs. Nucleic Acids Research, 43(2):e11e11, 2014.Google Scholar
Valenzuela, Daniel, Norri, Tuukka, Välimäki, Niko, Pitkänen, Esa, and Mäkinen, Veli. Towards pan-genome read alignment to improve variation calling. BMC Genomics, 19: Article number 87, 2018.Google Scholar
Välimäki, Niko. Least random suffix/prefix matches in output-sensitive time. In 23rd Annual Symposium on Combinatorial Pattern Matching (CPM 2012), volume 7354 of Lecture Notes in Computer Science, pages 269279. Springer, 2012.Google Scholar
Välimäki, Niko and Puglisi, Simon J.. Distributed string mining for high-throughput sequencing data. In 12th International Workshop on Algorithms in Bioinformatics (WABI 2012), volume 7534 of Lecture Notes in Computer Science, pages 441452. Springer, 2012.Google Scholar
Välimäki, Niko and Rivals, Eric. Scalable and versatile k-mer indexing for high-throughput sequencing data. In Bioinformatics Research and Applications, pages 237248. Springer, 2013.Google Scholar
Välimäki, Niko, Ladra, Susana, and Mäkinen, Veli. Approximate all-pairs suffix/prefix overlaps. Information and Computation, 213:4958, 2012.Google Scholar
van Emde Boas, Peter. Preserving order in a forest in less than logarithmic time and linear space. Information Processing Letters, 6(3):8082, 1977.Google Scholar
Vatinlen, Benedicte, Chauvet, Fabrice, Chrétienne, Philippe, and Mahey, Philippe. Simple bounds and greedy algorithms for decomposing a flow into a minimal set of paths. European Journal of Operational Research, 185(3):13901401, 2008.Google Scholar
Vazirani, Vijay V. Approximation Algorithms. Springer-Verlag. 2001.Google Scholar
Vinga, Susana and Almeida, Jonas. Alignment-free sequence comparison – A review. Bioinformatics, 19(4):513523, 2003.CrossRefGoogle ScholarPubMed
Viterbi, Andrew. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2):260269, 1967.Google Scholar
Volfovsky, Natalia, Haas, Brian, and Salzberg, Steven. A clustering method for repeat analysis in DNA sequences. Genome Biology, 2:RESEARCH0027, 02 2001.CrossRefGoogle ScholarPubMed
Wagner, Robert A., and Fischer, Michael J.. The string-to-string correction problem. Journal of the Association for Computing Machinery, 21:168173, 1974.CrossRefGoogle Scholar
Wan, Lin, Reinert, Gesine, Sun, Fengzhu, and Waterman, Michael S.. Alignment-free sequence comparison (ii): Theoretical power of comparison statistics. Journal of Computational Biology, 17(11):14671490, 2010.CrossRefGoogle ScholarPubMed
Wandelt, Sebastian, Starlinger, Johannes, Bux, Marc, and Leser, Ulf. RCSI: Scalable similarity search in thousand(s) of genomes. PVLDB, 6(13):15341545, 2013.Google Scholar
Wang, Yi, Leung, Henry C. M., Yiu, Siu-Ming, and Chin, Francis Y. L.. MetaCluster 4.0: A novel binning algorithm for NGS reads and huge number of species. Journal of Computational Biology, 19(2):241249, 2012a.CrossRefGoogle ScholarPubMed
Wang, Yi, Leung, Henry C. M., Yiu, Siu-Ming, and Chin, Francis Y. L.. MetaCluster 5.0: A two-round binning approach for metagenomic data for low-abundance species in a noisy sample. Bioinformatics, 28(18):i356i362, 2012b.Google Scholar
Wang, Yi, Leung, Henry C. M., Yiu, Siu M., and Chin, Francis Y. L.. MetaCluster-TA: taxonomic annotation for metagenomic data based on assembly-assisted binning. BMC Genomics, 15 (Suppl 1):S12, 2014a.Google Scholar
Wang, Ying, Liu, Lin, Chen, Lina, Chen, Ting, and Sun, Fengzhu. Comparison of metatranscriptomic samples based on k-tuple frequencies. PLoS One, 9(1):e84348, 2014b.CrossRefGoogle ScholarPubMed
Weiner, P. Linear pattern matching algorithm. In Proceedings of the 14th Annual IEEE Symposium on Switching and Automata Theory, pages 111, 1973.Google Scholar
Weintraub, Andres. A primal algorithm to solve network flow problems with convex costs. Management Science, 21(1):8797, 1974.CrossRefGoogle Scholar
West, Douglas B. Introduction to Graph Theory, 2nd edition. Prentice Hall, 2000.Google Scholar
Williams, Lucia, Tomescu, Alexandru Ioan, and Mumey, Brendan. Flow decomposition with subpath constraints. IEEE/ACM Transactions on Computational Biology and Bioinformatics, pages 11, 2022.CrossRefGoogle Scholar
Willner, Dana, Thurber, Rebecca Vega, and Rohwer, Forest. Metagenomic signatures of 86 microbial and viral metagenomes. Environmental Microbiology, 11(7):17521766, 2009.CrossRefGoogle ScholarPubMed
WolframAlpha. www.wolframalpha.com, 2022. Last accessed 3 May 2022.Google Scholar
Yang, Xiao, Chockalingam, Sriram P., and Aluru, Srinivas. A survey of error-correction methods for next-generation sequencing. Briefings in Bioinformatics, 14(1):5666, 2012.Google Scholar
Zerbino, Daniel and Birney, Ewan. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Research, 18:821, 2008.CrossRefGoogle Scholar
Ziv, Jacob. On finite memory universal data compression and classification of individual sequences. IEEE Transactions on Information Theory, 54(4):16261636, 2008.Google Scholar
Ziv, Jacob and Lempel, Abraham. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3):337343, 1977.CrossRefGoogle Scholar

Save book to Kindle

To save this book to your Kindle, first ensure [email protected] is added to your Approved Personal Document E-mail List under your Personal Document Settings on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part of your Kindle email address below. Find out more about saving to your Kindle.

Note you can select to save to either the @free.kindle.com or @kindle.com variations. ‘@free.kindle.com’ emails are free but can only be saved to your device when it is connected to wi-fi. ‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.

Find out more about the Kindle Personal Document Service.

  • References
  • Veli Mäkinen, University of Helsinki, Djamal Belazzougui, Centre de Recherche sur l’Information Scientifique et Technique (CERIST), Algiers, Fabio Cunial, Broad Institute, Massachusetts, Alexandru I. Tomescu, University of Helsinki
  • Book: Genome-Scale Algorithm Design
  • Online publication: 28 September 2023
  • Chapter DOI: https://doi.org/10.1017/9781009341257.024
Available formats
×

Save book to Dropbox

To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Dropbox.

  • References
  • Veli Mäkinen, University of Helsinki, Djamal Belazzougui, Centre de Recherche sur l’Information Scientifique et Technique (CERIST), Algiers, Fabio Cunial, Broad Institute, Massachusetts, Alexandru I. Tomescu, University of Helsinki
  • Book: Genome-Scale Algorithm Design
  • Online publication: 28 September 2023
  • Chapter DOI: https://doi.org/10.1017/9781009341257.024
Available formats
×

Save book to Google Drive

To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Google Drive.

  • References
  • Veli Mäkinen, University of Helsinki, Djamal Belazzougui, Centre de Recherche sur l’Information Scientifique et Technique (CERIST), Algiers, Fabio Cunial, Broad Institute, Massachusetts, Alexandru I. Tomescu, University of Helsinki
  • Book: Genome-Scale Algorithm Design
  • Online publication: 28 September 2023
  • Chapter DOI: https://doi.org/10.1017/9781009341257.024
Available formats
×