Sequence databases and database searching

doi:10.1017/CBO9780511819049.004

2 - Sequence databases and database searching

from Section II - Data preparation

Published online by Cambridge University Press: 05 June 2012

Guy Bottu, Marc Van Ranst and

Edited by Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme

Philippe Lemey, Affiliation: University of Oxford
Marco Salemi, Affiliation: University of California, Irvine
Anne-Mieke Vandamme, Affiliation: Katholieke Universiteit Leuven, Belgium


Summary

THEORY

Introduction

Phylogenetic analyses are often based on sequence data accumulated by many investigators. With the number of available sequences increasing rapidly, it is no longer possible to rely on the printed literature, and scientists have turned to digital databases. Databases are essential in current bioinformatic research: they store information and make it retrievable, and modern databases offer powerful query tools and are cross-referenced to other databases. In addition to the sequences themselves, databases contain a considerable amount of accompanying information, the so-called annotation: for example, the organism and cell type from which a sequence was obtained, how it was sequenced, and which properties are already known. In this chapter, we provide an overview of the most important publicly available sequence databases and explain how to search them. A list of the database URLs discussed in this section is provided in Box 2.1.

There are three basic strategies for searching sequence databases.

– To easily retrieve a known sequence, you can rely on unique sequence identifiers.
– To collect a comprehensive set of sequences that share a taxonomic origin or a known property, the annotation can be searched by keyword.
– To find the most complete set of homologous sequences, a selected query sequence can be searched by similarity against a sequence database using tools like BLAST or FASTA.
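
The three strategies above can be sketched as requests to public NCBI web services. This is a minimal illustration, not a method taken from the chapter: the choice of the NCBI E-utilities and the BLAST URL API, the function names, the default database names and the `[Organism]`/`[Title]` field tags are all assumptions made for the example. The code only builds the requests and does not send them.

```python
from urllib.parse import urlencode

# Assumed endpoints: NCBI E-utilities and the NCBI BLAST URL API.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def fetch_by_identifier_url(accession, db="nucleotide"):
    """Strategy 1: retrieve one known record by its unique identifier."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def keyword_search_url(organism, keyword, db="nucleotide", retmax=20):
    """Strategy 2: search the annotation by taxonomic origin and keyword."""
    # The field tags restrict each word to one annotation field (assumed choice).
    term = f"{organism}[Organism] AND {keyword}[Title]"
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def similarity_search_request(query_sequence, program="blastn", database="nt"):
    """Strategy 3: build the form data that submits a query sequence to BLAST."""
    params = {
        "CMD": "Put",            # submit a new search
        "PROGRAM": program,
        "DATABASE": database,
        "QUERY": query_sequence,
    }
    return BLAST_URL, urlencode(params)


if __name__ == "__main__":
    # Example inputs chosen for illustration only.
    print(fetch_by_identifier_url("NC_001802"))
    print(keyword_search_url("HIV-1", "pol"))
    endpoint, form_data = similarity_search_request("ACGTACGTACGT")
    print(endpoint, form_data)
```

The first two URLs can be opened directly in a browser or fetched with any HTTP client. A BLAST submission is asynchronous: the service answers with a request identifier, and the results have to be retrieved in a later request, which is left out of this sketch.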

Type: Chapter
Information: The Phylogenetic Handbook: A Practical Approach to Phylogenetic Analysis and Hypothesis Testing, pp. 33-67

DOI: https://doi.org/10.1017/CBO9780511819049.004

Publisher: Cambridge University Press

Print publication year: 2009
