Book contents
- Frontmatter
- Contents
- List of contributors
- Foreword
- Preface
- Section I Introduction
- Section II Data preparation
- 2 Sequence databases and database searching
- 3 Multiple sequence alignment
- Section III Phylogenetic inference
- Section IV Testing models and trees
- Section V Molecular adaptation
- Section VI Recombination
- Section VII Population genetics
- Section VIII Additional topics
- Glossary
- References
- Index
3 - Multiple sequence alignment
from Section II - Data preparation
Published online by Cambridge University Press: 05 June 2012
Summary
THEORY
Introduction
From a biological perspective, a sequence alignment is a hypothesis about the homology of multiple residues in protein or nucleotide sequences. Aligned residues are therefore assumed to have diverged from a common ancestral state. An example of a multiple sequence alignment is shown in Fig. 3.1. This is a set of amino acid sequences of globins that have been aligned so that homologous residues are arranged in columns “as much as possible.” The sequences are of different lengths, implying that gaps (shown as hyphens in the figure) must be used in some positions to achieve the alignment. A gap represents a deletion in the sequence that contains it, an insertion in the sequences that do not have a gap, or a combination of insertions and deletions. The generation of alignments, either manually or using an automatic computer program, is one of the most common tasks in computational sequence analysis because alignments are required for many other analyses, such as predicting structure or demonstrating sequence similarity within a family of sequences. Of course, one of the most common reasons for generating alignments is that they are an essential prerequisite for phylogenetic analyses. Rates or patterns of change in sequences cannot be analysed unless the sequences can be aligned.
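The idea of columns and gaps can be made concrete with a minimal Python sketch. It assumes an alignment is stored as a list of equal-length strings in which a hyphen marks a gap. The toy sequences and the helper names `alignment_columns` and `ungapped` are invented for illustration; they are not the globin sequences of Fig. 3.1 and do not come from any alignment program.

```python
def alignment_columns(aligned):
    """Return the columns of an alignment given as equal-length gapped strings."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        # Rows of an alignment must be padded with gaps to a common length.
        raise ValueError("aligned sequences must all have the same length")
    # zip(*aligned) walks the alignment column by column.
    return ["".join(column) for column in zip(*aligned)]


def ungapped(seq, gap="-"):
    """Recover the original (unaligned) sequence by removing gap characters."""
    return seq.replace(gap, "")


# Hypothetical toy alignment: three sequences of different original lengths.
toy = [
    "HEAGAWGHEE",
    "--P-AWHEAE",
    "HEAGAW-HEE",
]

cols = alignment_columns(toy)
# Columns containing at least one gap imply an insertion or deletion event.
gapped_cols = [i for i, column in enumerate(cols) if "-" in column]
original_lengths = [len(ungapped(seq)) for seq in toy]
```

Each entry of `cols` is one hypothesis of homology: the residues in that column are assumed to descend from the same ancestral position. The columns listed in `gapped_cols` are those where at least one sequence is inferred to have gained or lost a residue.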
The problem of repeats
It can be difficult to find the optimal alignment for several reasons. First, there may be repeats in one or all of the members of the sequence family; this problem is shown in the simple diagram in Fig. 3.2.
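A small sketch can show why repeats cause trouble. It assumes a deliberately simplified scoring scheme: a short sequence is slid along a longer one without internal gaps, and the number of identical positions is counted at each offset. The sequences and the function name `best_offsets` are hypothetical, and this is not the algorithm used by any particular alignment program, nor a reproduction of Fig. 3.2.

```python
def best_offsets(short, long_seq):
    """Slide `short` along `long_seq` (no internal gaps) and return
    (best identity count, list of offsets that achieve it)."""
    if len(short) > len(long_seq):
        raise ValueError("`short` must not be longer than `long_seq`")
    scores = {}
    for offset in range(len(long_seq) - len(short) + 1):
        window = long_seq[offset:offset + len(short)]
        # Score = number of positions where the residues are identical.
        scores[offset] = sum(a == b for a, b in zip(short, window))
    best = max(scores.values())
    return best, [offset for offset, score in scores.items() if score == best]


# Hypothetical data: one copy of a domain versus a sequence holding two copies.
domain = "GATTACA"
with_repeat = "CC" + domain + "TT" + domain + "GG"

best, offsets = best_offsets(domain, with_repeat)
```

Here the single-copy sequence matches both copies of the repeat equally well, so two placements tie for the best score and the score alone cannot say which is the homologous one.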
- Type: Chapter
- Information: The Phylogenetic Handbook: A Practical Approach to Phylogenetic Analysis and Hypothesis Testing, pp. 68-108. Publisher: Cambridge University Press. Print publication year: 2009.