Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgments
- 1 The Central Dogma
- 2 RNA Secondary Structure
- 3 Comparing DNA Sequences
- 4 Predicting Species: Statistical Models
- 5 Substitution Matrices for Amino Acids
- 6 Sequence Databases
- 7 Local Alignment and the BLAST Heuristic
- 8 Statistics of BLAST Database Searches
- 9 Multiple Sequence Alignment I
- 10 Multiple Sequence Alignment II
- 11 Phylogeny Reconstruction
- 12 Protein Motifs and PROSITE
- 13 Fragment Assembly
- 14 Coding Sequence Prediction with Dicodons
- 15 Satellite Identification
- 16 Restriction Mapping
- 17 Rearranging Genomes: Gates and Hurdles
- A Drawing RNA Cloverleaves
- B Space-Saving Strategies for Alignment
- C A Data Structure for Disjoint Sets
- D Suggestions for Further Reading
- Bibliography
- Index
C - A Data Structure for Disjoint Sets
Published online by Cambridge University Press: 05 June 2012
Summary
This appendix describes a simple yet very efficient Perl solution to a problem known variously as the disjoint sets problem, the dynamic equivalence relation problem, or the union-find problem. This problem appears in applications with the following scenario.
Each key in a finite set is assigned to exactly one of a number of classes. These classes are the “disjoint sets”, that is, the equivalence classes of an equivalence relation on the keys. Often the set of keys is known in advance, but our Perl package does not require this.
Initially, each key is in a class by itself.
As the application progresses, classes are joined together to form larger classes; classes are never divided into smaller classes. (The operation of joining classes together is called union or merge.)
At any moment, it must be possible to determine whether two keys are in the same class or in different classes.
To solve this problem, we create a package named UnionFind. Objects of type UnionFind represent an entire collection of keys and classes. The three methods of this package are:
$uf = UnionFind->new(), which creates a new collection of disjoint sets, each of which has only one element;
$uf->inSameSet($key1,$key2), which returns true if its two arguments are elements of the same disjoint set or false if not;
$uf->union($key1,$key2), which combines the sets to which its two arguments belong into a single set.
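The three methods above can be illustrated with a minimal sketch. It is not the appendix's own code: it assumes a common approach (a parent-pointer forest with union by rank and path compression), assumes keys are compared as strings, and introduces a helper method, find, that is not among the three methods named in the summary.

```perl
package UnionFind;
use strict;
use warnings;

# Create an empty collection. Keys are registered lazily on first use
# (an assumption of this sketch), so they need not be known in advance.
sub new {
    my ($class) = @_;
    my $self = { parent => {}, rank => {} };
    return bless $self, $class;
}

# Hypothetical helper, not one of the three documented methods:
# return the representative key of the set containing $key.
sub find {
    my ($self, $key) = @_;
    my $parent = $self->{parent};
    if (!exists $parent->{$key}) {
        # A key seen for the first time starts in a class by itself.
        $parent->{$key} = $key;
        $self->{rank}{$key} = 0;
        return $key;
    }
    # Walk up to the root of the tree.
    my $root = $key;
    $root = $parent->{$root} while $parent->{$root} ne $root;
    # Path compression: point every key on the path directly at the root.
    while ($parent->{$key} ne $root) {
        my $next = $parent->{$key};
        $parent->{$key} = $root;
        $key = $next;
    }
    return $root;
}

# True if both keys currently belong to the same disjoint set.
sub inSameSet {
    my ($self, $key1, $key2) = @_;
    return $self->find($key1) eq $self->find($key2);
}

# Merge the sets containing the two keys (no effect if already merged).
sub union {
    my ($self, $key1, $key2) = @_;
    my ($root1, $root2) = ($self->find($key1), $self->find($key2));
    return if $root1 eq $root2;
    my $rank = $self->{rank};
    # Union by rank: attach the shallower tree beneath the deeper one.
    ($root1, $root2) = ($root2, $root1) if $rank->{$root1} < $rank->{$root2};
    $self->{parent}{$root2} = $root1;
    $rank->{$root1}++ if $rank->{$root1} == $rank->{$root2};
    return;
}

package main;

my $uf = UnionFind->new();
$uf->union("a", "b");
$uf->union("c", "d");
print $uf->inSameSet("a", "b") ? "same\n" : "different\n";   # same
print $uf->inSameSet("a", "c") ? "same\n" : "different\n";   # different
$uf->union("b", "c");
print $uf->inSameSet("a", "d") ? "same\n" : "different\n";   # same
```

Because classes are only ever joined and never split, each class can be represented by a single root key, and a membership query reduces to comparing two roots.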
Type: Chapter
Genomic Perl: From Bioinformatics Basics to Working Code, pp. 313-317
Publisher: Cambridge University Press
Print publication year: 2002