Distant homologies between proteins are often discovered
only after the three-dimensional structures of both proteins
are solved. The sequence divergence for such proteins can
be so large that a simple comparison of their sequences fails
to identify any similarity. A new generation of sensitive
alignment tools uses averaged sequences of entire homologous
families (profiles) to detect such homologies. Several
algorithms, including the newest generation of BLAST algorithms
and BASIC, an algorithm used in our group to assign fold
predictions for proteins from several genomes, are compared
to each other on a large set of structurally similar
proteins with little sequence similarity. Proteins in the
benchmark are classified according to the level of their
similarity, which allows us to demonstrate that most of
the improvement of the new algorithms is achieved for proteins
with strong functional similarities, with almost no progress
in recognizing distant fold similarities.
It is also shown that details of profile calculation strongly
influence its sensitivity in recognizing distant homologies.
The most important choice is how to include information
from divergent members of the family so that the full sequence
divergence within the family is accounted for without
generating false predictions. PSI-BLAST takes a conservative
approach, deriving a profile from core members of the family,
providing a solid improvement with almost no false
predictions. BASIC strives for better sensitivity by increasing
the weight of divergent family members and paying the price
in lower reliability. FFAS, a new algorithm introduced here,
uses a new procedure for profile generation that takes
into account all the relations within the family and matches
the sensitivity of BASIC with PSI-BLAST-like reliability.
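
The trade-off in how much weight divergent family members receive can be
illustrated with a minimal sketch. It is not the profile procedure of
PSI-BLAST, BASIC, or FFAS, whose details are not given here; it assumes a
toy ungapped alignment and swaps in generic position-based (Henikoff-style)
sequence weights. All names in it (position_weights, build_profile, the
example sequences) are hypothetical.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def position_weights(alignment):
    """Position-based (Henikoff-style) sequence weights, normalized to sum to 1.

    In each column a sequence receives 1 / (k * n), where k is the number of
    distinct residues in the column and n is how many sequences share its
    residue. Sequences carrying rare residues end up with larger weights.
    Assumes an ungapped alignment of equal-length sequences.
    """
    weights = [0.0] * len(alignment)
    for col in range(len(alignment[0])):
        column = [seq[col] for seq in alignment]
        counts = Counter(column)
        k = len(counts)
        for i, residue in enumerate(column):
            weights[i] += 1.0 / (k * counts[residue])
    total = sum(weights)
    return [w / total for w in weights]


def build_profile(alignment, weights):
    """Return one dict per column mapping residue -> weighted frequency."""
    profile = []
    for col in range(len(alignment[0])):
        freqs = dict.fromkeys(ALPHABET, 0.0)
        for seq, w in zip(alignment, weights):
            freqs[seq[col]] += w
        profile.append(freqs)
    return profile


# Toy family (made-up sequences): three near-identical core members
# and one divergent member.
alignment = ["ACDE", "ACDE", "ACDE", "GCHK"]

uniform = [1.0 / len(alignment)] * len(alignment)
weights = position_weights(alignment)

uniform_profile = build_profile(alignment, uniform)
profile = build_profile(alignment, weights)

# Compare how strongly the divergent member's residue G shows up in column 0.
print("G frequency, uniform weights: ", round(uniform_profile[0]["G"], 4))
print("G frequency, position weights:", round(profile[0]["G"], 4))
```

With uniform weights the three core members dominate every column; the
position-based weights shift the profile toward the divergent member. This
is the kind of choice that, per the text above, trades sensitivity against
reliability.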