GenThreader, a method offered by the PSI-PRED server, is an efficient and
reliable protein fold recognition method for genomic sequences.
The method uses a traditional sequence alignment algorithm to generate
alignments which are then evaluated by a method derived from threading
techniques. As a final step, each threaded model is evaluated by a neural
network in order to produce a single measure of confidence in the proposed
prediction. The speed of the method, along with its sensitivity and very
low false-positive rate, makes it ideal for automatically predicting the
structure of all the proteins in a translated bacterial genome (proteome).
The method has been applied to the genome of Mycoplasma genitalium, and
analysis of the results shows that as many as 46% of the proteins derived from
the predicted protein coding regions have a significant relationship to a
protein of known structure. In some cases, however, only one domain of the
protein can be predicted, giving a total coverage of 30% when calculated as a
fraction of the number of amino acid residues in the whole proteome.
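
The difference between the two coverage figures (fraction of proteins versus fraction of residues) can be illustrated with a minimal sketch. The `Protein` record, the `coverage` helper, and the toy numbers below are hypothetical and are not taken from the Mycoplasma genitalium analysis; they only show how a protein with a single matched domain counts fully toward the first figure but only partly toward the second.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    name: str
    length: int            # total number of residues
    matched_residues: int  # residues covered by a significant match (0 if none)


def coverage(proteins):
    """Return (fraction of proteins with a match, fraction of residues covered)."""
    n_hit = sum(1 for p in proteins if p.matched_residues > 0)
    total_residues = sum(p.length for p in proteins)
    covered_residues = sum(p.matched_residues for p in proteins)
    return n_hit / len(proteins), covered_residues / total_residues


# Toy proteome (made-up values): protein B has only one domain matched.
toy = [
    Protein("A", 200, 200),
    Protein("B", 400, 100),
    Protein("C", 300, 0),
    Protein("D", 100, 0),
]
by_protein, by_residue = coverage(toy)
print(f"by protein: {by_protein:.0%}, by residue: {by_residue:.0%}")
```

In this toy case half of the proteins have a match, yet less than a third of the residues are covered, which is the same kind of gap as between the 46% and 30% figures above.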
GenThreader can be applied either to complete translated genomic sequences or to individual protein sequences. It aims to detect superfamily relationships, such as fold similarities that result from common ancestry. It does this by generating sequence-structure alignments using a conventional sequence profile-based approach. Each alignment is then evaluated with pairwise potentials (estimated from statistics of atomic proximity or colocalisation) and solvation potentials (estimated from statistics of residue burial). These energies, together with various alignment parameters (such as length and score), are used to train a neural net to distinguish related from unrelated structures.
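
The final scoring step described above can be pictured as a small feed-forward network that maps alignment and energy terms to a single confidence value. The following is only a sketch under stated assumptions: the network size, the feature scaling, and every weight are invented for illustration and are not GenThreader's trained parameters.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def confidence(features, hidden_weights, hidden_biases, out_weights, out_bias):
    """One hidden layer followed by a single sigmoid output in (0, 1)."""
    hidden = [
        sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)


# Assumed feature order (all pre-scaled, which is itself an assumption):
# [alignment score, alignment length, pairwise energy, solvation energy]
# Lower (more negative) energies indicate a better fit.
HIDDEN_WEIGHTS = [[1.0, 0.5, -1.0, -0.5],   # made-up weights
                  [0.5, 0.2, -0.8, -1.0]]
HIDDEN_BIASES = [0.0, 0.0]
OUT_WEIGHTS = [2.0, 2.0]
OUT_BIAS = -2.0

good_match = [1.0, 0.8, -1.0, -0.5]  # high score, favourable energies
poor_match = [0.1, 0.2, 0.3, 0.2]    # low score, unfavourable energies

good = confidence(good_match, HIDDEN_WEIGHTS, HIDDEN_BIASES, OUT_WEIGHTS, OUT_BIAS)
poor = confidence(poor_match, HIDDEN_WEIGHTS, HIDDEN_BIASES, OUT_WEIGHTS, OUT_BIAS)
print(f"good match: {good:.2f}, poor match: {poor:.2f}")
```

With these illustrative weights, the match with the higher alignment score and lower energies receives the higher confidence. In the real method the weights would come from training on known related and unrelated structure pairs.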
GenThreader output is provided, together with a code indicating which domains have been matched by the query. Pairwise sequence alignments, annotated with secondary structural elements of the matched structures, provide a visual representation of the quality of each match. These can be used in concert with the qualitative confidence levels, assigned as HIGH, MEDIUM, or LOW, to help ascertain which match is best.
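
As a rough illustration of how the confidence levels might be used to shortlist matches, the sketch below orders hits by level and breaks ties by E-value. The record layout, the fold names, and the tie-breaking rule are assumptions made for this example and do not reflect the server's actual output format; the alignments should still be inspected by eye.

```python
# Lower rank value sorts first.
LEVEL_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def rank_hits(hits):
    """Sort hits by confidence level, then by ascending E-value."""
    return sorted(hits, key=lambda h: (LEVEL_ORDER[h["confidence"]], h["e_value"]))


# Hypothetical hits with made-up values.
hits = [
    {"fold": "foldA", "confidence": "MEDIUM", "e_value": 0.02},
    {"fold": "foldB", "confidence": "HIGH", "e_value": 0.001},
    {"fold": "foldC", "confidence": "HIGH", "e_value": 0.0001},
    {"fold": "foldD", "confidence": "LOW", "e_value": 0.5},
]

ranked = rank_hits(hits)
best = ranked[0]
print(best["fold"], best["confidence"])
```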

Summary table illustrating the identified folds, match statistics (pairwise and
solvation energies, E-value, etc.), and a rank-ordered, shaded estimate of
confidence based on those statistics.

The pairwise alignments