The classification of protein structures based on the sequential
and structural similarity,
and the database of representative protein chains (PDB-REPRDB)
Tamotsu Noguchi, Yutaka Akiyama, Kentaro Onizuka and Makoto Ando
The Protein Data Bank (PDB) is a rich library of
atomic-coordinate data of biological macromolecules.
The number of PDB entries has been increasing rapidly owing to improvements in X-ray
crystallography and NMR experimental techniques, and
currently exceeds 7,500 (3.4 Gbytes),
though not all entries are
suitable for computational protein structure analysis.
Many entries have insufficiently refined coordinate data, or have
one or more closely similar entries in terms of structure or sequence.
Thus the need for a procedure to classify protein structures has
become quite obvious.
We have proposed a representative chain database, PDB-REPRDB, whose selection
strategy is based
on sequence and structural similarity.
In this paper, we present the representative chain database PDB-REPRDB, and
we report the MPI parallelization of our automatic
construction system for PDB-REPRDB.
%Performance evaluation on three parallel computers is also reported.
A representative set can now be calculated within 1.5 hours
rather than 1 week, a 110-fold speed-up achieved in this study.
We have opened a WWW service for PDB-REPRDB, which has been accessed
more than 2,100 times.
Real World Computing Partnership