Parallelization of the automatic determination system
for representative protein chains of the Protein Data Bank (PDB)
Tamotsu Noguchi(1), Yutaka Akiyama(1), Kentaro Onizuka(1),
Minoru Saito(1), Makoto Ando(1), Yoshihisa Shizawa(2)
The Protein Data Bank (PDB) is a rich library of
atomic-coordinate data of biological macromolecules.
The number of PDB entries has been increasing rapidly with improvements in X-ray
crystallography and NMR experimental techniques, and the
database currently holds more than 5,800 entries (2.4 Gbytes),
though not all entries are
suitable for computational protein structure analysis.
Many entries have insufficiently refined coordinate data.
We have therefore developed a database of representative chains, PDB-REPRDB, and
in this paper we report the MPI parallelization of our automatic
construction system for PDB-REPRDB.
Performance evaluation on three parallel computers is also reported.
With the 10-fold speed-up achieved in this study, a representative set can now
be calculated within 2 days rather than 2 weeks.
(1) Real World Computing Partnership
(2) Information and Mathematical Science Laboratory, Inc.