Abstract
Joint work with S. R. McAllister.
The protein folding question has emerged over the past four decades as one of the most challenging and potentially rewarding problems in computational biology. Three general classes of algorithms have emerged, based on the techniques of comparative modeling, fold recognition, and first-principles methods. For a detailed summary of protein structure prediction methods, the reader is directed to two recent reviews [1,2]. Within the field of protein structure prediction, the packing of alpha-helices has been one of the more difficult problems. Distance constraints and topology predictions can substantially reduce the conformational space that deterministic algorithms must search to find a protein structure of minimum conformational energy.
We present a novel first-principles framework to predict the structure of alpha-helical proteins. Given the location of the alpha-helical regions, a mixed-integer linear optimization model maximizes the interhelical residue contact probabilities to generate distance restraints between alpha-helices [3]. Two levels of this formulation allow the prediction of both "primary" contacts between a helical pair and "wheel" contacts, one helical turn beyond the primary contacts. These predictions are subject to a number of mathematical constraints that disallow sets of contacts that cannot be achieved by a folded protein. The interhelical contact prediction for alpha-helical proteins was evaluated on 26 proteins, where it identified an average contact distance below 11.0 Angstroms for the entire set.
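The idea of maximizing contact probabilities subject to feasibility constraints can be illustrated with a minimal Python sketch. It is not the formulation of [3]: the helix names, residue numbers, and probabilities are hypothetical, the only constraints assumed are at most one primary contact per helix pair and at most one contact per residue, and the tiny binary program is solved by exhaustive enumeration rather than by a mixed-integer linear solver.

```python
from itertools import product

# Hypothetical data: helix pair -> {(residue_i, residue_j): contact probability}.
CONTACT_PROB = {
    ("H1", "H2"): {(5, 30): 0.40, (9, 26): 0.35},
    ("H1", "H3"): {(5, 50): 0.45, (12, 47): 0.20},
    ("H2", "H3"): {(26, 50): 0.30, (33, 54): 0.25},
}


def best_primary_contacts(contact_prob):
    """Pick at most one contact per helix pair, maximizing total probability.

    Assumed constraint (illustrative only): a residue may take part in at
    most one selected contact. All combinations are enumerated, which is
    practical only for very small instances.
    """
    pairs = list(contact_prob)
    # For each helix pair, the options are "no contact" or one candidate.
    options = [[None] + list(contact_prob[p].items()) for p in pairs]

    best_score, best_choice = 0.0, {}
    for combo in product(*options):
        used, score, choice = set(), 0.0, {}
        feasible = True
        for pair, pick in zip(pairs, combo):
            if pick is None:
                continue
            (i, j), prob = pick
            if i in used or j in used:
                feasible = False  # residue already committed to a contact
                break
            used.update((i, j))
            score += prob
            choice[pair] = (i, j)
        if feasible and score > best_score:
            best_score, best_choice = score, choice
    return best_score, best_choice


if __name__ == "__main__":
    score, choice = best_primary_contacts(CONTACT_PROB)
    print(score, choice)
```

In this toy instance the two highest-probability candidates share residue 5, so the per-residue constraint forces the selection away from the greedy choice. A realistic model would replace the enumeration with a mixed-integer linear solver and a much richer constraint set.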
A related optimization-based approach is proposed for the prediction of alpha-helical contacts in mixed alpha/beta proteins [4]. This contact prediction is based on the maximization of the number and hydrophobicity of hydrophobic interactions. The allowable sets of contacts are restricted based on knowledge or prediction of the beta-sheet topology and a number of distance geometry rules and constraints. The interhelical contact prediction for alpha/beta proteins was evaluated on 12 test proteins, where it identified an average contact distance below 11.0 Angstroms for 11 of these proteins.
The distance restraints from the interhelical contacts are then used to restrict the feasible space of the protein during the prediction of the tertiary structure using a hybrid optimization algorithm [5,6]. This tertiary structure prediction approach combines torsion angle dynamics and rotamer optimization with a deterministic global optimization technique (alphaBB) and a stochastic optimization technique (conformational space annealing) to minimize a detailed atomistic-level energy function. The tertiary structure prediction results are promising and are highlighted by the exciting, near-native blind prediction of a de novo designed 4-helix bundle protein.
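As a rough illustration of how interhelical distance restraints can restrict the feasible space, the following Python sketch checks a candidate conformation against lower and upper distance bounds. The coordinates, residue numbers, and the 4.0 to 11.0 Angstrom bounds are hypothetical, and the check is a simplification assumed for illustration; it is not the restraint handling of the hybrid algorithm in [5,6].

```python
import math


def restraint_violations(coords, restraints):
    """Return {(i, j): violation in Angstroms} for each violated restraint.

    coords: residue number -> (x, y, z) of a representative atom (assumed).
    restraints: iterable of (i, j, lower, upper) distance bounds.
    """
    violations = {}
    for i, j, lower, upper in restraints:
        d = math.dist(coords[i], coords[j])  # Euclidean distance (Python 3.8+)
        if d < lower:
            violations[(i, j)] = lower - d
        elif d > upper:
            violations[(i, j)] = d - upper
    return violations


if __name__ == "__main__":
    # Hypothetical coordinates in Angstroms.
    coords = {
        9: (0.0, 0.0, 0.0),
        26: (6.0, 8.0, 0.0),
        5: (0.0, 0.0, 6.0),
        50: (0.0, 12.0, 15.0),
    }
    restraints = [(9, 26, 4.0, 11.0), (5, 50, 4.0, 11.0)]
    print(restraint_violations(coords, restraints))
```

A conformation with an empty violation dictionary lies inside the restrained region; a search procedure could either reject violating conformations outright or penalize them in proportion to the reported excess.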
[1] Floudas CA, Fung HK, McAllister SR, Mönnigmann M, and Rajgaria R. Advances in Protein Structure Prediction and De Novo Protein Design: A Review. Chem Eng Sci. 2006;61:966-988.
[2] Floudas CA. Computational Methods in Protein Structure Prediction. Biotechnol Bioeng. 2007;97:207-213.
[3] McAllister SR, Mickus BE, Klepeis JL, and Floudas CA. A Novel Approach for Alpha-Helical Topology Prediction in Globular Proteins: Generation of Interhelical Restraints. Prot Struct Funct Bioinf. 2006;65:930-952.
[4] McAllister SR and Floudas CA. Alpha-helical Residue Contact Prediction in Mixed Alpha/Beta Proteins Using Mixed-Integer Linear Programming. In preparation, 2007.
[5] Klepeis JL and Floudas CA. ASTRO-FOLD: A Combinatorial and Global Optimization Framework for Ab Initio Prediction of Three-dimensional Structures of Proteins from the Amino Acid Sequence. Biophys J. 2003;85:2119-2146.
[6] McAllister SR and Floudas CA. An Improved Hybrid Global Optimization Method for Protein Tertiary Structure Prediction. In preparation, 2007.