Computational strategies for protein structure determination using NMR spectroscopy

Bibliographic Details
Main Author: Cheung, M. S.
Published: University of Cambridge 2009
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.597587
Description
Summary: This thesis describes the development of a new computer program called DANGLE (Dihedral Angles from Global Likelihood Estimates), which predicts protein secondary structure and backbone φ and ψ dihedral angles solely from amino acid sequence information, experimental NMR chemical shift measurements and a database of known protein structures and their associated shifts. The approach uses Bayesian inferential logic to analyse the likelihood of conformations throughout Ramachandran space, taking explicit account of the population distributions expected for different amino acid residue types. The search algorithm used by DANGLE identifies the most probable backbone conformation of a query residue by analysing the distribution of dihedral angles in database fragments that have similar secondary chemical shifts and local amino acid sequences. Shift and sequence fragment matching against the database entries yields a scatter of (φ,ψ) predictions in Ramachandran space for the query residue. Bayesian statistics are used to compare this scatter with the scatter patterns obtained for known backbone conformations from database proteins, producing a “global likelihood estimate” diagram which reports on the degeneracy and precision of the predicted conformation. Simple filtering procedures can identify the most “predictable” residues, for which 92% of all φ and ψ predictions are accurate to within ±30° of high-resolution X-ray structures. In contrast to previous approaches, more than 80% of all φ and ψ predictions for glycine and pre-proline are reliable. Furthermore, DANGLE provides meaningful upper and lower bounds for the predictions, which are shown to represent the precision of the prediction. DANGLE is also able to assign 86% of the secondary structure correctly.
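The general idea of combining a scatter of fragment-match predictions with a residue-type prior over Ramachandran space can be illustrated with a minimal Python sketch. This is a simplified stand-in and not DANGLE's actual algorithm: the 10° grid, the pseudocount, the uniform prior and the function names (`cell`, `cell_centre`, `uniform_prior`, `posterior`) are all assumptions made for illustration, and the record does not describe how the program represents its likelihoods.

```python
from collections import Counter

BIN = 10                 # grid cell width in degrees (assumed, not from the thesis)
N_BINS = 360 // BIN      # number of cells along each of the phi and psi axes


def cell(phi, psi):
    """Map a (phi, psi) pair in degrees to a grid cell, wrapping at +/-180."""
    i = int((phi + 180.0) % 360.0 // BIN)
    j = int((psi + 180.0) % 360.0 // BIN)
    return i, j


def cell_centre(c):
    """Return the (phi, psi) angles at the centre of grid cell c."""
    i, j = c
    return (i * BIN - 180 + BIN / 2, j * BIN - 180 + BIN / 2)


def uniform_prior():
    """Hypothetical flat prior; a real one would depend on residue type."""
    n = N_BINS * N_BINS
    return {(i, j): 1.0 / n for i in range(N_BINS) for j in range(N_BINS)}


def posterior(matches, prior, pseudocount=0.1):
    """Posterior over grid cells: smoothed scatter counts times the prior."""
    counts = Counter(cell(phi, psi) for phi, psi in matches)
    total = len(matches) + pseudocount * N_BINS * N_BINS
    unnorm = {}
    for c, p in prior.items():
        likelihood = (counts.get(c, 0) + pseudocount) / total
        unnorm[c] = likelihood * p
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


if __name__ == "__main__":
    # Invented (phi, psi) matches: three helix-like points and one outlier.
    matches = [(-62, -41), (-65, -44), (-68, -47), (-120, 130)]
    post = posterior(matches, uniform_prior())
    best = max(post, key=post.get)
    print("most probable cell centre:", cell_centre(best))
```

In a sketch like this, the spread of posterior mass around the best cell gives a rough analogue of the upper and lower bounds mentioned in the summary, and a multi-modal posterior would flag a degenerate prediction.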
A set of dihedral angle constraints predicted by DANGLE is used in combination with conventional structure calculation protocols to determine the solution structure of the SAM domain of human SLY protein, with an average RMSD of 0.52 Å for all backbone atoms in the final ensemble.
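For readers unfamiliar with the ensemble RMSD figure quoted above, the following Python sketch shows one common way such a number is computed. It assumes the models have already been superimposed and that RMSD is measured against the mean structure; the record does not say which convention the thesis uses, and the function names and toy coordinates are hypothetical.

```python
import math


def mean_structure(models):
    """Average the coordinates of each atom over all models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(m[a][k] for m in models) / n_models for k in range(3))
        for a in range(n_atoms)
    ]


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    sq = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(sq / len(a))


def average_rmsd_to_mean(models):
    """Mean of each model's RMSD to the ensemble mean structure.

    Assumes the models are already superimposed (no fitting is done here).
    """
    mean = mean_structure(models)
    return sum(rmsd(m, mean) for m in models) / len(models)


if __name__ == "__main__":
    # Toy ensemble: two models of two atoms each, coordinates in angstroms.
    models = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
    ]
    print(average_rmsd_to_mean(models))
```

A real calculation would first superimpose the models (for example with a least-squares fit over the backbone atoms) and restrict the atom list to the backbone of the structured region.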