Computational methods to enable construction of 3D models of protein complexes by integrating mass spectrometry and biochemical data

Lead Research Organisation: University of Cambridge
Department Name: Chemistry


Proteomic studies have yielded detailed lists of proteins, but relatively little is known about their interactions and their spatial arrangement in functional complexes. This gap is being bridged by complementing traditional biochemical and biophysical methods with emerging experimental approaches such as mass spectrometry of intact complexes and their constituents. To exploit this complementary experimental information, integrative computational approaches are required. Eventually, we anticipate that these will provide molecular architectures and even atomic models of many protein complexes.
Our mass spectrometry approach typically begins by measuring the mass of the isolated, intact assembly of proteins that make up the complex. To generate data on the connectivity and identity of the components within this complex, we measure the mass of the complex after both partial and complete disruption, which reveals the subcomplexes and the individual subunits of the intact complex, respectively. This dataset forms the foundation of a contact map of the protein complex (termed a 2-D map). To create a 3-D model of the complex, we plan to incorporate ion mobility (IM) mass spectrometry to measure the size of the complex, along with the sizes of all the subcomplexes and components used to generate our 2-D map.
While mass spectrometry methods for analyzing protein complexes are becoming well established, computational approaches for data analysis are lagging seriously behind. To date, we have developed several pilot algorithms for analyzing mass spectrometry data for protein complexes. For example, in the case of 2-D map generation, we currently use a simple network algorithm to find the shortest path that connects all subunit interactions determined in this way. This approach, while acceptable for very simple protein complexes, has serious limitations when applied to more complicated protein complexes with a high degree of modularity.
To overcome this and other limitations, we will develop a more sophisticated software package capable of dealing with a variety of complexes, not just compact globular ones, and able to incorporate data from a variety of methods, including the shape of the individual subcomplexes.
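To make the idea of a 2-D map concrete, the following is a minimal sketch of how subcomplex compositions could be turned into a contact map. It is an illustration only and not the pilot algorithm described above: it uses a brute-force search for the smallest set of pairwise contacts under which every observed subcomplex remains connected. The function names and the four-subunit example data are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Return True if `nodes` form a single connected group using only
    those `edges` whose two endpoints both lie inside `nodes`."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def minimal_contact_maps(subunits, subcomplexes):
    """Return every smallest set of pairwise contacts for which each
    observed subcomplex is connected. Exhaustive, so only practical
    for complexes with a handful of subunits."""
    candidates = list(combinations(sorted(subunits), 2))
    for size in range(len(candidates) + 1):
        found = [
            set(edges)
            for edges in combinations(candidates, size)
            if all(is_connected(sc, edges) for sc in subcomplexes)
        ]
        if found:
            return found
    return []


# Hypothetical data: an intact four-subunit complex plus three dimers
# observed after partial disruption.
subunits = {"A", "B", "C", "D"}
observed = [{"A", "B", "C", "D"}, {"A", "B"}, {"B", "C"}, {"C", "D"}]
for contact_map in minimal_contact_maps(subunits, observed):
    print(sorted(contact_map))
```

When the observed subcomplexes do not pin down the connectivity, the function returns several equally small maps. For instance, replacing the C-D dimer above with a B-C-D trimer leaves it undecided whether D contacts B or C, which is the kind of ambiguity that additional data would be needed to resolve.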

Technical Summary

Our overall goal is to determine atomic models of protein complexes, particularly those that are too dynamic, cannot be reconstituted in vitro, or are present only at low copy numbers in the cell. Such complexes are typically resistant to traditional structural biology methods such as X-ray crystallography. To achieve this aim, information from a variety of experimental methods, such as proteomics, mass spectrometry, ion mobility mass spectrometry, and experimentally determined or modeled atomic structures of individual subunits, must be combined effectively. New integrative computational methods are therefore required, which we propose to develop through a new collaboration with a leading computer science laboratory. In particular, we plan to develop a computational approach that will incorporate not only all of our mass spectrometry data but also other pieces of information that are currently not considered when we build a model of a protein complex (e.g., the charge on the ions and the intensities of the various subcomplexes), in order to better construct the 2-D map of connectivity. We would also like to improve computational approaches for using IM mass spectrometry data to generate 3-D models of protein complexes. Our previous algorithms for this aspect of data analysis include rudimentary coarse-graining, molecular modeling, and clustering approaches to best assign a protein quaternary structure to the size information recorded from ion mobility measurements. Again, while these approaches were effective in simple cases, where reasonable structural possibilities are limited, more sophisticated approaches are needed to address most 3-D modeling of 'real-world' protein complexes. In this proposal we plan to develop new 3-D structure-generation algorithms for protein complexes.
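As an illustration of how size information from ion mobility could discriminate between candidate quaternary structures, the sketch below represents each subunit as a sphere and estimates the orientation-averaged projected area of the assembly by Monte Carlo sampling (a simple form of the projection approximation). This is not the method used in our previous algorithms; the sphere radii, the two candidate trimer arrangements, and the "measured" value are all hypothetical, and a projected area is only a rough proxy for an experimental collision cross-section.

```python
import math
import random


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    norm = math.sqrt(_dot(a, a))
    return (a[0] / norm, a[1] / norm, a[2] / norm)


def random_direction(rng):
    """Draw a viewing direction uniformly from the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def projected_area(spheres, direction, rng, n_points):
    """Monte Carlo estimate of the area of the union of disks obtained by
    projecting `spheres` (list of (centre, radius)) along `direction`."""
    helper = (1.0, 0.0, 0.0) if abs(direction[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _unit(_cross(direction, helper))  # first in-plane axis
    v = _cross(direction, u)              # second in-plane axis (unit length)
    disks = [(_dot(c, u), _dot(c, v), r) for c, r in spheres]
    x_lo = min(x - r for x, _, r in disks)
    x_hi = max(x + r for x, _, r in disks)
    y_lo = min(y - r for _, y, r in disks)
    y_hi = max(y + r for _, y, r in disks)
    hits = 0
    for _ in range(n_points):
        px = rng.uniform(x_lo, x_hi)
        py = rng.uniform(y_lo, y_hi)
        if any((px - x) ** 2 + (py - y) ** 2 <= r * r for x, y, r in disks):
            hits += 1
    return (x_hi - x_lo) * (y_hi - y_lo) * hits / n_points


def mean_projected_area(spheres, n_orientations=2000, n_points=100, seed=0):
    """Average the projected area over random viewing directions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_orientations):
        total += projected_area(spheres, random_direction(rng), rng, n_points)
    return total / n_orientations


# Hypothetical trimer of 20 Angstrom spheres in two candidate arrangements.
R = 20.0
candidates = {
    "compact": [((0.0, 0.0, 0.0), R), ((2 * R, 0.0, 0.0), R),
                ((R, R * math.sqrt(3.0), 0.0), R)],
    "linear": [((0.0, 0.0, 0.0), R), ((2 * R, 0.0, 0.0), R),
               ((4 * R, 0.0, 0.0), R)],
}
measured = 3300.0  # hypothetical size measurement, in square Angstroms
areas = {name: mean_projected_area(s) for name, s in candidates.items()}
best = min(areas, key=lambda name: abs(areas[name] - measured))
print(areas, "closest to measurement:", best)
```

With only a single size value, many arrangements can fit equally well once a complex has more than a few subunits, which is why sizes measured for the subcomplexes, together with the 2-D connectivity map, would be needed to restrain the search.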

