Unit: Structural Bioinformatics - CNRS URA 2185

Director: Nilges Michael

The principal role of structural bioinformatics is to complement experimental structural studies with in silico studies. We work in three areas: calculation methods for structure determination by NMR; prediction of protein structure by comparative modelling to answer specific biological questions; and genomic analysis based on large-scale structure prediction of genomic DNA. A focal point is the study of protein-protein and protein-ligand interactions, where we aim to contribute to the fundamental understanding of binding mechanisms on the one hand, and to the search for potential therapeutic agents on the other, using docking and virtual screening calculations. Most of the work is based on computer simulations, but we are also directly involved in experimental work (X-ray crystal structure determination, neutron scattering).

Development of a probabilistic structure determination method for NMR (M. Habeck, M. Nilges, W. Rieping)

Macromolecular structure determination is an inference problem: the measured quantities are noisy and incomplete, and therefore insufficient to determine the structure uniquely. The objective should therefore be to explore all regions of conformational space compatible with the information at hand. Currently, this is attempted in a rather approximate way by repeated structure calculations with the same data (NOE-derived distances, torsion angles/scalar couplings, residual dipolar couplings).

We have developed a probabilistic method that directly generates a posterior probability distribution, which represents the full knowledge about the target structure. Calculating this distribution is computationally much more demanding than minimization because of its very high dimensionality. We showed that, by combining several sampling methods in a new Markov Chain Monte Carlo strategy, the probability distribution can be simulated for medium-sized proteins (so far these strategies had been used only to simulate small peptides). A major advantage of the new method is that additional parameters that are necessary for the modelling but cannot be measured (such as weighting factors) need not be guessed but can be determined in parallel with the structure. It is the only method to date that gives a reliable estimate of the precision of a structure determined from NMR data. We have applied the method to several experimental and synthetic data sets, and we are working on several extensions and generalizations. We expect that in the long run, in particular with ever faster computers, the method will replace standard methods.
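
The idea of determining an unmeasurable parameter in parallel with the structure can be illustrated on a toy problem. The sketch below is not the unit's implementation: it assumes a single unknown distance `d`, a Gaussian error model with unknown width `sigma`, a flat prior on `d` and a 1/sigma prior on `sigma`, and it uses a plain Metropolis sampler rather than the combined sampling strategy described above. All function names are hypothetical.

```python
import math
import random


def log_posterior(d, sigma, data):
    """Unnormalized log posterior of a distance d and an error width sigma.

    Assumptions (illustrative only): Gaussian likelihood, flat prior on d > 0,
    and a 1/sigma prior on sigma > 0.
    """
    if d <= 0 or sigma <= 0:
        return -math.inf
    n = len(data)
    sq = sum((x - d) ** 2 for x in data)
    return -(n + 1) * math.log(sigma) - sq / (2.0 * sigma ** 2)


def metropolis(data, n_steps=20000, step=0.1, seed=0):
    """Sample (d, sigma) jointly with a symmetric random-walk Metropolis scheme."""
    rng = random.Random(seed)
    d, sigma = data[0], 1.0  # arbitrary starting point (data[0] assumed > 0)
    lp = log_posterior(d, sigma, data)
    samples = []
    for _ in range(n_steps):
        d_new = d + rng.gauss(0.0, step)
        s_new = sigma + rng.gauss(0.0, step)
        lp_new = log_posterior(d_new, s_new, data)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            d, sigma, lp = d_new, s_new, lp_new
        samples.append((d, sigma))
    return samples


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(5.0, 0.5) for _ in range(30)]  # synthetic measurements
    kept = metropolis(data)[5000:]                   # discard burn-in
    print("mean d:    ", sum(s[0] for s in kept) / len(kept))
    print("mean sigma:", sum(s[1] for s in kept) / len(kept))
```

The spread of the sampled `d` values is then a direct estimate of the precision of the inferred quantity, and `sigma` is estimated from the data instead of being fixed by hand.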

Study of 6-phosphogluconolactonase (6PGL) from T. brucei by NMR and X-ray crystallography (V. Stoven)

The parasitic protozoa Trypanosoma brucei are the causative agents of African sleeping sickness. The pentose phosphate pathway plays a crucial role in the host-parasite relationship. In order to evaluate this pathway as a potential drug target, it is necessary to study the associated enzymes in detail. Very little information was available for 6PGL (6-phosphogluconolactonase), the second enzyme in the cycle. To better understand the specificity and the mechanism of this enzyme, we have undertaken the 3D structure determination of 6PGL by NMR and by X-ray crystallography (collaboration with Marc Delarue, Unité de Biochimie Structurale, Institut Pasteur). We have obtained X-ray diffraction data at 2.8 Å, and additional data on a mercury derivative were obtained at 2.1 Å. Structure refinement is currently in progress. An NMR study was undertaken in parallel, in order to perform structural and dynamic studies in the absence and in the presence of the substrate (collaboration with the group of Geoffrey Bodenhausen, Ecole Normale Supérieure).

Modelling of proteins involved in infectious diseases: Entamoeba histolytica and Mycobacterium ulcerans (M. Krzeminsky, M. Zharan, R. Maroun)

The enteric protozoan parasite Entamoeba histolytica is the causative agent of amebiasis, a common health problem in many developing countries. EhGEF1 is the first guanine nucleotide exchange factor from E. histolytica. This protein consists of one DBL-homology domain (DH, ~200 residues), involved in the catalytic reaction that stimulates GDP/GTP exchange, and one pleckstrin homology domain (PH, ~100 residues), which anchors the protein to the membrane and regulates GTPase activity in vivo. Experimental results indicate that EhGEF1 preferentially activates the small GTPases EhRacG and EhRho1. We performed knowledge-based modelling of the complex between the DH domain of EhGEF1 and EhRacG, and were able to identify interface residues potentially important in the molecular recognition process. In the context of a 3D structure-activity relationship study, site-directed mutagenesis is being performed to probe the role of the proposed residues in the molecular interaction and specificity.

Other systems studied include the PIG-M protein, an enzyme involved in the synthesis of glycosyl-phosphatidylinositol (GPI), which seems to play a role in the virulence of E. histolytica. This protein belongs to the family of GPI mannosyltransferases-I (GPI-MT-I), which add the first mannose to the PI in the GPI biosynthesis pathway. We also study a thioesterase (TE) domain of the polyketide synthase from Mycobacterium ulcerans, which is responsible for the closure of the polyketide macrocycle essential to toxicity.

A second major line of interest remains the modelling of snake venom components with anticoagulant activity. For phospholipases A2 (GIIA SVsPLA2) present in the Viperidae and Crotalidae families we found that the molecular electrostatic potentials calculated at the solvent-accessible surface of the 3D models show a correlation with their anticoagulant activity.

Complementarity of structure ensembles in protein-protein binding (R. Grünberg, J. Leckner, M. Nilges)

Protein-protein association is often accompanied by changes in receptor and ligand structure. This interplay between protein flexibility and protein-protein recognition is currently the largest obstacle both to our understanding of and to the reliable prediction of protein complexes. We performed two sets of molecular dynamics simulations for the unbound receptor and ligand structures of 17 protein complexes and applied shape-driven rigid body docking to all combinations of representative snapshots. The cross-docking of structure ensembles increased the likelihood of finding near-native solutions. The free ensembles appeared to contain multiple complementary conformations, which were in general not related to the bound structure. We suggest that protein-protein binding follows a three-step mechanism of diffusion, free conformer selection, and refolding. This model combines previously conflicting ideas and is in better agreement with the current data on interaction forces, time scales, and kinetics. It combines aspects of the induced fit model and the free conformer selection model, which in turn is based on the model of allosteric transitions by Jacques Monod, Jeffries Wyman and Jean-Pierre Changeux. Using extended molecular dynamics calculations, we are currently investigating the dynamic properties of protein interaction surfaces in more detail. We also study the entropic costs of complex formation.
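
Docking "all combinations of representative snapshots" amounts to a Cartesian product over the two ensembles. The following is a minimal sketch under stated assumptions: `cross_dock` is a hypothetical helper, and `dock` stands in for whatever shape-driven rigid body docking program is used (here it is any callable returning a score where higher is better). Nothing about the actual docking software is implied.

```python
from itertools import product


def cross_dock(receptor_snapshots, ligand_snapshots, dock):
    """Dock every receptor snapshot against every ligand snapshot.

    `dock(receptor, ligand)` is a placeholder for an external docking run and
    must return a score (assumed here: higher means better complementarity).
    Returns a list of (score, receptor_index, ligand_index), best first.
    """
    results = []
    pairs = product(enumerate(receptor_snapshots), enumerate(ligand_snapshots))
    for (ri, rec), (li, lig) in pairs:
        results.append((dock(rec, lig), ri, li))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


if __name__ == "__main__":
    # Toy stand-in: snapshots are numbers, "complementarity" is their closeness.
    toy_dock = lambda rec, lig: -abs(rec - lig)
    for score, ri, li in cross_dock([1.0, 2.0, 3.0], [2.9, 0.0], toy_dock):
        print(f"receptor {ri} x ligand {li}: score {score:.2f}")
```

With m receptor and n ligand snapshots this performs m x n docking runs, which is why representative snapshots rather than full trajectories are docked.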

Elucidating factors governing ligand-receptor interactions (Pak-Lee Chau)

The research aims to elucidate the factors governing ligand-receptor interactions in two model systems: (1) the interaction of small molecules with proteins, and (2) the interaction of small molecules with cell membranes.

Most drugs are small molecules that interact with proteins in the human body. Using molecular dynamics simulations, we have developed unbinding methods and novel analysis procedures to unbind small ligands from protein receptors, to evaluate the free energy change of unbinding, and to define the role of water in the unbinding process. This method has been successfully applied to the retinol/serum retinol-binding protein complex, and is now being applied to elucidate the binding site and unbinding pathway of 5-HT from the 5-HT3 receptor. The results will be compared with site-directed mutagenesis data. This work is performed in collaboration with Dr Sarah Lummis (Department of Biochemistry, University of Cambridge) and Dr Carla Molteni (Department of Physics, King's College, London).

A small number of drugs, the general anaesthetics, probably exert their action partly through interaction with the cell membrane. We are performing neutron scattering experiments to define the localisation of general anaesthetics in the cell membrane. The experiments are complemented by molecular dynamics simulations. This work is a collaboration with Professor Ruth Lynden-Bell (Department of Chemistry, University of Cambridge), Dr Steven Roser (Department of Chemistry, University of Bath) and Dr Paul Hoang (Université de Franche-Comté, Besançon, France).

New free energy methods to analyze protein-ligand associations (A. Blondel)

The aim of our developments is to make quantitative predictions of the free energy of protein-ligand associations sufficiently reliable to be a useful complement to experiments on biological processes, or to contribute to the design of new drugs. Because of their energetic importance, we are also interested in identifying the relevant conformations and mechanisms for protein-ligand association or for more complex processes such as enzymatic reactions or folding.

The conception of these methods was motivated by a thorough study of the association of R67 DHFR subunits combining experimental and computational approaches (see previous reports, Protein Folding and Modelling Unit). Our new method significantly reduces the uncertainties of free energy difference calculations. It is based on the optimization of ensemble fluctuations during thermodynamic integration. Good convergence properties were observed, and predictions were in good agreement with experiments (~0.4 kcal/mol difference). A recent comparison with standard methods from the literature confirmed the superiority of our method. We further tested the convergence on systems presenting significant hysteresis. Based on the comparison of the calculation results with the set of precise experimental data that we have made available in the literature, we conclude that we can predict free energy differences with a precision comparable to that of experiments, provided the mechanism of association and structural relaxations are taken into account.
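
For readers unfamiliar with thermodynamic integration, the free energy difference is obtained by integrating the ensemble average of dU/dlambda along a coupling parameter lambda from 0 to 1. The sketch below shows only this standard scheme, not the fluctuation-optimized method described above. It assumes a one-dimensional harmonic potential U(x; lambda) = 0.5 * k(lambda) * x^2 with k interpolated linearly between `k0` and `k1`, chosen because its Boltzmann distribution can be sampled exactly and the result checked against the analytic value 0.5 * kT * ln(k1 / k0). All names are hypothetical.

```python
import math
import random


def thermodynamic_integration(k0, k1, kT=1.0, n_lambda=21, n_samples=5000, seed=0):
    """Estimate the free energy difference between two harmonic wells.

    Toy model (an assumption for illustration):
    U(x; lam) = 0.5 * (k0 + lam * (k1 - k0)) * x**2.
    """
    rng = random.Random(seed)
    lambdas = [i / (n_lambda - 1) for i in range(n_lambda)]
    means = []
    for lam in lambdas:
        k = k0 + lam * (k1 - k0)
        sd = math.sqrt(kT / k)  # Boltzmann distribution of x is Gaussian
        total = 0.0
        for _ in range(n_samples):
            x = rng.gauss(0.0, sd)
            total += 0.5 * (k1 - k0) * x * x  # dU/dlambda at this configuration
        means.append(total / n_samples)
    # Trapezoidal integration of <dU/dlambda> over lambda in [0, 1].
    delta_f = 0.0
    for i in range(n_lambda - 1):
        delta_f += 0.5 * (means[i] + means[i + 1]) * (lambdas[i + 1] - lambdas[i])
    return delta_f


if __name__ == "__main__":
    estimate = thermodynamic_integration(1.0, 4.0)
    exact = 0.5 * math.log(4.0)
    print(f"TI estimate: {estimate:.3f}, analytic: {exact:.3f}")
```

In this toy case the statistical error comes from the fluctuations of dU/dlambda at each lambda, which is the quantity that variance-reduction strategies aim to control.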

Docking and virtual screening of potential therapeutic agents (N. Duclert-Savatier, V. Stoven)

Although state-of-the-art free energy calculations can achieve very high accuracy, they are too slow to screen large compound databases for potential binding partners of a protein with known 3D structure. The necessary speed is offered by less accurate docking methods that characterize protein-ligand interactions empirically and use approximate rules to calculate binding free energies from molecular conformation. We combine different empirical docking strategies to perform virtual screening. A first target is a subtilisin that is essential for the malaria parasite to enter the host cell. We also take part in the project coordinated by Stewart Cole, which aims to identify lead compounds against Mycobacterium tuberculosis. Method development aims at integrating results from several different docking strategies in an intelligent way (by machine learning techniques).
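
One very simple way to integrate results from several docking strategies is rank-sum consensus scoring. This is a deliberately basic stand-in for the machine learning integration mentioned above, not the unit's method. The sketch assumes each docking program returns one score per compound with lower scores being better. The program and compound names are invented for illustration.

```python
def consensus_rank(score_tables):
    """Order compounds by the sum of their ranks across docking programs.

    `score_tables` maps a program name to a dict {compound: score}, where
    lower scores are assumed to be better. Only compounds scored by every
    program are ranked. Ties are broken alphabetically by compound name.
    """
    compounds = set.intersection(*(set(t) for t in score_tables.values()))
    rank_sum = {c: 0 for c in compounds}
    for table in score_tables.values():
        ordered = sorted(compounds, key=lambda c: (table[c], c))
        for rank, compound in enumerate(ordered, start=1):
            rank_sum[compound] += rank
    return sorted(compounds, key=lambda c: (rank_sum[c], c))


if __name__ == "__main__":
    tables = {
        "program_A": {"cpd1": -9.0, "cpd2": -7.0, "cpd3": -5.0},
        "program_B": {"cpd1": -6.0, "cpd2": -8.0, "cpd3": -4.0},
        "program_C": {"cpd1": -5.0, "cpd2": -9.0, "cpd3": -1.0},
    }
    print(consensus_rank(tables))
```

Working on ranks rather than raw scores avoids having to put the different empirical scoring functions on a common energy scale.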

Keywords: Protein structure, protein function, protein dynamics, molecular recognition, sequence analysis

Activity Reports 2004 - Institut Pasteur
