Protein Folding and Modeling - CNRS URA 2185

  Director: Michel Goldberg (goldberg@pasteur.fr)



This Unit combines physical chemistry, biochemistry, genetics and molecular modeling to address problems related to the structure of proteins, their function and their integration in various cellular processes: the acquisition of their functional structure in vitro and in vivo, the mechanism of molecular propagation of the misfolded conformation of an unconventional pathogen (the "prion"), and the energetics of their interactions with their specific ligands.



The main theme of the Unit is the acquisition (the "folding" process) of the unique three-dimensional structure that endows a protein with its specific biological function. These studies are conducted at four levels:

- experimental studies in vitro of the molecular mechanisms involved in the folding.

- experimental studies on the quality control of protein folding in vivo in the bacterial envelope.

- experimental studies on the mechanisms of transmission of the prion pathogenic conformation responsible for spongiform encephalopathies.

- computer modeling of the stability of proteins and of the energy of their interactions.

I- Protein folding in vitro (Michel Goldberg)

These studies aim at understanding, at a fundamental level, the mechanisms that enable a protein to acquire, in a few seconds at most, the complex three-dimensional structure, named "native", that endows it with its biological properties. The knowledge thus acquired is used to improve protein folding, in particular in industrial processes related to biotechnologies.

a- Studies on the role of disulfide bonds in the folding of hen lysozyme (Michel Goldberg, Nicole Jarrett, Hideki Tachibana and Alain Chaffotte)

We had previously demonstrated the importance of disulfide (SS) bonds in the quasi-instantaneous acquisition of the secondary structure of hen lysozyme and in stabilizing folding intermediates (see report of year 2001). To gain a better understanding of the coupling between disulfide bond formation and the regain of local native-like elements of three-dimensional structure, we determined the kinetics of regain of antigenic motifs that can be recognized by specific monoclonal antibodies only if the protein is (at least locally) correctly folded. Two antigenic motifs, present on opposite faces of the folded protein and depending on the proper folding of two topologically distinct regions of the polypeptide chain, were studied. The kinetics of appearance of their reactivity towards two monoclonal antibodies were followed by means of an ELISA-based pulsed immunolabeling method developed in our laboratory. The quantitative analysis of these kinetics, and their comparison with the kinetics of appearance of species with 1, 2, 3 or 4 disulfides and of regain of native secondary and tertiary structures, showed that the antigenic motif carried by the helical domain of lysozyme is recovered together with the helical secondary structure, through disulfide exchange within intermediates with 2 SS bonds. The formation of the third SS bond coincides with the appearance of the second antigenic motif, which is carried by the beta-domain of lysozyme. Among all the 2 SS lysozyme variants (supplied by Hideki Tachibana, Kobe University, Japan), only the one containing the two disulfide bonds within the helical domain (6-127 and 30-115) is recognized by one of the antibodies. The second antibody fails to recognize any of the 2 SS variants. Since SS bond 76-94 is known to be the last to be oxidized, we could thus define the order in which the native SS bonds are formed and show their respective roles in the formation of the various levels of the molecular structure.
These results provide a detailed image of lysozyme oxidative folding, and bridge the gap between the "natural" folding of lysozyme in the cell and studies performed in vitro on the oxidized protein. They emphasize the early role of SS bonds in orienting the folding process towards the native lysozyme conformation.

b- Production and study of a recombinant protein, candidate vaccine against malaria (Alain Chaffotte and Anne-Gaëlle Planson)

The C-terminal fragment F19 of MSP1 (Merozoite Surface Protein 1) from Plasmodium falciparum is an anti-malaria vaccine candidate. When expressed in the E. coli periplasm as a fusion protein with MalE (maltose-binding protein), it undergoes oxidative folding that allows the correct pairing of its 6 disulfide bridges, thus leading to the "native" conformation, as demonstrated by two-dimensional NMR. Several resonances have been identified by comparing the two-dimensional NMR spectra with those of the homologous recombinant fragment produced in insect cells. These resonances correspond to various sites spread throughout the three-dimensional structure. Direct periplasmic expression of the individual (i.e. not fused) F19 fragment showed that it is indeed localized in the periplasm as a soluble, totally oxidized protein with the expected molar mass. However, its immunoreactivity towards 4 distinct specific monoclonal antibodies is strongly decreased, which indicates an alteration of the surface conformation as compared to that of F19 produced in insect cells. Additionally, its two-dimensional NMR spectra are characteristic of an unfolded protein. Non-reducing electrophoresis revealed polymers, which suggests that the non-native conformation of F19 results from a heterogeneous, non-native distribution of disulfide bonds. Comparison between the structural characteristics of F19 expressed in the E. coli periplasm either as an individual protein or fused to MalE points to the remarkable ability of MalE to assist the oxidative folding of proteins or protein fragments fused at its C-terminal end ("internal chaperone"). This folding-helper property is being analyzed further, through a detailed comparison of the refolding characteristics of MalE fused with F19, the latter being maintained either folded (oxidized) or denatured by reduction of its cysteines.

c- Kinetics of association-dissociation and folding of the oligomeric R67 DHFR (Annick Méjean and Christophe Bodenreider)

In order to decipher the molecular events responsible for initiating the folding of an "entirely beta" protein, the bacterial R67 dihydrofolate reductase (R67 DHFR), we aim to find out whether the association of two disordered monomers triggers the folding of the polypeptide chain, or whether folding of the monomers is required for, and hence must precede, the association. For that purpose, we performed refolding kinetics at pH 5, where R67 DHFR is a stable dimer. Monitoring the folding kinetics by means of spectroscopic methods (near-UV circular dichroism, tryptophan fluorescence), we identified three phases, one of which corresponds to the cis-trans isomerization of prolyl residues. This year, using a fluorescence energy transfer signal between two labeled monomers, we identified the phase corresponding to the association and uncovered an additional folding phase. Thus, by modeling the various kinetics, we could establish a folding scheme that includes the conformational heterogeneity of prolyl residues in the denatured state (responsible for the very slow phase), a very rapid collapse of the protein, then a rapid step leading to a protein structure that is still poorly defined, and a "medium" phase during which monomer association and most of the monomer folding occur in a coupled mechanism. Using a DHFR variant that remains dimeric at pH 8, we showed that none of these phases is influenced by the pH. Finally, we showed that the folding of R67 DHFR at the pH where the non-mutated protein tetramerizes can be described by the same scheme as that at pH 5, with only one additional phase, the association of the folded dimers into tetramers. The rate constants and activation energies of each of the observable phases in this process were determined, thus leading to a satisfactory description of the energy landscape explored by the protein during its folding. More specifically, we demonstrated that the association into dimers is responsible for nearly all the stabilizing energy of the native dimer.
We thus completed the characterization of the folding of a protein with an essentially all-beta-sheet secondary structure. Very few such studies had thus far been reported. We showed that, for R67 DHFR, folding obeys a non-hierarchical process until the dimeric state, followed by an entirely hierarchical process for the formation of the tetramer.
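As an illustration of how an activation energy can be extracted from rate constants, the following minimal sketch assumes simple Arrhenius behaviour, k = A exp(-Ea / RT), for a single kinetic phase measured at two temperatures. The function names and the numerical values are hypothetical and are not taken from the Unit's data; the analysis actually used for R67 DHFR may well differ.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def arrhenius_rate(prefactor, ea, temperature):
    """Rate constant predicted by the Arrhenius law k = A * exp(-Ea / (R*T)).

    prefactor: pre-exponential factor A (same units as the rate constant)
    ea: activation energy in kcal/mol
    temperature: absolute temperature in K
    """
    return prefactor * math.exp(-ea / (R_KCAL * temperature))


def activation_energy(k1, t1, k2, t2):
    """Activation energy (kcal/mol) from two rate constants.

    k1 and k2 are rate constants (same units) measured at the absolute
    temperatures t1 and t2 (K). Assumes Arrhenius behaviour between t1 and t2.
    """
    if k1 <= 0 or k2 <= 0:
        raise ValueError("rate constants must be positive")
    if t1 == t2:
        raise ValueError("the two temperatures must differ")
    return R_KCAL * math.log(k2 / k1) / (1.0 / t1 - 1.0 / t2)


if __name__ == "__main__":
    # Hypothetical phase with Ea = 15 kcal/mol, observed at 10 and 25 degrees C.
    k_cold = arrhenius_rate(1e10, 15.0, 283.15)
    k_warm = arrhenius_rate(1e10, 15.0, 298.15)
    print(round(activation_energy(k_cold, 283.15, k_warm, 298.15), 3))
```

In practice, rate constants measured at several temperatures would be fitted together (ln k against 1/T) rather than taken pairwise, which also gives an idea of the uncertainty on Ea.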

II- Control of the folding of bacterial envelope proteins (Jean-Michel Betton, Nadia Benaroudj, Nathalie Sassoon, Jean-Philippe Arié and Marika Miot)

To understand how bacterial cells recognize, signal and respond to the presence of misfolded envelope proteins, we overexpress MalE31, a mutant of the periplasmic maltose-binding protein that displays a defective folding pathway and forms inclusion bodies in the periplasm. At the cellular level, we showed that when MalE31 precursors carrying altered signal sequences are expressed, misfolded species are eliminated more efficiently than when the protein is naturally expressed in the periplasm. These results indicate that the cellular factors (chaperones and proteases) that repair or eliminate misfolded proteins are either more active towards MalE31 or more abundant in the cytoplasm than in the periplasm. At the molecular level, in collaboration with the unit of "Structural Immunology", we solved the crystal structure of MalE31. These structural data provide evidence that the effect of the mutation is exerted at the level of folding intermediates, rather than that of the native conformation. However, when co-expressed with FkpA, a periplasmic peptidyl-prolyl isomerase with chaperone activity, MalE31 can attain a native conformation with a maltose-binding activity comparable to that of wild-type MalE, although it displays defective maltose transport behaviour. The structural characterization of MalE31 identified a new region on the surface of MalE that is probably involved in the interactions with MalF and MalG, the membrane components of the maltose transport system.

We are currently studying the effect of temperature on bacterial growth. Indeed, when overproduced at 30°C, MalE31 does not interfere with bacterial physiology, but at 37°C the aggregation of MalE31 becomes toxic and causes lethality. Contrary to expectation, heat-shock conditions (growth at 42°C) rescue this lethal phenotype by increasing the degradation of MalE31. Based on these observations, we have started to search for multicopy suppressors of the toxic phenotype linked to the formation of MalE31 inclusion bodies at 37°C. We have just finished constructing a library of E. coli DNA fragments cloned in an expression vector. Next, we will select or screen for clones that restore growth of the MalE31-overproducing strain at 37°C.

We participate in a structural genomics project at the Institut Pasteur that focuses on proteins from Mycobacterium tuberculosis. By using a high-throughput cloning strategy based on the site-specific recombination of lambda phage (Gateway), we successfully cloned the first 150 targets into a bacterial expression vector. In parallel to this structural genomics effort, we are interested in alternative expression systems for producing recombinant proteins. The Rapid Translation System (RTS, from Roche), designed for protein expression by a coupled transcription-translation reaction, has caught our attention. The key technology is the cell-free continuous exchange of substrates and energy components via a semi-permeable membrane. In collaboration with the laboratory of "Macromolecule Structural Chemistry", we assessed the selective incorporation of stable isotopes, 15N or 13C, into specific amino acid residues. Although bacterial expression remains the most economical method for producing uniformly labeled proteins, selective labeling of proteins with one or more 15N/13C-enriched amino acids in E. coli is not always possible because of amino acid metabolism. We compared the efficiency of incorporation of 15N/13C Asp and Arg into wild-type MalE between bacterial and RTS production. In contrast to bacterial expression, no scrambling or dilution of isotope labels was observed with RTS production. With the development of a new high-yield E. coli lysate, in which about 5 mg/ml of protein can be produced, we are pursuing our evaluation with nine different amino acids (Cys, Glu, Asp, Leu, Ser, Thr, His, Gly, and Arg). Highly efficient incorporation of selective labels into proteins produced by RTS500 provides the means to resolve and assign the side-chain resonances in NMR spectra of larger proteins.

In the structural genomics project, we started the cloning of self-compartmentalized proteases and molecular chaperones of Mycobacterium tuberculosis. Indeed, the genome sequence of this bacterium has revealed the presence of a proteolytic complex homologous to the eukaryotic proteasome, which is generally not found in bacteria. Furthermore, M. tuberculosis contains two copies of the gene encoding the ClpP protease, while Escherichia coli has only one. Biochemical studies will be undertaken in parallel to crystallographic studies, in the hope of identifying a specificity in the activity of these proteases that could be linked to the virulence of this bacterium. Finally, we will study the heat-shock proteins that do not belong to the DnaK or GroEL family, and whose function is totally unknown.

III- Prion Structure and Mechanism of Infectivity (Patrick Bittoun, Pierre Falanga, Michel Goldberg and Mireille Hontebeyrie)

It is currently accepted that the pathogen responsible for spongiform encephalopathies (Creutzfeldt-Jakob Disease in man, Mad Cow Disease, Sheep Scrapie) contains no nucleic acid and consists exclusively of an abnormally folded form, named PrPSc, of a protein that is present in healthy individuals as a non-pathogenic, normally folded form (named PrPc). Though the conformation of the normal PrPc protein has been determined, neither that of the infectious, misfolded PrPSc nor the mechanism of the conversion of PrPc into PrPSc has been elucidated. Since the beginning of this year, we have launched a project aimed on the one hand at solving the PrPSc structure by molecular modeling based on experimental constraints, and on the other hand at identifying the molecular interactions involved in the PrPc to PrPSc conversion. This year has been devoted to obtaining, setting up and starting a high-security, confined laboratory (P3 facility), and to implementing in our laboratory the new experimental techniques required by this project.

We have also undertaken to better characterize, and eventually improve, the source of PrPSc we intend to use. The material will be PrPSc produced by a cell line that was provided to us by Dr Hubert Laude (INRA - Jouy en Josas). In the absence of infection, these cells are heterogeneous in terms of their PrPc production. We have demonstrated that this heterogeneity is related neither to the position of the cells in the cell cycle nor to their size. In the hope of obtaining a better source of PrP, we isolated several clones expressing high levels of PrPc on their surface. As controls, we also isolated clones expressing low levels of PrPc. We analyzed the total amount of PrPc (i.e. on the surface and internal) expressed by these cell lines, and observed that two clones expressing high levels of PrPc produced, after infection, much less PrPSc than the starting population. Should this observation be verified on other high-producing clones, the hypothesis could be put forward that cells producing high levels of PrPc on their surface have undergone a mutation affecting the PrPc --> PrPSc conversion. One could then imagine that, in the starting population, the PrPSc-producing cells are the low PrPc producers. This hypothesis is currently under study. If it were to be verified, comparing the two types of clones should lead to the identification of factors involved in the regulation of the conversion.

In parallel, we have launched a program to produce monoclonal antibodies directed against epitopes located on the surface of the protease-resistant region of PrPSc, with the aim of identifying the solvent-exposed regions of this molecule.

IV- Molecular modeling of protein assembly processes and energies (Arnaud Blondel, Roland Nageotte and Josselin Noirel)

We develop physical theories that describe biological macromolecules with the following motivations:

- propose new methods to model the energies of protein association that would be fast and reliable enough to bypass an experimental approach. This is an important issue for the analysis of various biological processes and for the design of new drugs.

- be able to identify the relevant conformations of biological macromolecular systems. This applies to the study of the mode of association between proteins or between proteins and drugs as well as for unraveling complex mechanisms such as protein folding.

To develop quantitative methods to model the energy of association between proteins, we have in recent years pursued simultaneously the experimental and the theoretical study of a model system, the R67 dihydrofolate reductase (R67 DHFR). This protein confers antibiotic resistance and is formed by the association of 4 identical subunits. Mutations were introduced by genetic construction to probe the importance of various subunit contacts. The association mechanism was analyzed and the energy of these associations was measured by means of very precise methods developed in the laboratory (see report for 2001).

In parallel, in order to test the theoretical methods, the energies involved in those associations were calculated by computer modeling, using a method for calculating free energy differences that was conceived and developed in the laboratory and that significantly reduces uncertainties. This method, which takes long-range electrostatics into account, was implemented in the laboratory in the academic version of the CHARMM program, which runs on Unix/Linux workstations and massively parallel super-computers. Including the energetic contribution of the structural relaxations due to mutations in the analysis resulted in good convergence properties (~0.4 kcal/mol per calculation) and good agreement with experimental data (~0.4 kcal/mol average difference).
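To give a concrete idea of what a free energy difference calculation involves, the sketch below uses the textbook exponential-averaging (free energy perturbation) estimator, dG = -kT ln < exp(-dU/kT) >, together with a thermodynamic cycle for the effect of a mutation on association. This is an assumption made for illustration only: it is not the method developed in the laboratory, it does not call CHARMM, and the function names and input values are hypothetical.

```python
import math

# Boltzmann constant in kcal/(mol*K)
K_B = 1.987204e-3


def fep_free_energy(delta_u, temperature=300.0):
    """Exponential-averaging estimate of a free energy difference (kcal/mol).

    delta_u: energy differences U_target - U_reference (kcal/mol), evaluated
    on configurations sampled in the reference state.
    """
    if not delta_u:
        raise ValueError("at least one energy difference is required")
    kt = K_B * temperature
    exponents = [-du / kt for du in delta_u]
    # Log-sum-exp keeps the average numerically stable for large |dU|.
    largest = max(exponents)
    total = sum(math.exp(e - largest) for e in exponents)
    log_mean = largest + math.log(total / len(exponents))
    return -kt * log_mean


def relative_association_free_energy(dg_mut_associated, dg_mut_dissociated):
    """Thermodynamic cycle: effect of a mutation on the association energy.

    ddG_assoc = dG_mut(associated state) - dG_mut(dissociated state)
    """
    return dg_mut_associated - dg_mut_dissociated


if __name__ == "__main__":
    # Hypothetical energy differences sampled in each state (kcal/mol).
    dg_assoc = fep_free_energy([1.2, 0.9, 1.5, 1.1])
    dg_free = fep_free_energy([0.4, 0.6, 0.3, 0.5])
    print(relative_association_free_energy(dg_assoc, dg_free))
```

The cycle avoids simulating the association itself: the mutation is transformed computationally in the associated and in the dissociated state, and only the difference between the two transformations is needed.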

Many results of these studies were obtained in preceding years. Several specific points were further developed in 2002:

- The mathematical model used to determine the affinity constants of the complexes from the experimental data was extended to take various possible reactions into account. The relative importance of these reactions was determined experimentally. The robustness of this model was tested on the experimental data.

- The theoretical basis of the free energy calculation method was further developed. Methods used in the literature were evaluated, our method was validated and new application fields were suggested.
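As a simplified illustration of the kind of model that relates an affinity constant to measurable species concentrations, the sketch below treats a single dimer-tetramer equilibrium, 2 D <=> T, with dissociation constant K_d = [D]^2 / [T]. This is an assumption chosen for clarity: the model used in the laboratory takes several possible reactions into account, and the function name and numerical values here are hypothetical.

```python
import math


def dimer_tetramer_equilibrium(total_dimer_equiv, k_d):
    """Equilibrium concentrations ([D], [T]) for 2 D <=> T.

    total_dimer_equiv: total protein concentration in dimer equivalents,
        i.e. [D] + 2*[T]
    k_d: dissociation constant [D]^2 / [T], in the same concentration units
    """
    if total_dimer_equiv <= 0 or k_d <= 0:
        raise ValueError("concentration and K_d must be positive")
    # Mass balance gives 2*[D]^2/K_d + [D] - C = 0; keep the positive root.
    dimer = k_d * (math.sqrt(1.0 + 8.0 * total_dimer_equiv / k_d) - 1.0) / 4.0
    tetramer = dimer * dimer / k_d
    return dimer, tetramer


if __name__ == "__main__":
    dimer, tetramer = dimer_tetramer_equilibrium(10.0, 1.0)
    # Fraction of the protein that is in the tetrameric form.
    print(dimer, tetramer, 2.0 * tetramer / 10.0)
```

Fitting K_d then amounts to adjusting it until the predicted fractions match a signal measured over a range of total concentrations; additional reactions add further mass-balance terms.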

Thus, this research provided the following results:

- A set of precise experimental data against which predictions on protein-protein interactions can be tested.

- Molecular modeling seems to predict the effects of mutations on protein association with a precision comparable to that of experiments.

- Within the limits of the calculations that were performed, the force field and modeling methods used in the CHARMM program give a good representation of the physics of biological macromolecules and thus seem to be relevant to other modeling problems. It is worth pointing out that a series of calculations took about 6 months when this project was conceived and launched, and about 2 months on the most powerful workstation in 2000; it now takes about 3 weeks on a powerful PC, and the same calculations should take 2-3 days on a PC cluster or super-computer (> 1 Tflops). This represents much less time than that necessary for an experimental approach.

Based on the observation that the methods we developed correctly describe interactions between biological macromolecules of known structure, we are now focusing on the identification of the biologically relevant conformations of such systems when the structure is unknown. The application field of such studies covers the prediction of association between proteins, or between proteins and drugs, as well as the unraveling of complex biological mechanisms. The difficulty in such approaches arises from the structural complexity of biological systems, which are composed of thousands of precisely adjusted atoms. We had previously developed a conformational search algorithm. This approach was reused, and we developed tools to evaluate its performance. Enhanced molecular dynamics was used to generate new structures, which were then sorted by grouping/combining. We could thus propose a structure for a complex. This allowed a better understanding of the association and of the effect of mutations, and showed the power of the method.

V- Contribution of physical-chemical methods from the Unit to various collaborations (Alain Chaffotte - Michel Goldberg - Roland Nageotte)

The Unit offers the scientific community, in particular that of the Institut Pasteur, its equipment and expertise in physical-chemical studies of proteins in solution and of their interactions. It contributes to the design of experiments using analytical ultracentrifugation, elutriation, fluorescence spectroscopy, Fourier transform infrared spectroscopy, circular dichroism, and stopped-flow rapid mixing. It performs and interprets the corresponding experiments for the benefit of laboratories on and off campus. Since the middle of this year, some of these activities have been placed in the framework of the technical platform "Biophysics of macromolecules and their interactions" (Patrick England) at the Institut Pasteur.

VI- Teaching and Education

The Unit is in charge of organizing the "Protein Biochemistry" laboratory course (co-Directors: A. Chaffotte and J.M. Betton) of the Institut Pasteur, which is associated with the Master's programs of the Paris 6, Paris 7 and Orsay Universities, the Ecole Normale Supérieure, the Ecole Polytechnique, and the CEA. Members of this Unit contributed to the laboratory sessions of the 2002 course.

Two Master's students and three PhD students were in training in the Unit during the academic year 2000-2001.

Keywords: prion, periplasm, modeling, physical-chemistry, proteomics




Office staff:

LENOIR Lucile (llenoir@pasteur.fr)

Researchers:

BENAROUDJ Nadia, IP, Chargée de Recherche (nbenarou@pasteur.fr)
BETTON Jean-Michel, CNRS, Directeur de Recherche (jmbetton@pasteur.fr)
BLONDEL Arnaud, IP, Assistant (ablondel@pasteur.fr)
CHAFFOTTE Alain-François, IP, Chef de Laboratoire (chaffott@pasteur.fr)
FALANGA Pierre, IP, Chargé de Recherche (pfalanga@pasteur.fr)
GOLDBERG Michel, IP, Professor (goldberg@pasteur.fr)
HONTEBEYRIE-JOSKOWICZ Mireille, IP, Chef de Laboratoire (mhj@pasteur.fr)

Scientific trainees:

DE ALMEIDA Paulo Cezar, Univ. Mogi das Cruzes, Postdoc. (Brazil)
ARIE Jean-Philippe, UP11, PhD student
BITTOUN Patrick, IP, Postdoc. (pbittoun@pasteur.fr)
BODENREIDER Christophe, UP7, PhD student
DE LAS HERAS Sanchez Ana Isabel, Student (Spain)
JARRETT Nicole, Undergraduate (USA)
MIOT Marie-Caroline, UP7, Student
NOIREL Josselin, UP7, Student
PHICHIT Ping, UP7, Student
PLANSON Anne-Gaëlle, UP7, PhD student
TACHIBANA Hideki, Kobe Univ., Professor (Japan)

Other personnel:

NAGEOTTE Roland, IP, Engineer (nageotte@pasteur.fr)
SASSOON-CLAVIER Nathalie, IP, Technician (nsassoon@pasteur.fr)
LENOIR Lucile, IP, Secretary (llenoir@pasteur.fr)
NAVARRO-MARTINEZ Maria, IP, Responsable de Préparation
NGUYEN Huuu-Huan, IP, Agent de Laboratoire
NINO Marguerite, IP, Agent de Laboratoire
TOMMASINO Patrice, IP, Agent de Laboratoire

Activity Reports 2002 - Institut Pasteur
