A new Rosetta Stone: comparison of the sequences of related genomes
In the last year of its activity, the Regulation of Gene Expression Unit began to develop comparative studies of two pairs of genomes, each consisting of the genome of a reference bacterium and that of a bacterium that is an interesting model of disease. This Unit, which was dedicated to establishing the genomic sequence of Bacillus subtilis and managing the associated knowledge, has now completed its activity. It served as the basis for the creation of a new research unit, the Genetics of Bacterial Genomes Unit, the activity of which is based on techniques for the large-scale exploration of genomes. The creation of this new unit is closely associated with that of the HKU-Pasteur Research Center at the University of Hong Kong (China).
Two model microorganisms were used: Escherichia coli, the oldest model used by geneticists and the best-known bacterium, and Bacillus subtilis, a bacterium often found on the surface of leaves and in the soil, which is the source of numerous enzymes used in industry. These bacteria were compared, respectively, with Photorhabdus luminescens, which kills insects and propagates via a nematode, and Bacillus cereus, a sporulating bacterium implicated in many diseases, one of the closest relatives of which is the dangerous agent of anthrax. The study was based on the identification of the key genes involved in the overall adaptation of the bacterium to its environment, and on the analysis of their function.
The H-NS protein and the response to metabolic transitions (Philippe Bertin)
The capacity to adjust the availability and activity of cell proteins, particularly those involved in regulation and metabolic enzymes, is essential if cells are to grow and to adapt to changes in the environment. The H-NS protein is involved in many cell functions and affects the expression of genes regulated by environmental factors (temperature, pH, osmolarity, etc.). It modifies the expression of numerous genes, affects recombination and transposition, and is involved in bacterial virulence, particularly via the control of motility.
Examination of wild-type and mutant strains by electron microscopy has shown that the loss of motility in the mutant results from a lack of flagella. The synthesis of flagella requires the expression of a large number of genes, organized in a hierarchical manner. The overexpression of the flhDC genes, the master operon in flagellum biosynthesis, in the hns mutant restores motility, suggesting that these genes are the preferential target of H-NS. Previous work has shown that the synthesis or the accumulation of several proteins is specifically affected by the hns mutation. We have therefore tried, in collaboration with C. Laurent-Winter (Génopôle, Pasteur Institute), to identify and characterize the proteins regulated in this way. Several of these proteins have been identified by microsequencing or by superimposition of 2D electrophoresis reference maps of E. coli. Some correspond to known targets of H-NS (GadA, OmpF, OmpC, ProX). Other proteins, the synthesis or accumulation of which is affected in the hns mutant, are involved in mechanisms as diverse as construction of the cell wall, transport processes, carbohydrate metabolism and bacterial responses to diverse environmental conditions. In particular, the synthesis of several heat-shock proteins, including three chaperones (DnaK, GroEL and GrpE), is induced in the hns mutant.
A complementary study, in collaboration with J.P. Le Caer (Laboratory of Neurobiology and Cell Biology, headed by J. Rossier, ESPCI Paris), made it possible to identify other proteins, the synthesis or accumulation of which is specifically affected by the hns mutation. This work involved an important technical advance: the staining of 2D gels with silver nitrate. This staining technique is 100 times more sensitive than the one generally used (Coomassie blue) and is also compatible with the identification of proteins by mass spectrometry. Comparison of the electrophoretic profiles of the wild-type and mutant strains has shown that the level of accumulation of more than 60 proteins, almost half of which have been identified, is affected in an hns mutant.
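The comparison of electrophoretic profiles described above can be pictured as a fold-change filter over spot intensities. The following is only a minimal sketch in Python, not the procedure actually used by the Unit: the function name affected_spots, the spot labels, the intensity values and the two-fold threshold are all assumptions introduced for illustration.

```python
import math


def affected_spots(wild_type, mutant, min_fold=2.0):
    """Return {spot: log2(mutant / wild_type)} for spots whose intensity
    changes by at least `min_fold` in either direction.

    `wild_type` and `mutant` map hypothetical spot labels to intensities.
    """
    threshold = math.log2(min_fold)
    affected = {}
    for spot, wt_intensity in wild_type.items():
        mut_intensity = mutant.get(spot)
        # Skip spots missing from the mutant gel or with non-positive intensity,
        # since a ratio cannot be computed for them.
        if mut_intensity is None or wt_intensity <= 0 or mut_intensity <= 0:
            continue
        log_ratio = math.log2(mut_intensity / wt_intensity)
        if abs(log_ratio) >= threshold:
            affected[spot] = log_ratio
    return affected


# Made-up intensities in arbitrary units (illustration only, not measured data).
wild_type = {"spot_A": 100.0, "spot_B": 50.0, "spot_C": 80.0}
mutant = {"spot_A": 400.0, "spot_B": 55.0, "spot_C": 20.0}

result = affected_spots(wild_type, mutant)
for spot, log_ratio in sorted(result.items()):
    direction = "increased" if log_ratio > 0 else "decreased"
    print(f"{spot}: {direction} in the mutant (log2 ratio {log_ratio:+.2f})")
```

Working with log ratios treats increases and decreases symmetrically, so a protein induced four-fold and one repressed four-fold are flagged by the same threshold. A real analysis would also need gel-to-gel normalization and replicate measurements, which this sketch leaves out.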
Using the sequences of the corresponding genomes, we analyzed the transcription products (the transcriptome) in parallel with all the proteins of the bacterium, separated by two-dimensional electrophoresis (the proteome).
These very powerful methods revealed large groups of genes (the H-NS regulon comprises more than 250 genes) regulated in the same manner in the cell. We also carried out complementary phylogenetic (evolution of the protein) and structural analyses of several of the proteins identified in various organisms.
Construction of sulfur-containing amino acids (Isabelle Martin-Verstraete)
As described below, in silico studies have demonstrated that sulfur metabolism genes play a particular, organizational role. This led to the initiation of a research program on sulfur metabolism and the metabolism of polyamines, compounds essential for cell construction. Very little is known about the biosynthesis pathways of sulfur-containing amino acids (methionine and cysteine) in gram-positive bacteria. Similarly, little is known about the pathways of sulfur assimilation and of metabolic regulation in these bacteria. The work of the Unit initially focused on the characterization of genes involved in the pathway of methionine and cysteine biosynthesis in Bacillus subtilis. In addition to the detailed characterization of numerous genes, this study provided evidence to support the hypothesis that enzyme activity gradually becomes more specific during the molecular evolution of metabolic pathways, the primary activity of the enzyme generally being fairly non-specific. As in the study of H-NS, we carried out a complementary analysis of the protein synthesis profile (proteome analysis by two-dimensional electrophoresis) and of the transcription profile by membrane hybridization (transcriptome analysis). These complementary analyses demonstrated the essential role of sulfur metabolism (justifying the choice of this subject as central to the genomic approach) and identified a number of groups of co-regulated genes. We were thus able to identify several genes of unknown function that seem to be involved in sulfur metabolism, including several encoding transporters. The work of the Unit was also dedicated to the identification of regulatory genes essential for sulfur metabolism. Several of these genes have been characterized. They differ markedly from the known genes of gram-negative bacteria such as Escherichia coli, which makes this study particularly interesting.
All of this work is part of the BACELL program of detailed functional analysis of the genome of Bacillus subtilis funded by the European Union.
The degradation of aromatic compounds, iron and oxygen (Francis Biville)
The catabolism of aromatic compounds plays a key role in many organisms. It is particularly interesting because it generally uses oxygen (often in the form of gaseous oxygen) and is thus directly involved in control of the intracellular concentration of this extremely reactive gas. Several years ago, we characterized an oxygenase, the regulation of which seemed to be important as it was associated with the expression of a large number of genes (in particular, via the HcaR activator). The peptide sequence of this enzyme is remarkable in that it is very similar to that of an enzyme essential for apoptosis in animal cells. Using computational, genetic, physiological and global (proteome, transcriptome) methods of analysis, the Unit has identified targets of two structurally similar regulators (HcaR and YhaJ), the synthesis of which increases during entry into stationary phase. Inactivation of the hcaR gene has a significant effect on the level of expression of seven genes involved in the response to oxygen. This effect is accompanied by a large decrease in resistance to certain reactive derivatives of oxygen during stationary phase. The product of the hcaR gene is also a regulator involved in the control of the initial stages of 3-phenylpropionic acid catabolism. We are currently studying the effect of hcaR inactivation on the perturbation of metabolism during entry into stationary phase. Studies with yhaJ, encoding another regulator, showed that the overexpression of this gene leads to a change in the process of cell division. Identification of the targets of this regulator that are involved in generating the observed phenotype is underway. A third, previously unknown regulator of the LysR type (ygiP) has been identified as a positive regulator of genes encoding a protein with an activity necessary for growth on glycerol and in anaerobic conditions.
Comparative transcriptome analysis led to the hypothesis that this activator modulates the expression of genes involved in motility and its regulation, iron transport and acid stress. Physiological experiments have shown that the inactivation of ygiP leads to much more rapid biofilm formation, thereby confirming the transcriptome analysis results relating to the action of this regulator in flagellum biosynthesis. This work will help us to identify the targets of other unknown activators of the LysR family, to determine their mode of action and to define their DNA targets more accurately.
Annotation, information management and the analysis of results in silico (Ivan Moszer, Eduardo Rocha, Paris; Claudine Médigue, Evry)
The very large amount of data generated by the sequencing of genomes and the enigmatic nature of many of the genes discovered during this process have necessitated research based on the techniques and concepts of computing, statistics and mathematics, to complement experimental studies involving molecular genetics techniques. A specialized Bacillus subtilis database, which has served as the model for several others, has been constructed (available at: http://genolist.pasteur.fr/SubtiList/ ). This database is currently being improved, in terms of both its design and its data, by the results of transcriptome analysis carried out within the European BACELL Network. In addition, a genome annotation platform, Imagene, has been used to establish a general model for the identification of errors in genome sequences, giving rise to a major revision of the sequence and annotations of the B. subtilis genome. This project has been carried out by a consortium (Geno*) combining the Unit, an INRIA laboratory and two companies, Hybrigenics and GENOME Express. The most biological and naturalist aspect of this research aims to identify the link between the major functions of the living organism and cellular architecture. Genome analysis has shown that the order of genes on the chromosome is not random, but is instead related to the architecture of the cell. The study of sulfur metabolism has shown that the selection pressure leading to this unexpected link is probably the diffusion of reactive molecules, gases and free radicals. These reactions are due to the products of numerous genes, the function of which is currently unknown. This raises the question of whether cellular compartmentalization is at the origin of a characteristic of genomes that caused much surprise at its discovery: the large number of genes of unknown function.