C. albicans first generation DNA Arrays

 

Introduction

Candida albicans genome sequence data from GenBank and from the Stanford Genome Technology Center (Assembly 3) were used to make gene arrays. Using these data, 3313 putative ORFs were identified, of which 2016 were selected for array construction: 364 GenBank entries, 1020 S. cerevisiae homologues and 632 hypothetical ORFs. A 3'-fragment of 300-400 bp from each ORF was amplified by polymerase chain reaction (PCR), yielding 2002 products that were arrayed on nylon membranes (Fig. 1). The array covers about one third of the approximately 6000 protein-coding genes expected in C. albicans.
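The counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the published numbers; the variable names are illustrative and not part of the original work.

```python
# Counts quoted in the introduction.
selected_by_class = {
    "GenBank entries": 364,
    "S. cerevisiae homologues": 1020,
    "hypothetical ORFs": 632,
}
arrayed_products = 2002   # PCR products spotted on the membranes
expected_genes = 6000     # expected number of protein-coding genes

selected_total = sum(selected_by_class.values())
not_arrayed = selected_total - arrayed_products   # selected ORFs without an arrayed product
coverage = arrayed_products / expected_genes      # fraction of expected genes on the array

print(selected_total, not_arrayed, f"{coverage:.1%}")  # prints: 2016 14 33.4%
```

The three classes add up to the 2016 selected ORFs, 14 of which are not represented among the 2002 arrayed products, and the array covers about 33% of the expected gene complement.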

 

A table is provided with information on all the probes present on these arrays. It gives the location of each probe on the arrays, the location of the identified ORF in the contigs of Assembly 3 of the C. albicans genome or in the GenBank entries, gene names, functional assignments, and links to the C. albicans genomic database CandidaDB, to the Stanford Genome Technology Center database for Assembly 6 of the C. albicans genome, and to GenBank.

 

These arrays were first described in:

Murad AMA, d'Enfert C, Gaillardin C, Tournu H, Tekaia F, Talibi D, Maréchal D, Marchais V, Cottin J, Brown AJ (2001) Transcript profiling in Candida albicans reveals new cellular functions for the transcriptional repressors, CaTup1, CaMig1 and CaNrg1. Mol. Microbiol. 42:981-993.

Sequence data for Candida albicans was obtained from the Stanford Genome Technology Center website. Sequencing of Candida albicans was accomplished with the support of the NIDR and the Burroughs Wellcome Fund.

 

            To download the HDF.xls table, click here

            To get more information on the gene annotation procedure and the nomenclature, click here

            To get more information on the arrays, click here

            To download the transcriptdata.xls file of Murad et al., click here

 

 

Back to the top of the page
Back to RIF home page

 

C. albicans gene annotation

A total of 6213 protein sequences predicted from the S. cerevisiae genome sequence were downloaded from MIPS (25 April 1995). C. albicans sequences were obtained from two sources: 380 non-redundant entries of C. albicans ORFs were retrieved from GenBank (9 July 1999), and Assembly 3 was obtained from the Stanford C. albicans sequencing project (22 April 1999). Assembly 3 contained 1919 contigs of at least 2 kb, covering 12 301 999 bp, or around 80% of the C. albicans genome, assuming a haploid genome size of 15.5 Mb.

The largest contigs (Contig3-3189 to Contig3-3718), representing 6 992 176 bp, were annotated using two approaches. First, ORFs were identified using the graphical analysis tool orffinder. Second, each contig was searched for segments matching known C. albicans or S. cerevisiae genes: Assembly 3 was compared with C. albicans entries using blastn, and with S. cerevisiae ORFs using blastx with the alternative yeast nuclear code (Ohama et al., 1993). All blast searches were launched and displayed automatically using the scripts blastallgenomes and readblast (Tekaia et al., 2000), which produced a working annotation table listing, for each match, the ORF name, its length, the percentage identity, similarity and gaps, and the positions of the beginning and end of the match in the contig and in the ORF. paintblast was used to compare the orffinder and blast outputs, which facilitated the identification of possible frameshifts or introns.

Several classes of C. albicans ORFs were annotated: known C. albicans genes, homologues of S. cerevisiae genes, and additional ORFs larger than 150 codons. Overlapping ORFs on the same strand were assumed to result from sequencing errors and were considered a single ORF. When two ORFs overlapped on opposite strands, the longer one was retained. 3'-truncated ORFs lying at the ends of contigs were discarded because the 3'-ends of ORFs were to be arrayed. A total of 3313 putative C. albicans ORFs were identified, and 2016 of these were selected for the construction of gene arrays.
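The three curation rules described above (merge overlapping ORFs on the same strand, keep the longer of two ORFs overlapping on opposite strands, discard 3'-truncated ORFs) can be sketched as follows. This is a minimal illustration, not the scripts used for the annotation: the `Orf` record, the `complete_3prime` flag, the function names and the order in which the rules are applied are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Orf:
    start: int             # leftmost contig coordinate (0-based, inclusive)
    end: int               # rightmost contig coordinate (exclusive)
    strand: str            # "+" or "-"
    complete_3prime: bool  # False if the ORF runs off the contig at its 3'-end

    @property
    def length(self) -> int:
        return self.end - self.start

def overlaps(a: Orf, b: Orf) -> bool:
    return a.start < b.end and b.start < a.end

def combine(a: Orf, b: Orf) -> Orf:
    # The merged ORF inherits its 3'-status from whichever ORF carries the 3'-end:
    # the rightmost end on "+", the leftmost start on "-".
    if a.strand == "+":
        donor = a if a.end >= b.end else b
    else:
        donor = a if a.start <= b.start else b
    return Orf(min(a.start, b.start), max(a.end, b.end), a.strand, donor.complete_3prime)

def merge_same_strand(orfs):
    merged = []
    for strand in "+-":
        current = None
        for orf in sorted((o for o in orfs if o.strand == strand), key=lambda o: o.start):
            if current is not None and overlaps(current, orf):
                current = combine(current, orf)
            else:
                if current is not None:
                    merged.append(current)
                current = orf
        if current is not None:
            merged.append(current)
    return merged

def keep_longest_across_strands(orfs):
    kept = []
    # Visit ORFs from longest to shortest; drop any that clash with a kept ORF
    # on the opposite strand.
    for orf in sorted(orfs, key=lambda o: o.length, reverse=True):
        if not any(k.strand != orf.strand and overlaps(k, orf) for k in kept):
            kept.append(orf)
    return sorted(kept, key=lambda o: o.start)

def curate(orfs):
    resolved = keep_longest_across_strands(merge_same_strand(orfs))
    return [o for o in resolved if o.complete_3prime]

example = [
    Orf(100, 700, "+", True),
    Orf(650, 1300, "+", True),    # overlaps the previous ORF on the same strand
    Orf(1200, 1500, "-", True),   # shorter ORF overlapping on the opposite strand
    Orf(2000, 2900, "-", True),
    Orf(4400, 5000, "+", False),  # 3'-truncated at the contig end
]
for orf in curate(example):
    print(orf)
```

With this made-up input, the first two ORFs are merged into one spanning 100-1300, the short opposite-strand ORF and the truncated ORF are dropped, and two ORFs remain. The 150-codon threshold for additional ORFs is not applied here.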

 

In the accompanying table, spot numbers are linked to an ORF name of the following type A.x.y where:  

 

Different annotations are shown for each ORF:

Type:             GenBank if the ORF was extracted from GenBank

                        ho if the ORF was extracted from a contig but did not match an S. cerevisiae ORF in our annotation process

                        sc if the ORF was extracted from a contig and matched an entire S. cerevisiae ORF in our annotation process

                        scN if the ORF was extracted from a contig and matched the COOH terminus of an S. cerevisiae ORF in our annotation process

SChomol: S. cerevisiae homologue identified by comparison of the contig to an S. cerevisiae ORF database using blastx

YPD link: HTML link to the annotation of the corresponding S. cerevisiae gene at Proteome Inc.

Stanford: Corresponding ORF in the Stanford Contig6 release of October 2000, as deduced from a blastn analysis of the GCCA database vs the Contig6 database

Stan link: HTML link to the annotation of the corresponding C. albicans ORF at Stanford

CandidaDB: Accession number of the corresponding ORF in the C. albicans genomic database CandidaDB. Annotation in CandidaDB was performed by the European Galar Fungail Consortium using Assembly 6 of the C. albicans genome, available from the Stanford DNA Sequencing and Genome Technology Center.

Gene: Gene name as available in CandidaDB

Function: Function as available in CandidaDB

 


 

C. albicans array construction 
PCR products were arrayed using published procedures (Richmond et al., 1999). PCR primers 18-22 bases in length were designed using primer3 software to amplify a 3'-region of 300-400 bp from each ORF. These oligonucleotides were synthesized with a 5'-tag (5'-oligonucleotide, 5'-CGACGCCCGCTGATA; 3'-oligonucleotide, 5'-GTCCGGGAGCCATC) to facilitate subsequent re-amplification of the PCR products. Using these oligonucleotides, the ORFs were PCR-amplified from the C. albicans SC5314 genome. The purity and length of all PCR products were checked by agarose gel electrophoresis. A total of 2002 PCR products that satisfied our quality controls were spotted in duplicate onto nylon membranes using a BioGrid System (BioRobotics) (Fig. 1). Candida albicans genomic DNA, E. coli ORFs and S. cerevisiae ORFs were included on the membranes as controls.
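As a rough illustration of the primer design constraints described above, the sketch below prepends the two 5'-tags quoted in the text to a pair of gene-specific primers and computes the coordinates of a 3'-terminal amplification window. The function names, the example primer sequences and the default window size are hypothetical, and whether the amplified region includes the stop codon is not stated in the text.

```python
# 5'-tags as quoted in the text above.
FORWARD_TAG = "CGACGCCCGCTGATA"  # added to the 5'-oligonucleotide
REVERSE_TAG = "GTCCGGGAGCCATC"   # added to the 3'-oligonucleotide

def tagged_primer_pair(forward_specific: str, reverse_specific: str) -> tuple[str, str]:
    """Prepend the re-amplification tags to a pair of gene-specific primers."""
    pair = []
    for tag, primer in ((FORWARD_TAG, forward_specific), (REVERSE_TAG, reverse_specific)):
        primer = primer.upper()
        if not 18 <= len(primer) <= 22:
            raise ValueError(f"gene-specific primer must be 18-22 bases, got {len(primer)}")
        if set(primer) - set("ACGT"):
            raise ValueError(f"unexpected characters in primer {primer!r}")
        pair.append(tag + primer)
    return pair[0], pair[1]

def three_prime_window(orf_length: int, amplicon_length: int = 350) -> tuple[int, int]:
    """Return 0-based, half-open ORF coordinates of the 3'-terminal region to amplify."""
    if not 300 <= amplicon_length <= 400:
        raise ValueError("amplicon length must be 300-400 bp")
    # ORFs shorter than the requested amplicon are covered in full.
    return max(0, orf_length - amplicon_length), orf_length

# Hypothetical 20-base gene-specific primers, for illustration only.
forward, reverse = tagged_primer_pair("ATGGCTAGCTAGGATCCAAG", "TTACGATCGGCTAAGCTTGA")
print(forward)
print(reverse)
print(three_prime_window(1200))
```

Because every product carries the same pair of tags, a single pair of tag-specific primers is enough to re-amplify any spot on the array, which is the purpose the text gives for adding them.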

A schematic representation of the array is shown below.

The sequences of the PCR primers for each ORF are available upon request to Christophe d'Enfert.

 

 

 

 


 

 

Publications

 

Murad AMA, d'Enfert C, Gaillardin C, Tournu H, Tekaia F, Talibi D, Maréchal D, Marchais V, Cottin J, Brown AJ (2001) Transcript profiling in Candida albicans reveals new cellular functions for the transcriptional repressors, CaTup1, CaMig1 and CaNrg1. Mol. Microbiol. 42:981-993.

 

Murad AMA, Leng P, Straffon M, Wishart J, Macaskill S, MacCallum D, Schnell N, Talibi D, Maréchal D, Tekaia F, d'Enfert C, Gaillardin C, Odds FC, Brown AJP (2001) NRG1 represses yeast-hypha morphogenesis and hypha-specific gene expression in Candida albicans. EMBO J. 20:4742-4752.

 

Fradin C, Kretschmar M, Nichterlein T, Gaillardin C, d'Enfert C, Hube B (2002) Stage-specific gene expression of Candida albicans in human blood. Mol. Microbiol. 47:1523-1543.

 

 
