Biology IT Center

The Biology IT Center installs, maintains and interfaces software for biologists, and provides access to a selection of international biological databanks. We design scientific databases that make laboratory data available online. We are also developing a workflow system with interfaces to biological software, guided by feedback from researchers.

Software for biology

Researchers have access to a growing pool of 230 packages comprising about 1350 programs. New applications are added on request when biologists need to use new biological software. Installing, testing, updating and maintaining this software is the main part of our activity. In parallel, we have built 280 standard Web interfaces that give researchers user-friendly access to the locally installed applications.

Development

Local development and deployment services are available to researchers who need informatics skills to develop and publish their projects, or to present their results using databases and Web tools. We have recently introduced a process for studying and writing the specifications of each new project.

The Mobyle project

This project aims to provide a user-friendly portal that brings together all our bioinformatics analysis software. The portal is available both on campus and externally, and is comparable to the services that the NCBI, the EBI or the SIB provide to researchers worldwide.

In 2009-2011, Mobyle was extended through two collaborations:
  • MobyleNet: a network of Mobyle bioinformatics portals involving 9 French bioinformatics platforms.
  • BMID: a NIAID/NIH collaboration to create a common user interface for publishing tools on the Web and improving their usability for scientists.

DNA, Protein, Structural and Genomic databanks

Biological databanks are updated frequently. Local copies are refreshed from the original databanks and made available to every researcher. The large data volume is the most difficult constraint to manage. The storage space used for the various supported formats (NCBI Blast2, Fasta, Golden, GCG, etc.) amounts to 4.7 TB. Today, 40 banks are available, including Embl, GenBank, RefSeq, Uniprot, Nt, Nrprot, Genpept and PDB.

Databases

A tool, “Bibliolist”, was developed to curate, update or insert bibliography in GenoList. Publications are linked automatically, based on the gene names or gene descriptions of the organism being updated. The results must be validated by the annotator before they are exported to GenoList.
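The link-then-validate flow described above can be illustrated with a minimal sketch. It is not the Bibliolist implementation: the `CandidateLink` structure, the function names and the whole-word matching rule are all assumptions made for illustration, and matching on gene descriptions is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class CandidateLink:
    """Hypothetical record of a proposed gene-to-publication link."""
    gene: str
    publication_id: str
    validated: bool = False  # set to True by the annotator after review


def find_candidate_links(gene_names, publications):
    """Propose a link wherever a gene name occurs as a whole word.

    `publications` maps a publication id to its title or abstract text.
    Matching is case-insensitive (an assumption), so false positives are
    expected and left to the annotator to reject.
    """
    links = []
    for gene in gene_names:
        # Lookarounds stop "dnaA" from matching inside "dnaAB".
        pattern = re.compile(
            r"(?<!\w)" + re.escape(gene) + r"(?!\w)", re.IGNORECASE
        )
        for pub_id, text in publications.items():
            if pattern.search(text):
                links.append(CandidateLink(gene, pub_id))
    return links


def exportable(links):
    """Return only the links that the annotator has validated."""
    return [link for link in links if link.validated]
```

Keeping the `validated` flag on each candidate separates the automatic step from the manual one: nothing is eligible for export until an annotator has reviewed it.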

Management Software for Biological Resource Center of Institut Pasteur (CRBIP)

We maintain the Web application for viewing the CRBIP catalog and orders. We also maintain ARPAS, the software for managing the collections of biological resources. This system is used by 28 collections of organisms on the campus.
To increase availability, maintainability and interoperability, we have started a new project to develop second-generation management software called BRC-LIMS. This project, sponsored by IBiSA, is under development with an external provider. We manage the BRC-LIMS project for the CRBIP and 10 other French Biological Resource Centers of microorganisms, in order to replace the ARPAS software and build a common catalog application: FBRCMi-catalog. The first release is scheduled for May 2012.

Bioinformatics training courses

A bioinformatics module is taught as part of the “Genome Analysis” course of the Master 2 program.

Contribution to “High-Throughput Sequencing”

We are in charge of the installation, development and scripting of the specific software required for NGS analysis. Our group has set up the pipeline so that every researcher on the campus can use it. CPU-bound applications have been parallelized on the cluster, dividing the elapsed time by 30.
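The kind of parallelization mentioned above can be sketched as a scatter/gather over sequence records. This is only an illustration under stated assumptions: the source does not say how the work is actually split or dispatched. Here local worker processes stand in for cluster nodes, and `count_gc` is a hypothetical placeholder for a real CPU-bound analysis step.

```python
from concurrent.futures import ProcessPoolExecutor


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, parts = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_records(records, n_chunks):
    """Scatter: deal records round-robin into at most n_chunks chunks."""
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    return [chunk for chunk in chunks if chunk]  # drop empty chunks


def count_gc(chunk):
    """Hypothetical CPU-bound stand-in: count G and C bases per sequence."""
    return [
        (header, sum(seq.upper().count(base) for base in "GC"))
        for header, seq in chunk
    ]


def run_parallel(records, n_chunks=30, max_workers=4):
    """Scatter the records, analyse each chunk in a worker, gather results.

    Results come back grouped by chunk, not in the original input order.
    """
    chunks = split_records(records, n_chunks)
    results = []
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(count_gc, chunks):
            results.extend(partial)
    return results
```

A script would call `run_parallel(parse_fasta(text))` from under an `if __name__ == "__main__":` guard, which is required on platforms that start worker processes by spawning. The achievable speed-up depends on how evenly the chunks divide the work and on the number of workers available.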
The GMOD Genome Browser, GBrowse, has been installed on the Genopole server to handle and display NGS data produced at the Institut Pasteur. Our users rely on this tool to confirm, discover or reannotate features in genomes that were previously sequenced and annotated. It is also used to focus on specific genome regions in order to identify genes of interest for developing new weapons and solutions against bacteria and viruses.
The cluster is currently being rescaled to meet the requirements of NGS computing, which combines CPU-bound, memory-intensive and I/O-intensive applications.