On September 16, 1997, approximately 83% of the DNA sequence of the Pseudomonas aeruginosa genome was released at http://www.pseudomonas.com under an agreement between the parties funding this endeavor, namely the US Cystic Fibrosis Foundation and PathoGenesis Corp., Seattle. Filling in the remainder of the sequence may take up to 2 years. Pseudomonas aeruginosa is a bacterium of substantial medical importance in nosocomial (hospital) infections and cystic fibrosis, and there is a large and active group of scientists studying it. To improve the quality of analysis of the genomic sequence, and to ensure the development and widespread availability of genetic tools for analyzing gene function, a volunteer committee of scientists from the Pseudomonas research community has been formed to coordinate these efforts. It is proposed to establish a dedicated web site and central facility to promote in-depth analysis of the genome sequences as they become available, publication of these analyses, and follow-up studies to permit assignment of functions to gene sequences.
Overview
On September 17, 1997, approximately 83% of the genomic DNA sequence of Pseudomonas aeruginosa, in 2545 contigs, was released on the Internet at http://www.pseudomonas.com. This represents the progress of a shotgun sequencing effort in the facility of Dr. Maynard Olson, U. Washington, Seattle, with the financial assistance of the U.S. Cystic Fibrosis Foundation and the financial and technical assistance of PathoGenesis Corp., Seattle. The actual sequencing effort involved 15,000,000 nucleotides of sequence (approximately 2.5 x the number of nucleotides in the P. aeruginosa genome). According to Dr. Richard Garber, PathoGenesis Corp., the total of 4,959,694 unique nucleotides represents 83% of the sequence and contains 84% of the previously published DNA and protein sequences currently in the database (i.e. 84% of 900,000 bp of sequence). At a special session on genomics at the Sixth International Pseudomonas meeting in Madrid, Spain, Dr. Garber appealed to the Pseudomonas community for contributions and assistance in improving the identification of open reading frames and creating genetic tools for further analysis. It was felt that the following unique features justified significant effort:
(i) The Pseudomonas aeruginosa chromosome would be by far the largest bacterial genome sequenced to date.
(ii) P. aeruginosa occupies several interesting niches: it is one of the most nutritionally versatile organisms known; it is a major opportunistic pathogen in medicine, being one of the top 3 nosocomial pathogens and causing life-threatening, antibiotic-resistant infections in hospitalized persons (220,000 per year in the USA); it is the most frequent cause of eventually fatal, chronic lung infections in persons with cystic fibrosis; it is also an opportunistic pathogen of farm animals; it is a common environmental organism that can be found in the soil, on rocks and in streams; and it is closely related to several plant pathogens, plant growth-enhancing organisms and agents of bioremediation.
(iii) Pseudomonas aeruginosa PAO1, the strain being sequenced, is a wild-type bacterium (cf. Escherichia coli K-12) that is nutritionally indistinguishable from most isolates and is capable of causing infections in all relevant animal models.
(iv) Pseudomonas aeruginosa is arguably second only to E. coli among bacteria in terms of our understanding of its biology, pathogenesis and metabolism.
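The sequencing figures quoted in this overview can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the three numbers given in the text, and the variable names are our own. It derives the genome size implied by "4,959,694 unique nucleotides represents 83%" and the resulting redundancy of the 15,000,000 nucleotides sequenced, which comes out at roughly 6 million bp and roughly 2.5 x respectively.

```python
# Figures quoted in the overview (variable names are illustrative only).
unique_nucleotides = 4_959_694   # unique bases in the released contigs
fraction_of_genome = 0.83        # stated share of the genome they represent
raw_nucleotides = 15_000_000     # total bases read in the shotgun effort

# Implied genome size: unique bases divided by the fraction they cover.
estimated_genome_size = unique_nucleotides / fraction_of_genome

# Redundancy (fold coverage): raw bases divided by the implied genome size.
fold_coverage = raw_nucleotides / estimated_genome_size

print(f"Implied genome size: {estimated_genome_size:,.0f} bp")
print(f"Fold coverage: {fold_coverage:.2f} x")
```

The implied genome size is an estimate that is only as good as the 83% figure; the true size will not be known until the remaining gaps are closed.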
Formation of a Pseudomonas Genome Committee
For the above reasons, it has been decided to form a volunteer committee to assist in genomic analysis and the development of useful genetic tools. The committee shall be: Dr. Rick Garber, PathoGenesis Corp., Seattle, WA; Dr. Wendy Hufnagle, PathoGenesis Corp., Seattle, WA; Dr. John Mattick, University of Queensland, Brisbane, Australia; Dr. Roger Levesque, Laval University, Quebec City, Canada; Dr. Dennis Ohman, University of Tennessee, Tennessee; Dr. Bob Hancock, University of British Columbia, Vancouver, Canada; Dr. Fiona Brinkman, University of British Columbia, Vancouver, Canada; and Dr. Burkhardt Tummler, Hannover, Germany.
Mandate of the Committee
The mandate of this committee will be: (1) to improve the quality of the sequence annotation through analysis of the full and partial open reading frames identified in the Pseudomonas genome-sequencing project; (2) to promote further functional and genetic analysis of interesting genes identified through the sequencing project; (3) to promote the development of widely available genetic tools for the analysis of unknown genes and genes that are lethal in high copy number, including a mini Tn5 lac fusion library of mutants/transcriptional fusions, an overlapping BAC library in E. coli (currently existing), a BAC vector for P. aeruginosa, an integrative plasmid that does not necessarily lead to gene duplication (for studying genetic dominance and genes that are lethal when overexpressed) and a fully gridded cosmid library; (4) to maintain a publicly accessible web site with information including that mentioned above; (5) to solicit assistance from Pseudomonas aeruginosa scientists with special expertise, particularly in functional analyses of given sets of genes, and to promote the analysis of genomic variation between Pseudomonas aeruginosa strains (so that diversity can be understood) and of sequence conservation in related Pseudomonads; and (6) to provide a forum for discussion of genomic analyses through the aforementioned central facility web site, to provide additional information through an e-mail newsletter, and to encourage publication of Pseudomonas genomics papers through a common forum (possibly the journal Microbiology, which has expressed interest in and previously published such papers).
Solicitation of Statements of Interest
Microbial genes sequenced in large genome projects are commonly annotated according to similarity with a sequence in the database and/or possession of motifs (sequence signatures) that suggest the encoded proteins fit into certain functional classes (e.g. ABC transporters contain an ATP binding motif and transmembrane alpha-helices). However, both the degree of similarity and the range over which it extends are critical for deducing a particular gene's function. For example, one P. aeruginosa gene has been found to be similar to a magnesium cobalt transport protein gene in E. coli and so would commonly be annotated as a "putative magnesium cobalt transport protein gene". However, closer inspection reveals that the similarity between the two genes is primarily in the region that encodes transmembrane alpha-helices, which likely anchor the E. coli protein in the cytoplasmic membrane (this protein has been shown to be a cytoplasmic membrane associated protein). Therefore, the P. aeruginosa gene may be better annotated as a putative cytoplasmic membrane associated protein. It is also imperative that genes be annotated according to their similarity with genes, or regions of genes, whose function has been studied experimentally. Current automated sequence analysis and annotation methods are not sophisticated enough to deal with this properly, and there is a need for better bioinformatics to handle annotation, as well as for more experimental data.
Thus, we are seeking assistance with more detailed manual analysis of the genomic sequence of P. aeruginosa. Two types of assistance are sought: genomic analysis and subsequent genetic analysis. Genomic analysis will involve: (i) analysis of selected subsets of genes, looking for homologs, from both Pseudomonas aeruginosa and other bacteria, of known genes in various pathways or cellular compartments; (ii) detailed motif searching of proteins to identify general functional classes; (iii) analysis of potential transcriptional regulation by identification of known regulator binding motifs (fur box, anr box, lux box, pho box, etc.); (iv) analysis of the genetic context of these genes to determine whether operon organization gives hints of the involvement of a given gene product in a pathway, etc. Genetic analysis would involve the use of genetic tools to confirm the above analyses and/or derive functional information (e.g. studying mutants, analysis of conditional expression, etc.). The classes of genes to be examined, and the volunteers who have already offered to help, are listed with the committee members for the Pseudomonas aeruginosa Genome Analysis Committee.
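The point made above about the magnesium cobalt transport protein, that where a similarity lies matters as much as how strong it is, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the function names, the residue coordinates, and the 0.5 threshold are hypothetical and do not describe any method used by the project. It measures how much of an aligned span on an experimentally characterized protein falls inside its transmembrane helices, and falls back to the more conservative annotation when most of the similarity lies there.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive residue coordinates


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of positions shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def fraction_in_regions(alignment: Interval, regions: List[Interval]) -> float:
    """Fraction of the aligned span lying inside the given regions.

    Assumes the regions do not overlap one another.
    """
    aligned_len = alignment[1] - alignment[0] + 1
    covered = sum(overlap_length(alignment, r) for r in regions)
    return covered / aligned_len


def suggest_annotation(alignment: Interval,
                       tm_regions: List[Interval],
                       specific_name: str,
                       threshold: float = 0.5) -> str:
    """Suggest a label for the query gene product.

    If at least `threshold` of the alignment falls within transmembrane
    helices, only membrane association is supported, so the generic label
    is returned. The threshold is an arbitrary, illustrative cut-off.
    """
    if fraction_in_regions(alignment, tm_regions) >= threshold:
        return "putative cytoplasmic membrane associated protein"
    return f"putative {specific_name}"


# Hypothetical transmembrane helices on the characterized protein.
tm_helices = [(25, 45), (60, 80), (95, 115)]

# Similarity confined to residues 20-120: mostly helix, so generic label.
print(suggest_annotation((20, 120), tm_helices,
                         "magnesium cobalt transport protein"))

# Similarity spanning residues 1-300: mostly outside helices, specific label.
print(suggest_annotation((1, 300), tm_helices,
                         "magnesium cobalt transport protein"))
```

A check of this kind only flags candidates for manual inspection; it does not replace comparison against regions whose function has been studied experimentally.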
Pseudomonas aeruginosa Community Annotation Project
Last updated: December 4, 1998