Introduction to Genomics

Genomics
The study of genomes, particularly the set of techniques, analytical methods, and scientific questions special to the study of complete genomes.
Genetics
The study of heredity and inherited variation.

Why study complete genomes?

Intra-genomic studies

Analyses of gene families and superfamilies

    Protein kinases

    ATP-binding cassette (ABC) transporters

    Multidrug-resistant transport proteins

Physical map of chromosomes

    Isochores

    Regions with distinctive base composition

    Associated with patterns of gene expression & recombination
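
A minimal sketch of how regions with distinctive base composition might be located: compute the G+C fraction in windows along a sequence and look for stretches that differ from their neighbors. The function names, the toy sequence, and the window and step sizes are illustrative assumptions. Real isochore analyses use far larger windows on chromosome-scale sequence.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Yield (start, gc_fraction) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Toy sequence (made up for illustration): an AT-rich stretch
# followed by a GC-rich stretch.
toy = "ATATATTAAT" + "GCGCGGCCGC"
profile = list(windowed_gc(toy, window=10, step=10))
print(profile)
```

A sharp change in the windowed G+C fraction, as between the two halves of the toy sequence, is the kind of signal that marks a boundary between compositionally distinct regions.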

Error-prone sequences

    EXAMPLE

Identification of uncharacterized genes

    Roughly 2000 open reading frames (about half) in the yeast genome have no homolog of known biochemical function

    URFs -- Unidentified open Reading Frames

    Comparable numbers of URFs have been found in other genomes

    Possible explanations for URFs

    Rapid sequence evolution makes identity of sequence obscure

    Improved sequence matching algorithms can help reduce this category

    Gene encodes a genuinely novel product

    Need to do follow-up biochemical work

    But the availability of the sequence provides a powerful tool for identifying gene function.
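
To make the notion of an open reading frame concrete, here is a simplified sketch that scans the three forward reading frames of a DNA string for stretches running from ATG to the first in-frame stop codon. It assumes the standard genetic code, ignores the reverse strand and introns, and uses a hypothetical `min_codons` cutoff. It is not how any particular genome project called its ORFs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the 0-based index of the A in ATG, end is the index just
    past the stop codon, and min_codons counts the codons from ATG up
    to (but not including) the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence (made up): ATG AAA TTT TAG sits in frame 2.
print(find_orfs("CCATGAAATTTTAGGG"))
```

An ORF found this way becomes a URF when a database search with its translated sequence turns up no homolog of known function.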

Identification of non-protein information

Comparative studies

Identify core set of genes for all organisms

Identify contents of ancestral genome
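
Once the genes of each genome have been grouped into families, a candidate core set can be sketched as the intersection of the families found in every genome. The organism and family labels below are placeholders invented for illustration, and the hard part in practice (deciding which genes belong to the same family) is assumed to be already done.

```python
def core_genes(genomes):
    """Return the gene families present in every genome.

    genomes maps an organism name to an iterable of family labels.
    """
    families = [set(f) for f in genomes.values()]
    return set.intersection(*families) if families else set()


# Placeholder data, not real annotations.
toy_genomes = {
    "organism_1": {"famA", "famB", "famC"},
    "organism_2": {"famA", "famC", "famD"},
    "organism_3": {"famA", "famC"},
}
core = core_genes(toy_genomes)
print(sorted(core))
```

A strict intersection is sensitive to the problem raised in the next question: if two organisms use unrelated genes for the same function, that function drops out of the computed core even though every organism performs it.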

Is there any real 'model organism'? If so, what is it?

Do all organisms use the same gene for the same purpose?

    The answer here is clearly "no".

    Example:

    When the genome of the archaeon Methanococcus jannaschii was sequenced, four of the 20 aminoacyl-tRNA synthetases could not be found.

    These enzymes are critical in preparing the tRNAs for protein synthesis.

    Biochemical evidence indicated that glutamine and asparagine are incorporated as transamidated derivatives of glutamate and aspartate, so the absence of these tRNA synthetases was not surprising.

    A comparable story was postulated for cysteine, hypothesizing that cysteine-tRNA is produced by trans-sulfuration of serine-tRNA.

    But the absence of lysyl-tRNA synthetase was unexplained.

    Subsequently the lysyl-tRNA synthetase was identified -- it is dissimilar to the other tRNA synthetases, apparently a new class.

    Not the product of rapid evolution.

    The bacterium Borrelia burgdorferi was subsequently found to have the same kind of lysyl-tRNA synthetase as that found in M. jannaschii.

    Likely explanations:

    Parallel evolution

    Horizontal gene transfer & substitution

What is the relative importance of horizontal gene transfer in evolution?

    It is now clear that horizontal gene transfer has played a significant role in evolution

    But the full measure of its importance is still a subject for active study

Other questions answerable by comparative genomics


Doolittle, R.F. 1998. Microbial genomes opened up. Nature 392:339-342.

Clayton, R.A., O. White, K.A. Ketchum, and J. C. Venter. 1997. The first genome of the third domain of life. Nature 387:459-462.

List of complete genomes and genomes in progress: http://www.tigr.org/tdb/mdb/mdb.html

Database of genome sizes: http://www.cbs.dtu.dk/databases/DOGS/index.html
