Plastid-associated genes in dinoflagellates identified by cDNA screening and expression analysis

Project Summary

Dinoflagellates are a distinctive group of flagellate protists that are distantly related to most model organisms, and have many features that are unusual among eukaryotes. They are important members of both marine and freshwater plankton communities, and can be responsible for harmful algal blooms and toxin production. About half of all dinoflagellates are photosynthetic. While there are several types of plastids (chloroplasts) found in dinoflagellates, the majority of photosynthetic dinoflagellates rely on plastids that include the pigment peridinin and are bound by three membranes (peridinin-type plastids). The evolutionary history of the peridinin-type dinoflagellate plastid is poorly understood; it is thought to be a secondary plastid acquired when a dinoflagellate ingested a second eukaryote already equipped with plastids, but relatively little is known about dinoflagellate genetics, and molecular data from the peridinin-type plastid have only recently begun to become available. The project will explore the genetic and genomic properties of dinoflagellates, with the aim of using genomic techniques to study the integration of the peridinin-type plastid into its host cell.

The primary goals of the project are to identify a dinoflagellate suitable as a model for molecular genetic studies, explore the fundamental properties of the nuclear, chloroplast, and mitochondrial genomes in candidate dinoflagellates, and determine the feasibility of genomic studies in these organisms. A suitable model dinoflagellate for study should be photosynthetic and rely on a peridinin-type plastid, should be easy to grow to high cell density in culture, and should be amenable to nucleic acid extraction and standard laboratory procedures. Ideally the organism would be in axenic culture, but few such dinoflagellate cultures are available. Other desirable properties include an easily synchronized cell cycle, stability of the nucleus and organelles in cell fractionation, and environmental and economic importance of the organism.

When one or a small number of 'candidate' dinoflagellates have been identified, genome organization and expression will be studied in these cultures. The size of the genome will be verified by DAPI microspectrofluorometry using a confocal microscope. Because the nuclear genome of dinoflagellates is thought to be very large (as much as 100 times larger than the human genome), it would not be cost-effective to attempt to fully sequence the nuclear genome. Consequently, this study will examine the feasibility of using cDNA approaches to study the genome. There are conflicting data concerning the fraction of polyadenylated mRNA in dinoflagellates, so the fraction of poly-A mRNA will be quantified. These studies will guide the preparation of either poly-A RNA or total RNA from which rRNA and tRNA have been selectively removed. This pool of RNA will be used to generate a normalized cDNA library. The effectiveness of normalization will be estimated by Southern hybridization with excised inserts from a number of randomly picked clones; capture-recapture statistics will be used to estimate the redundancy and complexity of the cDNA library.
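
As an illustration of the capture-recapture idea, the following is a minimal sketch that assumes the simplest two-sample form (the Lincoln-Petersen estimator with Chapman's small-sample correction). The summary does not specify which estimator the project would use, and the function and parameter names are hypothetical. Distinct transcripts found in a first set of randomly picked clones are treated as "marked"; those found again in a second, independent set are "recaptures".

```python
def chapman_estimate(n_first, n_second, n_shared):
    """Chapman-corrected Lincoln-Petersen estimate of library complexity.

    n_first  -- distinct transcripts seen in the first sample of clones
    n_second -- distinct transcripts seen in the second sample of clones
    n_shared -- transcripts seen in both samples ("recaptures")

    Assumes every transcript is equally likely to be picked, which is
    only approximately true even in a well-normalized library.
    """
    if min(n_first, n_second, n_shared) < 0:
        raise ValueError("counts must be non-negative")
    if n_shared > min(n_first, n_second):
        raise ValueError("n_shared cannot exceed either sample size")
    return (n_first + 1) * (n_second + 1) / (n_shared + 1) - 1


if __name__ == "__main__":
    # Illustrative numbers only, not project data.
    print(round(chapman_estimate(50, 50, 4), 1))
```

With 50 distinct transcripts in each sample and 4 shared between them, the estimate is roughly 519 distinct transcripts in the library. Few recaptures imply a complex, low-redundancy library; many recaptures imply the opposite.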

Once a suitably diverse cDNA library has been developed, several hundred randomly picked clones will be sequenced. From these sequences it will be possible to make some initial inferences about the feasibility of making a full survey of cDNA diversity. One key question is why these organisms have such remarkably large genomes; it is not clear that the complexity of the genome matches its large size. While an initial estimate of genome complexity can be obtained by Southern hybridization against the cDNA library or DNA/DNA rehybridization kinetics, sequence data will provide greater specificity, albeit only for expressed genes. To understand the organization of the dinoflagellate genome, it will also be important to determine whether complex gene families are present, as would be expected if the genome is polyploid. In addition, several known dinoflagellate genes are encoded as polygenes. Representative cDNAs will be used to examine the nuclear genome for gene copy number and arrangement.

From these studies will emerge a more detailed view than has previously been available of the size and complexity of the genome of a representative dinoflagellate. Because so few genes are known from dinoflagellates, it is very likely that simply determining the DNA sequence of a substantial number of genes will yield many important sequences. Of particular interest is the interaction between the nuclear and plastid genomes. The library will be used to identify plastid-associated genes (i.e., those whose products are targeted to the plastid, are involved in the maintenance of normal plastid function, or are encoded in the plastid genome). These will be identified from the library by their similarity to cyanobacterial genes, or by similarity to genes thought to be plastid-expressed in other organisms. Genes that are suspected to be plastid-expressed will be fully sequenced by 5′-RACE, asymmetric PCR, or screening of genomic libraries. From these sequences we hope to identify enough transit peptides to derive a consensus transit peptide sequence, which could itself be used to screen for plastid-expressed genes.
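
To make the notion of a consensus transit peptide concrete, the sketch below derives a simple majority-rule consensus from N-terminal peptide sequences that are assumed to be already aligned to equal length. This is an illustration only: real transit peptides vary in length and would first need a proper alignment, the project does not specify how the consensus would be built, and the function name, threshold, and placeholder symbol are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned, min_fraction=0.5, unknown="x"):
    """Return a majority-rule consensus for equal-length aligned peptides.

    A position is reported as `unknown` when no single residue reaches
    `min_fraction` of the sequences at that column.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):
        # Most frequent residue in this alignment column and its count.
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue if count / len(aligned) >= min_fraction else unknown)
    return "".join(consensus)


if __name__ == "__main__":
    # Invented toy peptides, not dinoflagellate data.
    print(majority_consensus(["MKSA", "MRSA", "MKTA"]))
```

A consensus of this kind, or a position-specific profile built from the same column counts, could then serve as a query when screening further cDNA sequences for candidate plastid-targeted products.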