Genome structure variation in Glycine species

Jungmin Ha, Purdue University

Abstract

The genus Glycine includes soybean (Glycine max), one of the most important crops, due in part to its capacity for nitrogen fixation through symbiosis with soil-borne microorganisms. However, the narrow genetic diversity of elite cultivars leaves the crop vulnerable to constantly evolving pressures such as disease and other biotic and abiotic stresses. Therefore, genomic tools for wild Glycine species have been developed to provide an infrastructure for finding useful genetic variation and moving it into cultivated soybean. This genomic infrastructure includes genetic maps from crosses between G. max and G. soja (the undomesticated ancestor of soybean), whole-genome sequence, and BAC-based physical maps. Current sequencing technology limits our ability to use sequence information to describe and understand structural variation between genomes, because sequence reads are short and long-range mate-pair sequencing is difficult. Once a reference genome has been sequenced, the genomes of relatives can be assembled relatively easily at low sequence coverage; however, the reads produced by current sequencing technology are often too short to resolve large-scale structural variation when resequencing against a reference genome.
BAC-based FingerPrinted Contig (FPC) physical maps were constructed and integrated with the draft sequence of G. max as a framework to begin to explore structural variation within and between G. max and G. soja. The G. max FPC map covers up to 95% of the soybean draft genome sequence (Gmax1.01) and incorporates 4,628 genetic markers that were used to align the physical maps with genetic linkage maps. A minimum set of BAC clones covering as much as possible of the draft sequence was selected to provide a minimum tiling path of clones that can be used for further investigations and gene cloning. A BAC-based FPC map of G. soja was also constructed and aligned to the G. max genome sequence map using BAC end sequence (BES) information. To detect chromosomal variation between G. max and G. soja, the G. soja FPC contigs that aligned to two or more G. max chromosomes were chosen as candidate contigs spanning potential chromosomal rearrangements such as reciprocal translocations. Other molecular techniques, such as BAC sequencing and fluorescence in situ hybridization (FISH), will be used to confirm these computationally predicted rearrangements. Two to four BAC clones covering candidate breakpoints will be sequenced per contig to test our pipeline for identifying chromosomal rearrangements between the two species. Since more than 50% of the soybean genome consists of repetitive sequences, the candidate BAC clones will be sequenced with enough coverage to be assembled de novo, and the BAC sequences will then be aligned to the draft sequence of G. max to identify chromosomal variation.
Centromeres play an evolutionarily conserved role in chromosome movement at meiosis and mitosis. Paradoxically, even though they play an essential role, the DNA sequences that underlie centromeric loci are not conserved even within a single genus. Since in most higher eukaryotic organisms centromeres consist of the most frequent tandem repeats in the genome, the tandem repetitive sequences common in each genome were identified, using a combination of computational approaches, in seven wild Glycine species: Glycine canescens, Glycine cyrtoloba, Glycine falcata, Glycine stenophita, Glycine syndetika, Glycine tomentella D3 and Glycine tomentella T2. Centromeric repeats were identified and tested cytogenetically in G. falcata and G. canescens. We built and verified the pipeline to identify candidate centromeric repeats, and characterized the evolutionary processes of centromeres in the seven wild perennial Glycine species.
This study sheds light on chromosome structural variation among Glycine species and provides a framework for comparative genomics, gene cloning and evolutionary analyses of legume genomes.
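
The contig-selection step described in the abstract (G. soja FPC contigs whose BAC end sequences align to two or more G. max chromosomes) can be illustrated with a minimal sketch. This is not the dissertation's pipeline: the input layout, the function name, the contig, clone and chromosome labels, and the `min_hits_per_chrom` support threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def candidate_rearrangement_contigs(bes_hits, min_hits_per_chrom=2):
    """Return contigs whose BES hits land on two or more chromosomes.

    bes_hits: iterable of (contig_id, clone_id, chromosome) tuples, one per
        BAC end sequence placed on the reference (hypothetical layout).
    min_hits_per_chrom: assumed support threshold; a chromosome only counts
        for a contig if at least this many BES hits fall on it, which filters
        out isolated hits that may come from repetitive sequence.
    """
    # Count BES hits per chromosome for every contig.
    counts = defaultdict(lambda: defaultdict(int))
    for contig, _clone, chrom in bes_hits:
        counts[contig][chrom] += 1

    candidates = {}
    for contig, per_chrom in counts.items():
        supported = sorted(c for c, n in per_chrom.items()
                           if n >= min_hits_per_chrom)
        # A contig supported on 2+ chromosomes may span a rearrangement.
        if len(supported) >= 2:
            candidates[contig] = supported
    return candidates


# Hypothetical example data: ctg1 is well supported on two chromosomes,
# ctg2 has only a single stray hit on a second chromosome.
example_hits = [
    ("ctg1", "b1", "Gm01"), ("ctg1", "b2", "Gm01"),
    ("ctg1", "b3", "Gm13"), ("ctg1", "b4", "Gm13"),
    ("ctg2", "b5", "Gm05"), ("ctg2", "b6", "Gm05"),
    ("ctg2", "b7", "Gm08"),
]
example_candidates = candidate_rearrangement_contigs(example_hits)
```

With the example data, only `ctg1` is reported as a candidate. Contigs flagged this way are predictions only; as the abstract notes, they would still need confirmation by BAC sequencing or FISH.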

Degree

Ph.D.

Advisors

Stuart, Purdue University.

Subject Area

Agronomy|Evolution and Development|Plant sciences|Bioinformatics
