SESSION V - SIGA WORKING GROUPS

Proceedings of the XLVII Italian Society of Agricultural Genetics - SIGA Annual Congress

Verona, Italy - 24-27 September 2003

ISBN 88-900622-4-X

Poster Abstract - 5.56

STRUCTURE, ORGANIZATION AND SEQUENCE COMPOSITION OF THE GRAPE (VITIS VINIFERA L.) GENOME

G. FAES*^,**, M. MOROLDO*^,**, F. CATTONARO*, A. ZUCCOLO*, P. FONTANA**, R. VELASCO**, M. MORGANTE*

*) Dipartimento di Produzione Vegetale e Tecnologie Agrarie, Università degli Studi di Udine, via delle Scienze 208, 33100 Udine

**) Istituto Agrario San Michele all’Adige, via Mach 1, 38010 San Michele TN

grapevine, retroelements, genome structure

One of the fastest ways to gain insight into the structure, organization and sequence composition of a eukaryotic genome is to sequence a library of randomly sheared genomic DNA. To characterize the genome of grape, V. vinifera (1C = 485 Mbp), two libraries were constructed from the cv. Pinot Noir. Genomic fragments in the size range of 1200-1500 bp were produced by nebulizing genomic DNA and were then ligated into pCR-Script. The ligation mix was first used to transform E. coli strain DH10B (mcrA, mcrB, mcrC, mrr), which is deficient in the restriction systems for methylated DNA, to produce a random genomic library; inserts from about 2500 clones were selected and sequenced from both ends. The same ligation mix was then used to transform E. coli strain DH5α (McrA, McrB, McrC, Mrr). The restriction systems for methylated DNA are intact in this strain, so clones carrying methylated inserts are less likely to survive the cloning process. About 500 clones from this second library (the methyl-filtered library) were sequenced from both ends.

Sequences were trimmed and both BLASTN and BLASTX searches were performed. A total of 2 Mbp of sequence was obtained from the random genomic library, corresponding to about 0.5% of the grape genome. This should provide a representative sample of the genome and contain most of the abundant repetitive elements present within it. A sequence was classified as a known element if it had a BLAST E-value below 10^-5. Clusters of related sequences were identified by assembling the sequences with the Arachne genome assembler (Whitehead Institute, MA). Further analyses are currently being performed against a proprietary EST database clustered with the 83,000 grape ESTs present in the TIGR database (www.tigr.org/dtb/tgi/vvgi). In the first library about 40% of the sequences were classified, predominantly as genes (16%) and retroelements (11.5%). About 10% consisted of chloroplast DNA (cpDNA) sequences and 2% was classified as mitochondrial. The second library showed a higher percentage of classified sequences, about 53%. Not surprisingly, more than 28% of the sequences were genes, and chloroplast clones accounted for another 16%. Retroelements were less abundant than in the first library (3% versus 11.5%), while mitochondrial clones increased to 5.5%.
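The arithmetic behind these figures can be checked with a short sketch. The script below is only an illustration, not part of the original analysis: it takes the genome size, the amount of sequence and the class percentages as reported above (treating "about 10%" and "more than 28%" as 10 and 28, which is an assumption), and computes the sampled fraction of the genome and the ratio of each class between the two libraries. All function and variable names are hypothetical.

```python
# Values copied from the abstract; approximate figures are rounded (assumption).
GENOME_SIZE_MBP = 485.0   # 1C genome size of V. vinifera as stated in the text
SEQUENCED_MBP = 2.0       # total sequence from the random genomic library

# Percentage of sequences assigned to each class in the two libraries.
random_library = {
    "genes": 16.0,
    "retroelements": 11.5,
    "chloroplast": 10.0,     # "about 10%"
    "mitochondrial": 2.0,
}
filtered_library = {
    "genes": 28.0,           # "more than 28%"
    "retroelements": 3.0,
    "chloroplast": 16.0,
    "mitochondrial": 5.5,
}


def sampled_fraction_percent(sequenced_mbp, genome_mbp):
    """Percentage of the genome represented by the sequenced sample."""
    return 100.0 * sequenced_mbp / genome_mbp


def fold_change(filtered, random):
    """Ratio filtered/random per class; values below 1 indicate depletion."""
    return {name: filtered[name] / random[name] for name in random}


coverage = sampled_fraction_percent(SEQUENCED_MBP, GENOME_SIZE_MBP)
changes = fold_change(filtered_library, random_library)

print(f"sampled fraction of genome: {coverage:.2f}%")
for name, ratio in changes.items():
    print(f"{name}: {ratio:.2f}x in the methyl-filtered library")
```

With these inputs the sampled fraction comes out at roughly 0.4%, consistent with the rounded "about 0.5%" above, and retroelements show a ratio well below 1 while genes show a ratio above 1.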

The comparison between the libraries shows that repeated sequences such as retrotransposons are largely excluded from genomic shotgun libraries when methylation-restrictive E. coli strains are used, indicating that they must be heavily methylated. In contrast, genes are much better represented in the filtered library than in the random one, probably because they are hypomethylated. Other basic properties of the grape genome have been estimated from the sequences, such as G+C content, dinucleotide frequencies (including that of CpG) and microsatellite frequencies. We will also discuss the distribution of specific repeats within the genome, as assessed by their hybridisation to BAC clones, and among species of the genus, as assessed by hybridisation with labelled total genomic DNA.
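As an illustration of the sequence statistics mentioned above, the following sketch computes G+C content, dinucleotide frequencies and a CpG observed/expected ratio for a single sequence. It is not the authors' pipeline: the function names are hypothetical, the toy sequence is invented for the example, and the observed/expected formula (CpG count times sequence length, divided by the product of C and G counts) is one commonly used convention assumed here.

```python
from collections import Counter

BASES = set("ACGT")


def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


def dinucleotide_frequencies(seq):
    """Relative frequency of each overlapping dinucleotide, skipping ambiguous bases."""
    seq = seq.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(pair for pair in pairs if set(pair) <= BASES)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def cpg_observed_expected(seq):
    """CpG observed/expected ratio: (#CG * length) / (#C * #G). Assumed convention."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


toy = "ACGTCGAT"  # invented example sequence, not grape data
print(gc_content(toy))
print(dinucleotide_frequencies(toy).get("CG", 0.0))
print(cpg_observed_expected(toy))
```

Applied to each read in a library, statistics of this kind could then be averaged per library to compare the random and methyl-filtered samples.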