Amplification and Sequencing of Rosaceae Expressed Sequence Tags (ESTs) as a Resource for Functional Genomics Databases
Abstract
The Rosaceae genome is small (about 300 Mb), roughly twice the size of the Arabidopsis genome (Baird et al., 1994). Members of the family are characterized by a relatively short juvenile period (2-3 years) and by extensive genetic and genomic resources such as molecular marker maps, interesting mutants and clone libraries (Georgi et al., 2002). In addition, it has been demonstrated that molecular marker tools developed in peach are easily applied to other species in the family (Joobeur et al., 1998; Zhebentyayeva et al., 2003). Peach (Prunus persica) is therefore being developed as a model organism for the Rosaceae, an economically important family that includes fruit and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. To demonstrate the utility of an integrated and fully annotated database and its analysis tools, Jung et al. (2004) described a case study in which Rosaceae sequences were anchored to the peach physical and genetic maps by sequence similarity.
Several marker maps of Prunus fruit crops have been published. Three of them, based on peach (Rajapakse et al., 1995), almond x peach (Foolad et al., 1995) and almond (Viruel et al., 1995) progenies, were constructed mainly with RFLP markers. Joobeur et al. (1998) found that the Texas x Earlygold map has a level of saturation similar to these maps; it therefore covers most of the Prunus genome and has sufficient marker density for use in plant breeding. However, its total length (491 cM) is clearly shorter than that of the potato (684 cM), tomato (1276 cM) and rice (1491 cM) maps. This difference may be due to the small nuclear DNA content of the Prunus genome, which is about two and four times smaller than the rice and tomato genomes, respectively (Arumuganathan and Earle, 1992).
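The map lengths quoted above can be put on a common scale with a little arithmetic. The sketch below uses only the centimorgan totals given in the text; the dictionary keys and the helper name `relative_length` are illustrative choices, not part of any cited work.

```python
# Total genetic map lengths in centimorgans (cM), as quoted in the text.
map_lengths_cm = {
    "Prunus": 491,
    "potato": 684,
    "tomato": 1276,
    "rice": 1491,
}

prunus_cm = map_lengths_cm["Prunus"]


def relative_length(other_cm, reference_cm=prunus_cm):
    """Return how many times longer a map is than the reference map."""
    return other_cm / reference_cm


# Ratio of each map length to the Prunus map length, rounded to two decimals.
ratios = {
    species: round(relative_length(cm), 2)
    for species, cm in map_lengths_cm.items()
}

for species, ratio in ratios.items():
    print(f"{species}: {ratio} x the Prunus map length")
```

On these figures the potato, tomato and rice maps are roughly 1.4, 2.6 and 3.0 times as long as the Prunus map.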
Expressed sequence tags (ESTs) are an established functional genomics resource in plant molecular biology. They are produced by sequencing the transcriptome (the transcribed portion of the genome). ESTs have been analyzed in many plant species, e.g., Arabidopsis (Spiegelman et al., 2000), grape (Scott et al., 2000), Pinus radiata and Pinus taeda (Cato et al., 2001), sugar beet (Schneider et al., 2002), rice (Jin et al., 2003), Ginkgo biloba (Brenner et al., 2005) and tomato (Labate and Baldo, 2005). Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position on the peach physical map where applicable. The importance of high-quality fruit and the intrinsic difficulties of breeding a perennial species require the development and application of structural and functional genomic databases for the sustained improvement of rosaceous fruit crops. Identifying and characterizing the genes that control important traits, and tagging them with molecular markers, facilitates the introgression of important characters and speeds the development of new breeding material that combines the best traits formerly found in separate varieties (Abbott et al., 2006).
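One of the annotation types mentioned above, simple sequence repeats (SSRs), can be illustrated with a short regular-expression scan. This is a minimal sketch only: the function name `find_ssrs`, the thresholds (units of 2-4 bases, at least five tandem copies) and the example sequence are assumptions made for illustration, and they do not reproduce the criteria used by GDR or the ESTree db.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=4, min_repeats=5):
    """Return (start, unit, repeats) for each perfect SSR in a DNA string.

    The default thresholds are illustrative, not taken from any database.
    """
    sequence = sequence.upper()
    # A unit of min_unit..max_unit bases followed by at least
    # (min_repeats - 1) further tandem copies of the same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip mononucleotide runs such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeats))
    return hits


# Hypothetical EST fragment containing an (AG)7 and a (CAG)5 repeat.
est = "GGCT" + "AG" * 7 + "CTTACT" + "CAG" * 5 + "TT"
print(find_ssrs(est))  # [(4, 'AG', 7), (24, 'CAG', 5)]
```

The scan reports only perfect repeats; compound or interrupted SSRs would need a more permissive search.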
The ESTree db (Lazzari et al., 2004, 2005, 2007, 2008) is an expressed sequence tag (EST) database developed by the Italian ESTree Interuniversitary Centre as a platform for easy integration and retrieval of genomics and functional genomics data. Its sequence analysis is based on a semi-automated Perl pipeline that populates the tables of a MySQL database as it runs, and the database can be queried through a PHP-based web interface. The ESTree db and the GDR (Genome Database for Rosaceae) are the only existing online resources dedicated to peach EST analysis, and together they form the most complete such resource. The two databases are very similar in number of entries (71,540 peach sequences in the ESTree db, 70,939 in the GDR), but differ considerably in the information they hold and in how it is retrieved. The ESTree db clustering procedure produced a dataset of 27,097 unigenes, 4,303 of which were derived from libraries prepared in-house by the ESTree group (Lazzari et al., 2008).
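The general idea of a pipeline that writes annotated ESTs into database tables and is then queried can be sketched as follows. This is not the ESTree db implementation: Python and an in-memory SQLite database stand in for the Perl pipeline and MySQL server, and the table layout, column names and records are all hypothetical.

```python
import sqlite3

# Hypothetical annotation records: (EST id, unigene, putative function, has SSR).
# All identifiers and values are invented for illustration.
records = [
    ("EST0001", "contig_1", "putative kinase", 1),
    ("EST0002", "contig_1", "putative kinase", 0),
    ("EST0003", "singleton_1", "unknown function", 0),
]


def build_database(rows):
    """Create an in-memory table and load the annotation rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE est ("
        "est_id TEXT PRIMARY KEY, unigene TEXT, "
        "putative_function TEXT, has_ssr INTEGER)"
    )
    conn.executemany("INSERT INTO est VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def count_unigenes(conn):
    """Count distinct unigenes (contigs plus singletons)."""
    return conn.execute("SELECT COUNT(DISTINCT unigene) FROM est").fetchone()[0]


def ests_with_ssr(conn):
    """Return the ids of ESTs flagged as containing an SSR."""
    query = "SELECT est_id FROM est WHERE has_ssr = 1 ORDER BY est_id"
    return [row[0] for row in conn.execute(query)]


conn = build_database(records)
print(count_unigenes(conn))
print(ests_with_ssr(conn))
```

Keeping the annotation in relational tables is what allows a web front end to answer such questions with simple queries rather than by re-running the analysis.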
The aim of this study was to amplify, isolate and sequence ESTs from peach, almond and their F1 progeny as functional genomic resources for the GDR and ESTree databases.
References
Abbott, A. G., T. Zhebentyayeva, L. Georgi, L. Garay, R. Horn, S. Jung, D. Main, J. D. Lalli, V. Decroocq, M. L. Badenes, W. V. Baird and G. L. Reighard (2006). The Rosaceae genome database: A tool for improving apricot genetics and agriculture. ISHS Acta Horticulturae 717: XIII International Symposium on Apricot Breeding and Culture.
Arumuganathan, K. and E. D. Earle (1992). Nuclear DNA content of some important plant species. Plant Mol. Biol. Rep., 9: 208-218.
Baird, W. V., A. S. Estager and J. Wells (1994). Estimating nuclear DNA content in peach and related diploid species using laser flow cytometry and DNA hybridization. J. Amer. Soc. Hort. Sci., 119: 1312-1316.
Brenner, E. D., M. S. Katari, D. W. Stevenson, S. A. Rudd, A. W. Douglas, W. N. Moss, R. W. Twigg, S. J. Runko, G. M. Stellari, W. R. McCombie and G. M. Coruzzi (2005). EST analysis in Ginkgo biloba: an assessment of conserved developmental regulators and gymnosperm specific genes. BMC Genomics, 15: 143.
Cato, S. A., R. C. Gardner, J. Kent and T. E. Richardson (2001). A rapid PCR-based method for genetically mapping ESTs. Theor. Appl. Genet., 102: 296-306.
Foolad, M. R., S. Arulsekar, V. Becerra and F. A. Bliss (1995). A genetic map of Prunus based on an interspecific cross between peach and almond. Theor. Appl. Genet., 91: 262-269.
Georgi, L., Y. Wang, D. Yvergniaux, T. Ormsbee, M. Inigo, G. Reighard and G. Abbott (2002). Construction of a BAC library and its application to the identification of simple sequence repeats in peach [Prunus persica (L.) Batsch]. Theor. Appl. Genet., 105: 1151-1158.
Jin, Q., D. Waters, G. M. Cordeiro, R. J. Henry and R. F. Reinke (2003). A single nucleotide polymorphism (SNP) marker linked to the fragrance gene in rice (Oryza sativa L.). Plant Science, 165: 359-364.
Joobeur, T., M. A. Viruel, M. C. de Vicente, B. Jauregui, J. Ballester, M. T. Dettori, I. Verde, M. J. Truco, R. Messeguer, I. Batlle, R. Quarta, E. Dirlewanger and P. Arus (1998). Construction of a saturated linkage map for Prunus using an almond x peach F2 progeny. Theor. Appl. Genet., 97: 1034-1041.
Jung, S., C. Jesudurai, M. Staton, Z. Du, S. Ficklin, I. Cho, A. Abbott, J. Tomkins and D. Main (2004). GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research. BMC Bioinformatics, 9: 130.
Labate, J. A. and A. M. Baldo (2005). Tomato SNP discovery by EST mining and resequencing. Molecular Breeding, 16: 343-349.
Lazzari, B., A. Caprera, C. Cosentino, A. Stella, L. Milanesi and A. Viotti (2007). ESTuber db: an online database for Tuber borchii EST sequences. BMC Bioinformatics, 8: 13.
Lazzari, B., A. Caprera, A. Vecchietti, I. Merelli, F. Barale, L. Milanesi, A. Stella and C. Pozzi (2008). Version VI of the ESTree db: an improved tool for peach transcriptome analysis. BMC Bioinformatics, 9: S9. doi: 10.1186/1471-2105-9-S2-S9.
Lazzari, B., A. Caprera, A. Vecchietti, A. Stella, L. Milanesi and C. Pozzi (2005). ESTree db: a Tool for Peach Functional Genomics. Italian Society of Bioinformatics (BITS): Annual Meeting.
Lazzari, B., A. Caprera, L. Milanesi, A. Stella, F. Bianchi, A. Vecchietti, C. Cosentino, A. Viotti and C. Pozzi (2004). ESTree DB and ESTuber DB: a fully automated procedure for EST sequence analysis and database management. Proceedings of the XLVIII Italian Soc of Agric Genet. SIFV-SIGA Joint Meeting: Lecce.
Rajapakse, S., L. E. Belthoff, G. He, A. E. Estager, R. Scorza, I. Verde, R. E. Ballard, W. V. Baird, A. Callahan, R. Monet and A. G. Abbott (1995). Genetic mapping in peach using morphological, RFLP and RAPD markers. Theor. Appl. Genet., 90: 503-510.
Schneider, K., R. Schäfer-Pregl, D. C. Borchardt and F. Salamini (2002). Mapping QTLs for sucrose content, yield and quality in a sugar beet population fingerprinted by EST-related markers. Theor. Appl. Genet., 104: 1107-1113.
Scott, K. D., P. Eggler, G. Seaton, M. Rossetto, E. M. Ablett, L. S. Lee and R. J. Henry (2000). Analysis of SSRs derived from grape ESTs. Theor. Appl. Genet., 100: 723-726.
Spiegelman, J. I., M. N. Mindrinos, C. Fankhauser, D. Richards, J. Lutes, J. Chory and P. J. Oefner (2000). Cloning of the Arabidopsis RSF1 Gene by Using a Mapping Strategy Based on High-Density DNA Arrays and Denaturing High-Performance Liquid Chromatography. The Plant Cell, 12: 2485-2498.
Viruel, M. A., R. Messeguer, M. C. de Vicente, J. Garcia-Mas, P. Puigdomenech, F. J. Vargas and P. Arus (1995). A linkage map with RFLP and isozyme markers for almond. Theor. Appl. Genet., 91: 964-971.
Zhebentyayeva, N., L. Reighard, M. Gorina and G. Abbott (2003). Simple sequence repeat (SSR) analysis for assessment of genetic variability in apricot germplasm. Theor. Appl. Genet., 106: 435-44.