In silico EST-SSR Identification and Development through EST Sequences from Metroxylon sagu Rottb. for Genetic Diversity Analysis


Devit Purwoko
Siti Zulaeha
Teuku Tajuddin
Farida Rosana Mira
Maharani Dewi Solikhah
Gemilang Rahmadara
Nurul Fitri Hanifah


The sago palm (Metroxylon sagu Rottb.) is one of the highest carbohydrate-producing plants in the world. Microsatellites, or simple sequence repeats (SSRs), play an important role in the genome and are used more extensively than other molecular markers. Here, for the first time, we exploit expressed sequence tag (EST) data from sago to identify and characterise SSR markers in this species. The EST data were obtained from the EST database on the National Center for Biotechnology Information (NCBI) website. We obtained 458 kb of data (412 contigs) with maximum and minimum lengths of 1,138 and 124 nucleotides, respectively. We identified 820 perfect SSRs using Phobos 3.3.12 software. By repeat type, the EST-SSRs were dominated by tri-nucleotides at 36% (294), followed by hexa-nucleotides at 24% (202), tetra-nucleotides at 15% (120), penta-nucleotides at 13% (108) and di-nucleotides at 12% (96). The most frequent motifs among the di-, tri- and tetra-nucleotide repeats were AG, AAG and AAAG, respectively. Synteny analysis of the EST sequences with the online application Phytozome showed that the sequences were distributed across the 12 Oryza sativa chromosomes, with sequence similarity between 63% and 100% and e-values between 0 and 0.094. We designed primers from the SSR-containing sequences and obtained 19 primers. We then validated 7 of these primers, all of which generated polymorphic alleles. To our knowledge, this report is the first identification and characterisation of EST-SSRs for sago, and these markers can be used for genetic diversity analysis, marker-assisted selection (MAS), cultivar identification, kinship analysis and genetic mapping.
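The SSR identification step summarised above can be illustrated with a minimal Python sketch. It is not the Phobos algorithm and was not used in this study; it only shows how perfect repeats of 2 to 6 nucleotide motifs can be located in a sequence and tallied by repeat type. The function names and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for illustration, not the settings reported for Phobos 3.3.12.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (illustrative only,
# not the thresholds used in the article).
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}
TYPE_NAMES = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. AGAG, AAA),
            # so that each repeat is reported under its shortest motif.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return hits


def tally_by_type(hits):
    """Count SSRs per repeat type (di, tri, tetra, penta, hexa)."""
    return Counter(TYPE_NAMES[len(motif)] for _, motif, _ in hits)


# Toy sequence containing (AG)7 and (AAG)5; not real sago EST data.
example = "TTGC" + "AG" * 7 + "CCTA" + "AAG" * 5 + "GTC"
ssrs = find_perfect_ssrs(example)
print(ssrs)
print(tally_by_type(ssrs))
```

The sketch reports motifs as they appear in the sequence and does not merge cyclic or reverse-complement equivalents (for example AG, GA, CT and TC), which dedicated tools typically normalise before counting motif frequencies.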


How to Cite
In silico EST-SSR Identification and Development through EST Sequences from Metroxylon sagu Rottb. for Genetic Diversity Analysis. (2024). Tropical Life Sciences Research, 35(1), 13–32.
Original Article

