Estrada, Richard [1], Saldaña, Carla L. [1], Salazar, Wilian [1], Contreras-Liza, Sergio [2], Guerrero-Abad, Juan C. [3], Vásquez, Héctor V. [1], Maicelo, Jorge L. [1], Arbizu, Carlos I. [1].

Identification of variants in whole genome sequencing data of a Peruvian landrace of Capsicum chinense, arnaucho chile pepper.

C. chinense is considered one of the species with the most domesticated varieties in the Americas. The greatest diversity of this chili pepper species is observed in the South American Amazon basin. In Peru, C. chinense shows the greatest morphological variability in fruit size, shape, and color. However, it remains uncharacterized at the molecular level. Here we report genome-wide variants for a Peruvian landrace of C. chinense, “arnaucho chile pepper”. We sampled this landrace in the Supe Valley, Peru. High-molecular-weight DNA was isolated from fresh leaves using a modified CTAB protocol. The DNA sample was sequenced using a TruSeq DNA Nano library on the Illumina NovaSeq 6000 platform. We generated 77 Gb of whole-genome shotgun sequence comprising 1,062 million raw paired-end reads of 150 bp. To estimate the genome size, raw paired-end (PE) reads were trimmed to remove leading and trailing low-quality regions and any TruSeq index or universal adapter sequences. A 17-mer distribution was generated, and the genome size of C. chinense was estimated at 2.3 Gbp, with the main peak at a k-mer depth of 38. The arnaucho chile pepper genome shows a heterozygosity of 0.0829% and a GC content of 34.86%. The trimmed reads were mapped to the reference genome (ASM227189v2) using default settings, with 97.3% of the total reads mapped. Duplicate alignments were marked, and 248,470,838 reads were excluded. The processed alignment file was used as input for variant calling, and variants were filtered by quality (Q > 30). We obtained 6,128,822 variants: 5,039,077 SNPs, 655,784 variants, 127,594 indels, with a SNP transition/transversion ratio of 2.04. The estimated genome size represents 74.19% of the reference genome.
The genomic data provided here are expected to contribute to a better understanding of the genetics of this species and of molecular pathways that could be important for its biology and management, and to support its conservation and genomics-assisted breeding.
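
The k-mer-based genome size estimate described above follows a simple relation: the total number of k-mers counted in the reads divided by the depth of the main peak of the k-mer distribution. The Python sketch below illustrates that relation on a toy sequence. It is not the pipeline used in this work: the function names, the `min_depth` cutoff for discarding likely error k-mers, and the toy reads are all assumptions made for illustration, and dedicated k-mer counters would be needed for real sequencing data.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer of length k across the given reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as (total k-mers) / (depth of the main peak).

    min_depth is a hypothetical cutoff: k-mers seen fewer times than this
    are treated as likely sequencing errors and ignored.
    """
    # Histogram: depth -> number of distinct k-mers observed at that depth
    histogram = Counter(kmer_counts.values())
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    # Main peak: the depth shared by the largest number of distinct k-mers
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences among the retained depths
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


# Toy data (assumption): one 20-base sequence "sequenced" four times
toy_reads = ["ACGTTGCAAGCTTAGGCATC"] * 4
toy_counts = count_kmers(toy_reads, k=5)
size, peak = estimate_genome_size(toy_counts)
print(f"main peak depth: {peak}, estimated size: {size:.0f} k-mer positions")
```

In the toy case every 5-mer is unique and covered four times, so the estimate recovers the number of k-mer positions in the sequence. With real reads the distribution is broader, and heterozygosity and repeats add secondary peaks that this sketch does not model.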

1 - Instituto Nacional de Innovación Agraria, Centro Experimental La Molina, Dirección de Desarrollo Tecnológico Agrario, Av. La Molina 1981, La Molina, Lima, Lima, 15024, Peru
2 - Universidad Nacional José Faustino Sánchez Carrión, Departamento de Agronomía, Av. Mercedes Indacochea 609, Huacho, Lima, 15136, Peru
3 - Instituto Nacional de Innovación Agraria, Centro Experimental La Molina, Dirección de Recursos Genéticos y Biotecnología, Av. La Molina 1981, La Molina, Lima, Lima, 15024, Peru


Presentation Type: Poster
Number: PPG006
Abstract ID: 710
Candidate for Awards: None

