P1018 ConvolutionTR: Comprehensive Study of SNPs Within Tandem Repeats

Marcelo G. Narciso, Embrapa Rice and Beans, Goiás, GO, Brazil
Michel E Beleza Yamagishi, Embrapa Agriculture Informatics, Campinas, SP, Brazil
Paula R. Kuser-Falcao, Embrapa Agriculture Informatics, Campinas, SP, Brazil
Tandem repeats (TRs) are DNA sequences in which the same pattern repeats consecutively. We have developed ConvolutionTR, a fast algorithm for de novo, genome-wide TR discovery. The algorithm uses the well-known mathematical convolution operator to identify TRs in a sequence. Notably, ConvolutionTR finds all TRs in a given sequence, whereas other popular software does not. We have further extended its implementation to identify SNPs and INDELs within tandem repeats. In this work, we report for the first time a comprehensive study of SNPs within the tandem repeats of a model organism. Our results show that, owing to their widespread occurrence, SNPs within TRs deserve attention, even though some of them may be explained by simple base-calling errors.
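The abstract does not describe how ConvolutionTR applies the convolution operator, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a repeat period can be recognized as a shift k at which the sequence matches itself strongly; that self-match count is the autocorrelation of the per-base indicator vectors, which is a convolution of each vector with its own reverse. The function names `match_counts` and `tandem_runs`, and the `min_copies` parameter, are hypothetical and chosen for illustration.

```python
def match_counts(seq, max_period):
    """For each shift k, count positions i with seq[i] == seq[i + k].

    This count equals the sum, over the bases, of the autocorrelation of
    each base's 0/1 indicator vector (a convolution of the vector with its
    reverse). It is computed directly here for clarity; an FFT-based
    convolution would give the same numbers faster on long sequences.
    """
    n = len(seq)
    counts = {}
    for k in range(1, min(max_period, n - 1) + 1):
        counts[k] = sum(1 for i in range(n - k) if seq[i] == seq[i + k])
    return counts


def tandem_runs(seq, period, min_copies=2):
    """Return half-open (start, end) spans where seq repeats with `period`.

    A maximal run of positions i with seq[i] == seq[i + period] starting at
    `run_start` and stopping before position i covers seq[run_start:i + period].
    """
    runs = []
    run_start = None
    n = len(seq)
    # The final iteration (i == n - period) acts as a sentinel that closes
    # any run still open at the end of the sequence.
    for i in range(n - period + 1):
        is_match = i < n - period and seq[i] == seq[i + period]
        if is_match:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            span_end = i + period
            if span_end - run_start >= min_copies * period:
                runs.append((run_start, span_end))
            run_start = None
    return runs


if __name__ == "__main__":
    seq = "GGACACACACTT"  # toy sequence, not real data
    counts = match_counts(seq, max_period=3)
    best = max(counts, key=counts.get)  # shift with the most self-matches
    for start, end in tandem_runs(seq, best):
        print(best, start, end, seq[start:end])
```

In this sketch a SNP inside a repeat would appear as an isolated mismatch that splits one long run into two adjacent runs of the same period, which is one plausible way such variants could be flagged; whether ConvolutionTR does this is not stated in the abstract.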