P1007 Development of a Computational Pipeline to Detect Positive Selection

Leandro C. Cintra, Brazilian Agricultural Research Corporation, Campinas-SP, Brazil
Jorge A. Hongo, Unicamp - University of Campinas
Francisco P. Lobo, EMBRAPA - Brazilian Agricultural Research Corporation, Campinas, Brazil
Genes in which new mutations are likely to confer a selective advantage tend to evolve rapidly and are said to be evolving under positive selection rather than under neutral or purifying selection. Classic examples are genes involved in immunity and defense (in host genomes) or in pathogen-related processes (in parasite genomes), presumably because new mutations help these organisms in the evolutionary arms race of the host-parasite interaction. The search for positive selection can therefore detect genes directly related to an organism's lifestyle and could be used as a tool for gene prospecting. In this study we integrated several third-party programs with code written by our team to develop a pipeline that detects positive selection in groups of homologous genes. The pipeline starts from multi-FASTA files of predicted genes from distinct genomes. With these data, it 1) translates the genes into protein sequences; 2) infers groups of homologous genes using OrthoMCL; 3) aligns the proteins of each group using MUSCLE; 4) aligns the codons using the protein alignment as a guide; 5) generates phylogenetic trees using PHYLIP; and 6) passes the codon alignments and phylogenetic trees to PAML to detect positive selection. All programs used in the pipeline are controlled through a single main configuration file. The pipeline is fully implemented and is currently being tested by reanalyzing data from previous studies that identified genes under positive selection in host and parasite genomes.
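The codon-alignment step can be illustrated with a minimal Python sketch. It is not the pipeline's own code: the function name `thread_codons` and the `-` gap character are assumptions made for illustration, and the sketch assumes each coding sequence is ungapped, in frame, and has no trailing stop codon. Each residue in the gapped protein row consumes one codon from the coding sequence, and each protein gap becomes a three-nucleotide gap.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Build one codon-alignment row from a gapped protein row and its ungapped CDS.

    Hypothetical helper for illustration only; assumes the CDS is in frame
    and contains exactly one codon per residue (no trailing stop codon).
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length does not match the number of residues")

    # Split the coding sequence into consecutive codons.
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))

    # A protein gap becomes a three-nucleotide gap; a residue takes the next codon.
    return "".join(gap * 3 if aa == gap else next(codons) for aa in aligned_protein)


# Two rows of a toy protein alignment and their coding sequences.
rows = {"seqA": ("M-K", "ATGAAA"), "seqB": ("MGK", "ATGGGTAAA")}
codon_alignment = {name: thread_codons(prot, cds) for name, (prot, cds) in rows.items()}
```

Because gaps are inserted in multiples of three, the reading frame is preserved in every row, which codon-based models such as those in PAML require.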