Phylogenomics

Miller, Joe [1].

REPHINE: REcursive PHylogenetic INferencE.

At first glance, building DNA sequence phylogenies appears to be a two-step process: first, a statement of homology is developed in a multiple sequence alignment (MSA); then a model of evolution, based on that homology assessment, is used to infer a phylogeny. In reality, phylogenetic inference involves several complicated steps after data generation, including combining data from different parts of the genome, choosing among alignment methods, masking non-homologous regions, selecting models, building trees, and determining a suitable measure of confidence in the results. In building large trees, researchers combine both ancient and recent divergence events in a single study. Using a single alignment across such a deep phylogenetic study potentially adds homoplasy, because rapidly evolving DNA sequences are aligned non-homologously. These rapidly evolving sequences, which could resolve species at the tips of the tree, cannot be used optimally because different clades have different optimal MSAs. We present a new modular workflow, REPHINE (REcursive PHylogenetic INferencE), that recursively builds topologies based on optimized sub-alignments. REPHINE integrates DNA alignment, masking of non-homologous sequence sites, and tree building into a novel recursive workflow. REPHINE builds an initial tree based on the entire dataset, then assesses the reliability of the result. Poorly supported clades are reanalyzed with new, clade-specific MSAs and phylogenetic analyses, progressing recursively from the root to the tips of the tree, so that each clade is reconstructed from an MSA optimized for it. A super-tree reconstruction method is then used to build the final tree using only the optimal subtrees. Finally, this tree is applied as a topological constraint to the original MSA to estimate branch lengths. The modular format of REPHINE could readily be altered to accommodate other programs for alignment, masking, and tree building, as well as to add model selection.
The modular nature of REPHINE integrates well with next-generation sequencing (NGS) data, as it allows calculations to be spread across many computing nodes in parallel. REPHINE offers a new paradigm in phylogenetics that yields a final tree in which every node is analyzed in accordance with the inferred optimal fit between the tree, the model, and the data.
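
The recursive, root-to-tip reanalysis described above can be illustrated with a minimal Python sketch. It is not the REPHINE implementation: the names `Node`, `refine`, `rephine_sketch`, and `toy_pipeline`, the support threshold of 0.7, and the minimum clade size of 4 are all assumptions made for illustration. The `pipeline` argument stands in for whatever alignment, masking, and tree-building programs are plugged in. For brevity, reanalyzed subtrees are grafted directly in place of the original clades rather than combined with a super-tree method, and the final branch-length step is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Sequences = Dict[str, str]  # taxon name -> unaligned sequence


@dataclass
class Node:
    taxa: List[str]                # taxa contained in this clade
    support: float = 1.0           # clade support (tips default to 1.0)
    children: List["Node"] = field(default_factory=list)


# Hypothetical pluggable step: align, mask, and infer a tree for one clade.
Pipeline = Callable[[Sequences], Node]


def min_support(node: Node) -> float:
    """Lowest support among the internal nodes of a clade."""
    if not node.children:
        return 1.0
    return min([node.support] + [min_support(c) for c in node.children])


def refine(node: Node, seqs: Sequences, pipeline: Pipeline,
           threshold: float = 0.7, min_taxa: int = 4) -> Node:
    """Root-to-tip pass that reanalyzes weakly supported clades."""
    refined = []
    for child in node.children:
        if len(child.taxa) >= min_taxa and min_support(child) < threshold:
            # Re-align and re-infer this clade from its own sequences only.
            child = pipeline({t: seqs[t] for t in child.taxa})
        # Recurse into the (possibly replaced) clade; its children hold
        # fewer taxa, so the recursion terminates.
        refined.append(refine(child, seqs, pipeline, threshold, min_taxa))
    return Node(node.taxa, node.support, refined)


def rephine_sketch(seqs: Sequences, pipeline: Pipeline,
                   threshold: float = 0.7, min_taxa: int = 4) -> Node:
    initial = pipeline(seqs)       # initial tree from the entire dataset
    return refine(initial, seqs, pipeline, threshold, min_taxa)


def toy_pipeline(seqs: Sequences) -> Node:
    """Stand-in for real tools: splits sorted taxa in half at each node and
    pretends that alignments of four or fewer taxa give better support."""
    support = 0.9 if len(seqs) <= 4 else 0.5

    def build(taxa: List[str]) -> Node:
        if len(taxa) == 1:
            return Node(taxa)
        mid = len(taxa) // 2
        return Node(taxa, support, [build(taxa[:mid]), build(taxa[mid:])])

    return build(sorted(seqs))


seqs = {f"t{i}": "ACGT" for i in range(1, 9)}   # eight placeholder taxa
tree = rephine_sketch(seqs, toy_pipeline)
```

In this toy run the two four-taxon clades are reanalyzed on their own sequences, while the root keeps the support it received from the full-dataset analysis. Because each clade is processed independently of its siblings, the per-clade calls to `pipeline` are the natural unit to distribute across computing nodes.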


1 - National Science Foundation, Office of International Science and Engineering, 2415 Eisenhower Avenue, Alexandria, VA, 22314, USA

Keywords:
Phylogenetics
DNA alignments.

Presentation Type: Oral Paper
Number:
Abstract ID:348
Candidate for Awards:None


Copyright © 2000-2018, Botanical Society of America. All rights reserved