Hadoop Phylogenetic Tree

Hadoop Phylogenetic Tree is a package of multi-platform Java software tools for constructing phylogenetic trees from the large-scale multiple sequence alignment output of similar DNA/RNA sequences.


Because tree construction is essentially a clustering process, Hadoop Phylogenetic Tree uses a one-pass clustering algorithm as its basic preprocessing step for grouping the data. After preprocessing, the input sequences are grouped into several clusters.
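
The one-pass clustering idea described above can be sketched in a few lines of Java. This is a minimal illustration and not the package's actual code: the class name `OnePassClustering`, the p-distance measure (fraction of mismatching positions), the use of the first member of a cluster as its leader, and the threshold value are all assumptions made for the example, and the sequences are assumed to be already aligned to equal length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of one-pass (leader) clustering; not the package's implementation.
class OnePassClustering {

    // p-distance: fraction of positions at which two aligned sequences differ.
    static double pDistance(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("sequences must be aligned to equal length");
        }
        if (a.isEmpty()) {
            return 0.0;
        }
        int mismatches = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                mismatches++;
            }
        }
        return (double) mismatches / a.length();
    }

    // Reads the sequences once. Each sequence joins the cluster whose leader
    // (first member) is closest, provided that distance is within the threshold;
    // otherwise it starts a new cluster and becomes its leader.
    static List<List<String>> cluster(List<String> sequences, double threshold) {
        List<List<String>> clusters = new ArrayList<>();
        for (String seq : sequences) {
            List<String> best = null;
            double bestDist = Double.POSITIVE_INFINITY;
            for (List<String> c : clusters) {
                double d = pDistance(seq, c.get(0));
                if (d <= threshold && d < bestDist) {
                    best = c;
                    bestDist = d;
                }
            }
            if (best == null) {
                best = new ArrayList<>();
                clusters.add(best);
            }
            best.add(seq);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Toy input, invented for illustration only.
        List<String> seqs = Arrays.asList("ACGTACGT", "ACGTACGA", "TTTTCCCC", "TTTTCCCG");
        List<List<String>> clusters = cluster(seqs, 0.25);
        for (int i = 0; i < clusters.size(); i++) {
            System.out.println("cluster " + i + ": " + clusters.get(i));
        }
    }
}
```

Because every sequence is compared only with the current cluster leaders, the data is read once, which is what makes this kind of preprocessing attractive for large inputs. The result depends on the input order and on the chosen threshold.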


To speed up the computation, we deploy the algorithm on Hadoop clusters. After preprocessing, the Neighbor Joining algorithm is run on each cluster to construct a subtree. Once the subtrees are complete, the root of each subtree is chosen as its representative, and the final phylogenetic tree is built from these representatives, again with the Neighbor Joining algorithm. The input file can be either a file of sequences that have not been through multiple sequence alignment (MSA) or the output file of an MSA. The tool runs on any operating system with a JVM.
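
For readers unfamiliar with Neighbor Joining, the following is a minimal single-machine sketch of the standard algorithm on a distance matrix, written in Java. It is not the package's Hadoop implementation: the class name `NeighborJoining`, the method `buildTree`, the Newick output format, and the example matrix are assumptions chosen for illustration. The same routine could in principle be applied first to the sequences of one cluster and then to the subtree representatives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of standard Neighbor Joining; not the package's implementation.
class NeighborJoining {

    static String fmt(double x) {
        return String.format(Locale.ROOT, "%.4f", x);
    }

    // Builds an unrooted tree in Newick format from a symmetric distance matrix.
    static String buildTree(String[] names, double[][] dist) {
        if (names.length < 3 || dist.length != names.length) {
            throw new IllegalArgumentException("need at least 3 taxa and a square matrix");
        }
        List<String> nodes = new ArrayList<>(Arrays.asList(names));
        double[][] d = new double[dist.length][];
        for (int i = 0; i < dist.length; i++) {
            d[i] = dist[i].clone();
        }
        while (nodes.size() > 3) {
            int n = nodes.size();
            // Row sums of the current distance matrix.
            double[] r = new double[n];
            for (int i = 0; i < n; i++) {
                for (int k = 0; k < n; k++) {
                    r[i] += d[i][k];
                }
            }
            // Pick the pair (bi, bj) minimising Q(i,j) = (n-2)*d(i,j) - r_i - r_j.
            int bi = 0, bj = 1;
            double bestQ = Double.POSITIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    double q = (n - 2) * d[i][j] - r[i] - r[j];
                    if (q < bestQ) {
                        bestQ = q;
                        bi = i;
                        bj = j;
                    }
                }
            }
            // Branch lengths from the two joined nodes to their new parent.
            double li = 0.5 * d[bi][bj] + (r[bi] - r[bj]) / (2.0 * (n - 2));
            double lj = d[bi][bj] - li;
            String merged = "(" + nodes.get(bi) + ":" + fmt(li) + ","
                    + nodes.get(bj) + ":" + fmt(lj) + ")";
            // Reduced matrix: all other nodes keep their order, the parent goes last.
            int[] keep = new int[n - 2];
            List<String> nextNodes = new ArrayList<>();
            int m = 0;
            for (int k = 0; k < n; k++) {
                if (k != bi && k != bj) {
                    keep[m++] = k;
                    nextNodes.add(nodes.get(k));
                }
            }
            double[][] next = new double[n - 1][n - 1];
            for (int a = 0; a < n - 2; a++) {
                for (int b = 0; b < n - 2; b++) {
                    next[a][b] = d[keep[a]][keep[b]];
                }
                double du = 0.5 * (d[bi][keep[a]] + d[bj][keep[a]] - d[bi][bj]);
                next[a][n - 2] = du;
                next[n - 2][a] = du;
            }
            nextNodes.add(merged);
            nodes = nextNodes;
            d = next;
        }
        // Join the last three nodes at a single central node.
        double l0 = 0.5 * (d[0][1] + d[0][2] - d[1][2]);
        double l1 = 0.5 * (d[0][1] + d[1][2] - d[0][2]);
        double l2 = 0.5 * (d[0][2] + d[1][2] - d[0][1]);
        return "(" + nodes.get(0) + ":" + fmt(l0) + "," + nodes.get(1) + ":" + fmt(l1)
                + "," + nodes.get(2) + ":" + fmt(l2) + ");";
    }

    public static void main(String[] args) {
        // Toy distance matrix, invented for illustration only.
        String[] names = {"a", "b", "c", "d", "e"};
        double[][] dist = {
            {0, 5, 9, 9, 8},
            {5, 0, 10, 10, 9},
            {9, 10, 0, 8, 7},
            {9, 10, 8, 0, 3},
            {8, 9, 7, 3, 0}
        };
        System.out.println(buildTree(names, dist));
    }
}
```

The sketch rebuilds the matrix at every step, so it favours readability over speed. It also leaves out details a production tool would need, such as handling negative branch lengths and choosing a root.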


Datasets

mitochondrial genomes

16S rRNA


All Rights Reserved. Copyright © 2015 | Dr. Quan Zou
Last modified on 2016/7/26