MyTaxa

Assign taxonomy to metagenomic fragments

MyTaxa is a new algorithm that extends the Average Amino Acid Identity (AAI) concept (Konstantinidis and Tiedje, PNAS 2005) to identify the taxonomic affiliation of a query genome sequence or of a contig assembled from a metagenome, including short sequences (e.g., 100-1,000 nt long). It also classifies sequences representing novel taxa at three levels (whenever possible): species, genus, and phylum.

MyTaxa can assign a larger number of sequences, and with higher accuracy, than other tools available for the same purposes. This is largely because MyTaxa uses all genes present in an unknown (query) sequence as classifiers and quantifies the classifying power of each gene using predetermined weights, which are derived from the analysis of orthologs of the gene from all available complete genomes. The weights capture two things: i) how well the orthologs of the gene in question resolve the classification of the corresponding genomes at a given taxonomic level (species, genus, etc.), based on their degree of sequence conservation (for instance, the 16S rRNA gene resolves well at the genus and phylum levels but poorly at the species level); and ii) how frequently the ortholog gene phylogeny of the genomes compared deviates from the species phylogeny (approximated by the AAI tree), primarily due to horizontal gene transfer (HGT). MyTaxa also reports the statistical probability of each taxonomic assignment, based on maximum likelihood analysis.
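The idea of treating every gene as a weighted classifier can be illustrated with a small sketch. This is not MyTaxa's implementation and does not reproduce its maximum likelihood computation; it is a simplified weighted vote, and the function name `weighted_assignment`, the input layout, and all weight values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def weighted_assignment(gene_hits, rank):
    """Pick the taxon at `rank` with the largest summed gene weight.

    gene_hits: list of dicts, one per gene found on the query sequence:
        "taxon":   {rank: taxon name of the gene's best match}
        "weights": {rank: hypothetical classifying weight of the gene}
    Returns (best_taxon, share_of_total_weight), or (None, 0.0) when no
    gene carries any weight at the requested rank.
    """
    totals = defaultdict(float)
    for hit in gene_hits:
        taxon = hit["taxon"].get(rank)
        if taxon is None:
            continue  # this gene has no match annotated at this rank
        totals[taxon] += hit["weights"].get(rank, 0.0)

    total_weight = sum(totals.values())
    if total_weight == 0:
        return None, 0.0

    best = max(totals, key=totals.get)
    return best, totals[best] / total_weight


# Hypothetical query contig carrying three genes (all values invented).
hits = [
    {"taxon": {"genus": "Escherichia", "phylum": "Proteobacteria"},
     "weights": {"genus": 0.5, "phylum": 0.5}},
    {"taxon": {"genus": "Escherichia", "phylum": "Proteobacteria"},
     "weights": {"genus": 0.25, "phylum": 0.25}},
    # A gene whose best match points elsewhere, e.g. after HGT.
    {"taxon": {"genus": "Salmonella", "phylum": "Proteobacteria"},
     "weights": {"genus": 0.25, "phylum": 0.25}},
]

genus_call = weighted_assignment(hits, "genus")
phylum_call = weighted_assignment(hits, "phylum")
```

In this toy example the conflicting gene lowers the support for the genus call but not for the phylum call, which mirrors why per-gene, per-level weights are useful: a gene that is unreliable at one taxonomic level can still contribute at another.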

You can use MyTaxa online (see an example) or download the stand-alone version. If you use MyTaxa, please cite:

Reference
Luo C, Rodriguez-R LM & Konstantinidis KT (2014). MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences. Nucl. Acids Res. 42 (8): e73.