MyTaxa is a new algorithm that extends the Average Amino Acid Identity (AAI)
concept (Konstantinidis and Tiedje, PNAS 2005) to identify the taxonomic
affiliation of a query genome sequence or of a contig assembled from a
metagenome, including short sequences (e.g., 100-1,000 nt long). Whenever
possible, it also classifies sequences representing novel taxa at three levels:
species, genus and phylum. MyTaxa can classify more sequences, and with higher
accuracy, than other tools available for the same purpose. This is largely
because MyTaxa treats every gene present in an unknown (query) sequence as a
classifier and quantifies the classifying power of each gene using
predetermined weights, which are derived from the analysis of the orthologs of
that gene in all available complete genomes. The weights capture i) how well
the orthologs of the gene resolve the classification of the corresponding
genomes at a given taxonomic level (species, genus, etc.), based on their
degree of sequence conservation (for instance, the 16S rRNA gene resolves well
at the genus and phylum levels but poorly at the species level); and ii) how
frequently the phylogeny of the gene's orthologs deviates from the species
phylogeny of the genomes compared (the latter approximated by the AAI tree),
primarily because of horizontal gene transfer (HGT). MyTaxa also reports the
statistical probability of each taxonomic assignment, based on a maximum
likelihood analysis.
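The idea of treating every gene as a weighted classifier can be illustrated with a small sketch. This is not MyTaxa's code and it does not reproduce its maximum likelihood model: it replaces that model with a plain weighted vote, and the function name `classify`, the gene identifiers, the taxa and the weight values are all hypothetical, chosen only to show how per-gene weights change the outcome.

```python
from collections import defaultdict


def classify(gene_hits, weights):
    """Pick a taxon by weighted vote (illustrative only, not MyTaxa's model).

    gene_hits: list of (gene_id, taxon) pairs, one per gene on the query.
    weights:   dict mapping gene_id to a hypothetical weight; genes that
               resolve taxa well and are rarely transferred get higher values.
    Returns (best_taxon, fraction_of_total_weight), or (None, 0.0) if no
    gene carries any weight.
    """
    scores = defaultdict(float)
    for gene_id, taxon in gene_hits:
        # Genes without a known weight contribute nothing to the vote.
        scores[taxon] += weights.get(gene_id, 0.0)

    total = sum(scores.values())
    if total == 0:
        return None, 0.0

    best = max(scores, key=scores.get)
    return best, scores[best] / total


# Hypothetical query with three genes and made-up weights.
hits = [("geneA", "Escherichia"), ("geneB", "Escherichia"), ("geneC", "Salmonella")]
weights = {"geneA": 0.9, "geneB": 0.6, "geneC": 0.4}
best_taxon, support = classify(hits, weights)
```

In this toy example the two higher-weight genes outvote the third, so `best_taxon` is the taxon they share, and `support` is its share of the total weight. That share is only a normalized score, not the statistical probability that MyTaxa reports.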
You can use MyTaxa online (see an example),
or download the stand-alone version of MyTaxa.
If you use MyTaxa, please cite:
- Luo C, Rodriguez-R LM & Konstantinidis KT (2014). MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences. Nucl. Acids Res. 42 (8): e73.