Any collection of genomes (for ANI) or proteomes (for AAI). Include only
genomes OR only proteomes in a given collection, never a mix of both. Once
you have the files (one per organism), place them in the same folder and
build an archive in one of these formats: .zip, .tar, .tar.gz, or .tar.bz2.
Draft genomes are accepted, but they should be advanced drafts (e.g., >80%
complete) for the estimates to be accurate.
How many genomes can I upload?
We're currently limiting this service to 50 genomes. Note that the
running time grows quadratically with the number of genomes, and large
collections may take over a day to finish running.
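To make the quadratic growth concrete, the short sketch below counts the unordered genome pairs in a collection. It assumes the cost is driven by one comparison per pair; how the service actually schedules its comparisons is not stated here, and the function name is only illustrative.

```python
def pairwise_comparisons(n_genomes: int) -> int:
    """Number of unordered genome pairs: n * (n - 1) / 2."""
    if n_genomes < 0:
        raise ValueError("n_genomes must be non-negative")
    return n_genomes * (n_genomes - 1) // 2


# Pair counts for a few collection sizes, up to the current limit of 50.
for n in (10, 25, 50):
    print(n, pairwise_comparisons(n))
# 10 45
# 25 300
# 50 1225
```

Doubling the collection size roughly quadruples the number of pairs, which is why large collections take much longer.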
How should I upload draft genomes?
You don't have to treat draft genomes differently. Just use one file per
genome in the archive.
§ Output
ANI/AAI matrix plot
The graphic output is a symmetrical matrix with ANI or AAI values. Whenever
possible, consistent groups at the species level are highlighted with red
rectangles (≥95% ANI or ≥90% AAI). You can download a high-resolution
version of this image from the link below, as well as the list of values and
the distance matrix (as raw text). Note that the matrix may contain several
zeroes (100s in the distance matrix). These are values that fell below the
accurate range, typically for ANI between organisms of different genera
(<80% ANI). If your matrix contains too many such values, the clustering
may not be accurate, and you should use AAI (proteins) instead.
For more details, see Goris et al. 2007.
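As a rough illustration of how the zeroes relate to the distance matrix, the sketch below assumes that distance is simply 100 minus the identity value, which is inferred from the note that zeroes appear as 100s. The function names and the example values are made up for illustration, and the diagonal of 100 is also an assumption.

```python
SPECIES_ANI = 95.0  # species-level ANI threshold given above (use 90.0 for AAI)


def to_distance(identity_pct):
    """Distance as 100 minus identity, so an out-of-range 0 becomes 100."""
    return 100.0 - identity_pct


def out_of_range_fraction(matrix):
    """Fraction of genome pairs (upper triangle only) reported as 0."""
    n = len(matrix)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    zeros = sum(1 for i, j in pairs if matrix[i][j] == 0)
    return zeros / len(pairs)


# Made-up ANI matrix: genomes 0 and 1 are close, genome 2 is out of range.
ani = [
    [100.0, 96.5, 0.0],
    [96.5, 100.0, 0.0],
    [0.0, 0.0, 100.0],
]

distances = [[to_distance(value) for value in row] for row in ani]
print(distances[0])                # [0.0, 3.5, 100.0]
print(out_of_range_fraction(ani))  # share of pairs below the accurate range
print(ani[0][1] >= SPECIES_ANI)    # True: this pair meets the species threshold
```

A high out-of-range fraction is the situation in which switching to AAI is recommended.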
Distance clustering plot
The matrix above is used for hierarchical clustering of the input genomes,
and the resulting tree is displayed. Note that this is simply a clustering,
and it shouldn't be assumed to be a phylogenetic tree (although in many
cases it may correlate well).
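The idea of clustering genomes from a distance matrix can be sketched in a few lines. The example below uses average linkage, which is an assumption: the linkage method used by the service is not stated here. The function name, input layout, and distances are hypothetical.

```python
from itertools import combinations


def average_linkage(names, dist):
    """Agglomerative clustering with average linkage.

    names: list of genome IDs (at least one).
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a nested tuple that records the order of the merges.
    """
    # Each cluster is a pair: (subtree, list of member genome IDs).
    clusters = [(name, [name]) for name in names]

    def cluster_distance(a, b):
        # Mean distance over all cross-cluster genome pairs.
        pairs = [(x, y) for x in a[1] for y in b[1]]
        return sum(dist[frozenset((x, y))] for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the two closest clusters and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = rest + [merged]
    return clusters[0][0]


# Made-up distances (100 - ANI) for three genomes.
example = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 30.0,
    frozenset(("B", "C")): 31.0,
}
tree = average_linkage(["A", "B", "C"], example)
print(tree)  # ('C', ('A', 'B'))
```

The nested tuple only records which groups were joined and in what order; like the plot, it is a clustering and not a phylogeny.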
List of ANI/AAI values
The list of ANI or AAI values is a plain-text, tab-delimited table with a
header row. Each row corresponds to one pairwise comparison, and the columns
are:
SeqA
ID of the first genome.
SeqB
ID of the second genome.
ANI/AAI
Value of ANI or AAI (%).
SD
Standard deviation of identity (%) between reciprocal best matching
fragments (ANI) or proteins (AAI).
N
Number of reciprocal best matches found.
Omega
Number of fragments (ANI) or proteins (AAI) in the smaller of the two
genomes. This is the maximum value that N can take.
Frx
N/Omega ratio (%); the percentage of the genome shared.
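The table can be read with any tab-delimited parser. The sketch below reads the columns by position, in the order listed above, and recomputes Frx from N and Omega. The header text and all values in the sample are made up for illustration, and the function name is hypothetical; the exact header labels in the downloaded file may differ.

```python
import csv
import io

# Made-up sample in the column order described above.
SAMPLE = (
    "SeqA\tSeqB\tANI\tSD\tN\tOmega\tFrx\n"
    "genome1\tgenome2\t96.50\t3.10\t800\t1000\t80.00\n"
    "genome1\tgenome3\t82.00\t6.40\t150\t1200\t12.50\n"
)


def read_values(handle):
    """Parse the tab-delimited list of values into a list of dicts."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    rows = []
    for seq_a, seq_b, identity, sd, n, omega, frx in reader:
        n, omega = int(n), int(omega)
        rows.append({
            "seq_a": seq_a,
            "seq_b": seq_b,
            "identity": float(identity),
            "sd": float(sd),
            "n": n,
            "omega": omega,
            "frx": float(frx),
            # Frx recomputed from its definition: N / Omega, as a percentage.
            "frx_check": 100.0 * n / omega,
        })
    return rows


rows = read_values(io.StringIO(SAMPLE))
for row in rows:
    print(row["seq_a"], row["seq_b"], row["identity"], row["frx_check"])
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open(path, newline="")`.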