Enveomics collection

A toolbox for microbial genomics and metagenomics

ani.rb

Calculates the Average Nucleotide Identity between two genomes.

    See source code, Artistic license 2.0.

§ References

    Konstantinidis & Tiedje, 2005, PNAS; Altschul et al., 1990, JMB (BLAST); Kent WJ, 2002, Genome Res (BLAT); Rodriguez-R & Konstantinidis, 2016, PeerJ Preprints.

§ Requirements

§ Usage

ani.rb --seq1 in_file --seq2 in_file [opts]
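
A minimal sketch of assembling the command line above from a script, for example when comparing many genome pairs. The helper name `ani_command` and the file names are hypothetical; only flags documented on this page are used, and nothing is executed unless the result is passed to `system`.

```ruby
require "shellwords"

# Build the argument vector for one ani.rb run.
# seq1/seq2: FastA paths (hypothetical file names in the example below).
# opts: flag name (without the leading "--") => value; use true for
# switches that take no value, such as "quiet".
def ani_command(seq1, seq2, opts = {})
  argv = ["ani.rb", "--seq1", seq1, "--seq2", seq2]
  opts.each do |flag, value|
    argv << "--#{flag}"
    argv << value.to_s unless value == true
  end
  argv
end

cmd = ani_command("genome1.fna", "genome2.fna",
                  "tab" => "ani.tsv", "threads" => 4, "quiet" => true)

# Print a shell-safe version of the command for inspection.
puts Shellwords.join(cmd)

# To actually run it (assumes ani.rb is on the PATH):
# system(*cmd)
```

Passing the arguments as an array to `system` avoids shell quoting problems with unusual file names.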

§ Arguments

Sequence 1*
 --seq1 in_file  FastA file containing genome 1.
Alternatively, you can supply an NCBI accession in the format ncbi:CP014272 instead of a file.
Sequence 2*
 --seq2 in_file  FastA file containing genome 2.
Alternatively, you can supply an NCBI accession in the format ncbi:AE005174 instead of a file.
Window
 --win integer  Window size in the ANI calculation (in bp).
Step
 --step integer  Step size in the ANI calculation (in bp).
Length
 --len integer  Minimum alignment length (in bp).
Identity
 --id float  Minimum alignment identity (in %).
Hits
 --hits integer  Minimum number of hits.
No correction
 --nocorrection   Report values without post-hoc correction.
Min ACTG
 --min-actg float  Minimum fraction of A, C, T, G, and N characters in the sequences. Below this fraction, the sequences are assumed to be proteins.
Executables
 --bin in_dir  Directory containing the binaries of the search program.
Program
 --program select  Search program to be used.
Make sure that the search program you want to use is installed. If you have downloaded the program but not installed it, use the Executables option above to point to its binaries.
Threads
 --threads integer  Number of parallel threads to be used.
SQLite3 DB
 --sqlite3 out_file  Path to the SQLite3 database to create (or update) with the results.
Name 1
 --name1 string  Name of Sequence 1 to use in SQLite3 DB. By default it's determined by the filename.
Name 2
 --name2 string  Name of Sequence 2 to use in SQLite3 DB. By default it's determined by the filename.
Don't save regions
 --no-save-regions   Don't save the fragments in the SQLite3 database.
Don't save RBM
 --no-save-rbm   Don't save the reciprocal best matches in the --sqlite3 database.
Lookup first
 --lookup-first   Looks up the ANI in the database first, before calculating it. Requires the SQLite3 DB, Auto, Name 1, and Name 2 options. Incompatible with the Result, Tab, and Out options.
Precision
 --dec integer  Decimal positions to report.
Out
 --out out_file  Saves a file describing the alignments used for two-way ANI.
Result
 --res out_file  Saves a file with the final results.
Tab
 --tab out_file  Saves a file with the final two-way results in tab-delimited form. The columns are, in order: ANI, standard deviation, fragments used, and fragments in the smaller genome.
Auto
 --auto   Outputs only the ANI value to STDOUT (or nothing, if the calculation fails).
Quiet
 --quiet   Run quietly (no STDERR output).
* Mandatory.
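
The four columns written by the Tab option can be read back with a few lines of Ruby. This is a sketch under stated assumptions: the file is taken to hold a single data line with no header, the two fragment counts are taken to be integers, and the names `AniResult` and `parse_ani_tab` are hypothetical labels, not part of ani.rb. The sample line is made up for illustration and is not real output.

```ruby
# Illustrative labels for the four documented columns, in order.
AniResult = Struct.new(:ani, :sd, :fragments_used, :fragments_smallest)

# Parse one tab-delimited line as described for --tab.
# Float() and Integer() raise on malformed input instead of
# silently returning 0, which makes a wrong file easier to spot.
def parse_ani_tab(line)
  ani, sd, used, total = line.chomp.split("\t")
  AniResult.new(Float(ani), Float(sd), Integer(used), Integer(total))
end

# Made-up example line (assumption: no header row).
r = parse_ani_tab("97.85\t3.21\t1500\t2000\n")
puts format("ANI %.2f%% (SD %.2f), %d of %d fragments",
            r.ani, r.sd, r.fragments_used, r.fragments_smallest)

# Reading from a file produced with --tab ani.tsv (hypothetical path):
# r = parse_ani_tab(File.readlines("ani.tsv").first)
```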