aai.rb
Calculates the Average Amino acid Identity between two genomes.
See source code, Artistic license 2.0.
§ References
Konstantinidis & Tiedje, 2005, JBac; Altschul et al, 1990, JMB (BLAST); Kent WJ, 2002, Genome Res (BLAT); Buchfink B, Xie C, Huson D, 2015, Nat Meth (Diamond); Rodriguez-R & Konstantinidis, 2016, PeerJ Preprints.
§ Requirements
§ Usage
aai.rb --seq1 in_file --seq2 in_file [opts]
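When the script is driven from another Ruby program, the command line above can be assembled from a hash of options. The following is a minimal sketch, not part of aai.rb: the helper name aai_command and the file names genome1.faa, genome2.faa, and aai.tsv are hypothetical, and only flags listed under Arguments are used.

```ruby
require 'shellwords'

# Hypothetical helper (not part of aai.rb): builds an argument vector
# from the two input files and a hash of documented options.
# Options set to true are emitted as bare flags; nil/false are skipped.
def aai_command(seq1, seq2, opts = {})
  cmd = ['aai.rb', '--seq1', seq1, '--seq2', seq2]
  opts.each do |flag, value|
    next if value.nil? || value == false
    # :len_fraction becomes --len-fraction
    cmd << "--#{flag.to_s.tr('_', '-')}"
    cmd << value.to_s unless value == true
  end
  cmd
end

# Example file names are placeholders.
cmd = aai_command('genome1.faa', 'genome2.faa',
                  threads: 4, tab: 'aai.tsv', quiet: true)
puts Shellwords.join(cmd)
```

Passing the array to system(*cmd) would run the command without going through a shell, assuming aai.rb is on the PATH.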
§ Arguments
- Sequence 1*
--seq1 in_file
FastA file containing genome 1 (proteins). Alternatively, you can supply the NCBI accession of a genome (nucleotides) in the format ncbi:CP014272 instead of a file.
- Sequence 2*
--seq2 in_file
FastA file containing genome 2. Alternatively, you can supply the NCBI accession of a genome (nucleotides) in the format ncbi:NC_004337 instead of a file.
- Length
--len integer
Minimum alignment length (in aa).
- Length fraction
--len-fraction float
Minimum alignment length as a fraction of the shorter sequence (range 0-1).
- Identity
--id float
Minimum alignment identity (in %).
- Bit-score
--bitscore float
Minimum bit score (in bits).
- Hits
--hits float
Minimum number of hits.
- Nucleotides
--nucl
The input sequences are nucleotides (genes), not proteins.
- Max ACTG
--max-actg float
Maximum fraction of ACTGN in the sequences before assuming nucleotides.
- Executables
--bin in_dir
Path to the directory containing the binaries of the search program.
- Program
--program select
Search program to be used. Make sure that you have installed the search program you want to use. If you have downloaded the program but it is not installed, use the Executables option above.
- Threads
--threads integer
Number of parallel threads to be used.
- SQLite3 DB
--sqlite3 out_file
Path to the SQLite3 database to create (or update) with the results.
- Name 1
--name1 string
Name of Sequence 1 to use in the SQLite3 DB. By default, determined by the filename.
- Name 2
--name2 string
Name of Sequence 2 to use in the SQLite3 DB. By default, determined by the filename.
- Don't save RBM
--no-save-rbm
Don't save the reciprocal best matches in the --sqlite3 database.
- Lookup first
--lookup-first
Indicates if the AAI should be looked up first in the database. Requires SQLite3 DB, Auto, Name 1, and Name 2. Incompatible with Result, Tab, Out, and RBM.
- Precision
--dec integer
Decimal positions to report.
- RBM
--rbm out_file
Saves a file with the reciprocal best matches.
- Out
--out out_file
Saves a file describing the alignments used for two-way AAI.
- Result
--res out_file
Saves a file with the final results.
- Tab
--tab out_file
Saves a file with the final two-way results in tab-delimited form. The columns are (in that order): AAI, standard deviation, proteins used, proteins in the smallest genome.
- Auto
--auto
Outputs ONLY the AAI value to STDOUT (or nothing, if the calculation fails).
- Quiet
--quiet
Run quietly (no STDERR output).
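
To illustrate how the thresholds above could interact with the reported values, here is a minimal sketch. It is not the code of aai.rb. The Hit struct, the function names, and the threshold defaults are hypothetical placeholders (they are not the script's defaults), and the use of a population standard deviation is an assumption.

```ruby
# Hypothetical record for one reciprocal best match (not aai.rb's own type).
# identity is in %, aln_len and shorter_len in aa, bitscore in bits.
Hit = Struct.new(:identity, :aln_len, :bitscore, :shorter_len)

# Keep matches passing thresholds modeled on --id, --len,
# --len-fraction, and --bitscore. Defaults are placeholders.
def filter_hits(hits, id: 0.0, len: 0, len_fraction: 0.0, bitscore: 0.0)
  hits.select do |h|
    h.identity >= id &&
      h.aln_len >= len &&
      h.aln_len >= len_fraction * h.shorter_len &&
      h.bitscore >= bitscore
  end
end

# Mean identity and population standard deviation (an assumption),
# rounded to dec decimals. Returns nil when fewer than min_hits
# matches remain, mirroring the idea behind --hits.
def aai_summary(hits, min_hits: 1, dec: 2)
  return nil if hits.empty? || hits.size < min_hits
  n = hits.size.to_f
  mean = hits.sum(&:identity) / n
  var = hits.sum { |h| (h.identity - mean)**2 } / n
  { aai: mean.round(dec), sd: Math.sqrt(var).round(dec), proteins: hits.size }
end

# Made-up example values, for illustration only.
hits = [
  Hit.new(80.0, 100, 150.0, 120),
  Hit.new(60.0, 100, 120.0, 110),
  Hit.new(90.0, 10, 30.0, 200)
]
kept = filter_hits(hits, id: 50.0, len: 50)
summary = aai_summary(kept, min_hits: 2)
```

In this example the third match is dropped for its short alignment, so the summary is computed from two matches. Raising min_hits above the number of retained matches yields nil, which corresponds to the case where no AAI is reported.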