Enveomics collection

A toolbox for microbial genomics and metagenomics

HMM.essential.rb

Finds and extracts a collection of essential proteins suitable for genome completeness evaluation and phylogenetic analyses in Archaea and Bacteria.

    See source code, Artistic license 2.0.

§ References

    Eddy, 2011, PLoS CB; Dupont et al, 2012, ISME J; Rodriguez-R et al, 2014, ISME J; Lee, 2019, Bioinf; Eren et al, 2015, PeerJ; Rodriguez-R & Konstantinidis, 2016, PeerJ Preprints.

§ Requirements

§ Usage

HMM.essential.rb --in in_file [opts]
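
The usage line above can also be assembled from a script. The following is a minimal sketch in Python, assuming `HMM.essential.rb` is on the `PATH`. The helper name `build_command` and the file names are hypothetical. It only maps keyword arguments onto the long flags documented under Arguments and does not check that a flag exists.

```python
import shlex


def build_command(in_file, **options):
    """Build an argument list for HMM.essential.rb (hypothetical helper).

    Keyword names map to long flags by replacing '_' with '-', for example
    hmm_out -> --hmm-out. A value of True adds a bare switch; None or False
    skips the flag; any other value is passed as the flag's argument.
    """
    cmd = ["HMM.essential.rb", "--in", in_file]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            cmd.append(flag)
        elif value is not None and value is not False:
            cmd += [flag, str(value)]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_command("genome.faa", report="report.txt", threads=4, bacteria=True)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the script; building a list instead of a single string avoids shell-quoting problems with unusual file names.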

§ Arguments

Input file*
 --in in_file  FastA file containing all the proteins in the genome.
Collection
 --collection string  Reference collection of essential proteins to use. One of: dupont_2012 (default, Dupont et al 2012 modified by Rodriguez-R et al 2015), or lee_2019 (Lee 2019 modified by Eren et al 2015).
Output file
 --out out_file  FastA file with the translated essential genes. By default the file is not produced.
Per model
 --per-model out_file  Prefix for the files of translated genes, written as one independent file per model with the name of the model appended to the prefix. By default, the files are not produced.
Report
 --report out_file  Path to the report file. By default, the report is sent to STDOUT.
HMMsearch output
 --hmm-out out_file  Save HMMsearch output in this file. By default, not saved.
Out file
 out_file   Save the aligned proteins in this file. By default, not saved.
Bacteria
 --bacteria   If set, ignores models typically missing in Bacteria.
Archaea
 --archaea   If set, ignores models typically missing in Archaea.
Genome eq
 --genome-eq   If set, ignores models not suitable for genome-equivalents estimations. See Rodriguez-R et al, 2015, ISME J 9(9):1928-1940.
Rename
 --rename string  If set, renames each sequence to the string provided, followed by a pipe (|) and the gene name (except in --per-model files).
No stats
 --no-stats   If set, no statistics are reported on genome evaluation.
No genes
 --no-genes   If set, statistics won't include the lists of missing/multi-copy genes.
Metagenome
 --metagenome   If set, it allows for multiple copies of each gene and turns on metagenomic report mode.
List models
 --list-models   If set, it only lists the models and exits. Compatible with 'Archaea', 'Bacteria', 'Genome eq', and 'Quiet'; ignores all other parameters.
Bin
 --bin in_dir  Directory containing the binaries of HMMER 3.0+.
Model file
 --model-file in_file  External file containing models to search.
Threads
 --threads integer  Number of parallel threads to be used.
Quiet
 --quiet   Run quietly (no STDERR output).
* Mandatory.
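
As an illustration of the naming scheme described for --rename, the sketch below composes a sequence identifier from the provided string, a pipe, and a gene name. This is an assumption-laden reading of the description above, not the script's own code: the function name `renamed_id`, the label `GenomeA`, and the gene name `rpoB` are hypothetical, and the exact identifiers written by the script may differ.

```python
def renamed_id(label, gene):
    """Compose '<label>|<gene>', following the --rename description.

    Hypothetical helper: it only mirrors the documented pattern (string,
    pipe, gene name) and is not taken from HMM.essential.rb itself.
    """
    return f"{label}|{gene}"


# Hypothetical label and gene name, for illustration only.
print(renamed_id("GenomeA", "rpoB"))
```

Using one label per genome in this way keeps identifiers distinguishable when the --out files of several genomes are later combined, for example before a phylogenetic analysis.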