ogs.extract.rb
Extracts sequences of Orthology Groups (OGs) from genomes (proteomes).
See source code, Artistic license 2.0.
§ References
Rodriguez-R & Konstantinidis, 2016, PeerJ Preprints.
§ Requirements
§ Usage
ogs.extract.rb --in in_file --out out_file --seqs in_file [opts]
§ Arguments
- Input file*
--in in_file
Input file containing the OGs (as generated by ogs.mcl.rb).
- Output file*
--out out_file
Output directory in which to place the extracted sequences.
- Sequences*
--seqs in_file
Path to the proteomes in FastA format, using '%s' to denote the genome. For example: /path/to/seqs/%s.faa.
- Core
--core float
Use only OGs present in at least this fraction of the genomes. To use only the strict core genome, use --core 1. To use only the unus genome (OGs with exactly one gene per genome), use --core 1 --duplicates 1.
- Duplicates
--duplicates integer
Use only OGs with fewer than this number of in-paralogs in a genome. To use only genes without in-paralogs, use --duplicates 1. To use only the unus genome (OGs with exactly one gene per genome), use --core 1 --duplicates 1.
- Per genome
--per-genome
If set, the output is generated per genome. By default, the output is per OG.
- Prefix
--prefix
If set, each sequence is prefixed with the genome name (or OG number, if --per-genome) and a dash.
- Rand
--rand
Get only one gene per genome per OG, chosen at random, regardless of in-paralogs. By default, all genes are extracted.
- First
--first
Get only one gene per genome per OG, taking the first one listed, regardless of in-paralogs. By default, all genes are extracted. Takes precedence over --rand.
- Quiet
--quiet
Run quietly (no STDERR output).
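
The selection options above can be illustrated with a minimal Ruby sketch. It is not the script's actual implementation: the in-memory OG layout, the constants GENOMES and OGS, and the helpers select_ogs, pick_genes, and proteome_path are all hypothetical names introduced here. The sketch also assumes that "--duplicates N" can be read as "at most N genes of the OG in any one genome", which is consistent with --duplicates 1 keeping only genes without in-paralogs.

```ruby
# Hypothetical in-memory OGs: one hash per OG, mapping genome name to the
# gene IDs it contributes. This layout is an assumption for illustration,
# not the file format written by ogs.mcl.rb.
GENOMES = %w[gA gB gC].freeze

OGS = [
  { 'gA' => %w[a1],    'gB' => %w[b1], 'gC' => %w[c1] }, # one gene per genome
  { 'gA' => %w[a2 a3], 'gB' => %w[b2], 'gC' => %w[c2] }, # in-paralogs in gA
  { 'gA' => %w[a4],    'gB' => %w[b3], 'gC' => [] }      # absent from gC
].freeze

# Keep OGs present in at least `core` (a fraction) of the genomes and, when
# `duplicates` is given, with at most that many genes in any one genome
# (assumed reading of --duplicates).
def select_ogs(ogs, genomes, core: 0.0, duplicates: nil)
  ogs.select do |og|
    counts = genomes.map { |g| og.fetch(g, []).size }
    present = counts.count(&:positive?)
    next false if present < core * genomes.size

    duplicates.nil? || counts.max <= duplicates
  end
end

# Choose which genes of one genome in one OG to extract. `first` wins over
# `random`, mirroring the stated precedence of --first over --rand; with
# neither set, all genes are returned.
def pick_genes(genes, first: false, random: false)
  return genes.take(1) if first
  return genes.sample(1) if random

  genes
end

# Expand a --seqs style template, where '%s' stands for the genome name.
def proteome_path(template, genome)
  format(template, genome)
end

core_ogs = select_ogs(OGS, GENOMES, core: 1.0)                # like --core 1
unus_ogs = select_ogs(OGS, GENOMES, core: 1.0, duplicates: 1) # like --core 1 --duplicates 1

puts core_ogs.size
puts unus_ogs.size
puts pick_genes(%w[a2 a3], first: true).inspect
puts proteome_path('/path/to/seqs/%s.faa', 'gA')
```

In this toy data set, the first two OGs pass the core filter because every genome contributes at least one gene, and only the first OG remains once in-paralogs are excluded as well.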