Enveomics collection

A toolbox for microbial genomics and metagenomics

FastA.extract.rb

Extracts a list of sequences and/or coordinates from multi-FastA files.

    See source code, Artistic license 2.0.

§ References

    Rodriguez-R & Konstantinidis, 2016, PeerJ Preprints.

§ Requirements

§ Usage

FastA.extract.rb --in in_file --out out_file [opts]

§ Arguments

Input file*
 --in in_file  Input FastA file.
Output file*
 --out out_file  Output FastA file.
Coordinates
 --coords string  Comma-delimited list of coordinates (mandatory unless -C is passed). Each coordinate follows the format SEQ:FROM..TO or SEQ:FROM~LEN, where:
     SEQ: Sequence ID, or * (asterisk) to extract the range from all sequences.
     FROM: Integer, position of the first base to include (can be negative).
     TO: Integer, position of the last base to include (can be negative).
     LEN: Length of the range to extract.
Coordinates file
 --coords-file in_file  File containing the coordinates, one per line. Each line must follow the format described for Coordinates.
Quiet
 --quiet   Run quietly (no STDERR output).
* Mandatory.
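
§ Example: reading the coordinate syntax

The coordinate format described under Coordinates can be illustrated with a short Ruby sketch. This is not the code used by FastA.extract.rb: the names COORD_RE, parse_coord and parse_coords are hypothetical, and the sketch only splits each coordinate into its parts. It does not interpret negative positions, and it assumes that sequence IDs contain no commas.

```ruby
# Hypothetical helper, not part of FastA.extract.rb.
# Matches SEQ:FROM..TO or SEQ:FROM~LEN. SEQ is taken up to the last colon,
# FROM and TO may carry a leading minus sign, LEN is assumed non-negative.
COORD_RE = /\A(?<seq>.+):(?<from>-?\d+)(?:\.\.(?<to>-?\d+)|~(?<len>\d+))\z/

# Splits one coordinate string into a hash with :seq, :from and
# either :to or :len, depending on the separator used.
def parse_coord(str)
  m = COORD_RE.match(str.strip)
  raise ArgumentError, "Malformed coordinate: #{str}" unless m

  coord = { seq: m[:seq], from: m[:from].to_i }
  coord[:to]  = m[:to].to_i  if m[:to]
  coord[:len] = m[:len].to_i if m[:len]
  coord
end

# Splits a comma-delimited list (the form passed to --coords)
# into one hash per coordinate.
def parse_coords(list)
  list.split(',').map { |c| parse_coord(c) }
end

parsed = parse_coords('contig1:1..100,*:-50~20')
parsed.each { |c| puts c.inspect }
```

Each line of a coordinates file uses the same format, so in this sketch parse_coord could be applied to every line of such a file.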