Enveomics collection

A toolbox for microbial genomics and metagenomics

FastA.subsample.pl

Subsamples a set of sequences.

    See the source code. Licensed under the Artistic License 2.0.

§ References

    Rodriguez-R & Konstantinidis, 2016, PeerJ Preprints.

§ Requirements

§ Usage

FastA.subsample.pl [opts] in_file

§ Arguments

Fraction
 -f string  Fraction of the library to be sampled (as a percentage). It can include several values (separated by commas), as well as ranges of values in the form 'from-to/by'. For example, the -f value 1-5/1,10-50/10,75,99 will produce 12 subsamples with expected fractions 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, and 99%.
Replicates
 -r integer  Number of replicates per fraction.
Out base
 -o out_file  Prefix of the output files to be created. The output files will have a suffix of the form '.fraction-replicate.fa', where 'fraction' is the percentage sampled and 'replicate' is an increasing integer for replicates of the same fraction. Default: the path to the input file.
Force
 -F   Force overwriting output file(s).
Zeroes
 -z   Include leading zeroes in the numeric parts of the output files (e.g., file.002.50-01.fa instead of file.2.50-1.fa), so that alphabetic sorting of files reflects the sampled fraction.
Quiet
 -q   Run quietly.
Input file*
 in_file  Input multi-FastA file(s).
* Mandatory.
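
The -f syntax and the output naming described above can be illustrated with a short Python sketch. This is not the script's own code (the script is written in Perl): the function names expand_fractions and output_name are hypothetical, and the padding widths are only inferred from the single file.002.50-01.fa example, so treat them as assumptions.

```python
from typing import List


def expand_fractions(spec: str) -> List[float]:
    """Expand a -f style value such as '1-5/1,10-50/10,75,99' into percentages.

    Hypothetical helper: it mirrors the documented syntax, not the Perl source.
    """
    fractions: List[float] = []
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            # Range in the documented form 'from-to/by'.
            bounds, slash, step_text = item.partition("/")
            if not slash:
                raise ValueError(f"range without '/by' part: {item!r}")
            start_text, _, stop_text = bounds.partition("-")
            start, stop, step = float(start_text), float(stop_text), float(step_text)
            if step <= 0:
                raise ValueError(f"step must be positive: {item!r}")
            # Number of values from start to stop (inclusive) in steps of 'by'.
            count = int((stop - start) / step + 1e-9) + 1
            fractions.extend(start + i * step for i in range(count))
        else:
            # Single value.
            fractions.append(float(item))
    return fractions


def output_name(prefix: str, fraction: float, replicate: int, zeroes: bool = False) -> str:
    """Build '<prefix>.<fraction>-<replicate>.fa'.

    Assumption: two decimals for the fraction; with zeroes=True (like -z) the
    fraction is padded to three integer digits and the replicate to two.
    """
    if zeroes:
        return f"{prefix}.{fraction:06.2f}-{replicate:02d}.fa"
    return f"{prefix}.{fraction:.2f}-{replicate}.fa"


if __name__ == "__main__":
    values = expand_fractions("1-5/1,10-50/10,75,99")
    print(len(values), values)
    print(output_name("file", 2.5, 1))
    print(output_name("file", 2.5, 1, zeroes=True))
```

With the example value from the -f description, the sketch yields the same 12 fractions listed there. The zero-padded names sort alphabetically in the same order as the sampled fractions, which is the purpose of -z.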