Enveomics collection

A toolbox for microbial genomics and metagenomics

BlastTab.pairedHits.rb

Identifies the best hits of paired reads.

    See source code, Artistic license 2.0.

§ References

    Rodriguez-R & Konstantinidis, 2016, PeerJ Preprints.

§ Requirements

§ Usage

BlastTab.pairedHits.rb --blast in_file [opts] > out_file

§ Arguments

Blast*
 --blast in_file  Input Tabular BLAST file.
This script assumes that the hits of paired reads are next to each other in the input. If this is not the case (e.g., because separate BLAST outputs were concatenated), you must sort the input before running this script.
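One way to bring the hits of sister reads together is a stable sort on the read base name. The following is a minimal sketch, not part of BlastTab.pairedHits.rb: the helper names and the `_1`/`_2` naming convention are assumptions for illustration, and whether the script expects any particular order within a pair is not stated here.

```ruby
# Hypothetical helper, not part of BlastTab.pairedHits.rb.
# Assumption: sister reads are named <base>_1 and <base>_2.
SISTER_SUFFIX = /_[12]\z/

# Strips the sister suffix to obtain the name shared by both reads.
def base_name(query_id)
  query_id.sub(SISTER_SUFFIX, '')
end

# Reorders tabular BLAST lines so that all hits of a pair are adjacent.
# Ruby's sort_by is not stable, so the original line index is used as a
# tie-breaker to keep the input order within each pair.
def group_sister_hits(lines)
  lines.each_with_index.sort_by do |line, idx|
    query_id = line.split("\t", 2).first
    [base_name(query_id), idx]
  end.map(&:first)
end

# Toy input with only the first two columns (query ID, subject ID).
sample = ["b_1\ts1", "a_1\ts1", "b_2\ts1", "a_2\ts1"]
puts group_sister_hits(sample)
```

Sorting on the base name rather than the full query ID avoids edge cases in which another read name falls lexicographically between the two sisters.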
Min score
 --minscore float  Minimum (summed) Bit-Score to consider a pair-match.
Best hits
 --besthits integer  Outputs only the top best hits (use 0 to output all the paired hits).
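The combined effect of --minscore and --besthits can be sketched as a filter on the summed bit score followed by a top-N selection. This is an illustration under stated assumptions, not the script's actual code: `PairMatch` and `select_pair_matches` are hypothetical names, the threshold is assumed to be inclusive, and the handling of ties is not specified in this document.

```ruby
# Hypothetical record for one candidate pair-match against a subject.
PairMatch = Struct.new(:subject, :bits1, :bits2) do
  # Bit score summed over both sister reads.
  def score
    bits1 + bits2
  end
end

# Keeps matches whose summed score reaches minscore (assumed inclusive),
# ordered from highest to lowest score. besthits = 0 keeps all of them.
def select_pair_matches(matches, minscore: 0.0, besthits: 0)
  kept = matches.select { |m| m.score >= minscore }
  kept = kept.sort_by { |m| -m.score }
  besthits.positive? ? kept.first(besthits) : kept
end

candidates = [
  PairMatch.new('s1', 50.0, 60.0),
  PairMatch.new('s2', 90.0, 80.0),
  PairMatch.new('s3', 20.0, 10.0)
]
best = select_pair_matches(candidates, minscore: 100.0, besthits: 1)
puts best.map(&:subject).join(',')
```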
Orientation
 --orient select  Checks the orientation of the hit. Values are: 0, no checking; 1, same direction; 2, inwards; 3, outwards; 4, different direction (i.e., 2 or 3).
Sister prefix
 --sisprefix string  Prefix of the sister read number in the read names. Escape characters with special meaning in regular expressions, such as dots (\.), parentheses and brackets (\(, \), \[, \]), and others (\*, \+, \^, \$, \|). The prefix may itself be a regular expression (for example, use ':|\.' to match either a colon or a dot). Note that the prefix is not included in the base name reported in the output.
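To see how a prefix such as ':|\.' behaves as a regular expression, the sketch below splits a read name into its base name and sister number. It is a hypothetical helper: how the script itself builds its pattern is not shown in this document.

```ruby
# Hypothetical helper, not part of BlastTab.pairedHits.rb.
# Splits a read name into [base_name, sister_number], or returns nil if
# the name does not end in <prefix>1 or <prefix>2.
def split_sister(read_name, sisprefix)
  # The prefix is wrapped in a non-capturing group so that alternations
  # such as ':|\.' only apply to the prefix, not to the whole pattern.
  pattern = Regexp.new("\\A(.+?)(?:#{sisprefix})([12])\\z")
  m = pattern.match(read_name)
  m && [m[1], m[2].to_i]
end

p split_sister('read7:1', ':|\.')  # => ["read7", 1]
p split_sister('read7.2', ':|\.')  # => ["read7", 2]
p split_sister('read7_2', ':|\.')  # => nil
```

An unescaped dot would match any character, which is why literal dots need to be written as \. in the prefix.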
Output file*
 out_file  Tab-delimited flat file, with the following columns:
   (1) Query ID (without the "sister" identifier).
   (2) Subject ID.
   (3) Bit score (summed from both sister reads).
   (4/5) From/To (subject) coordinates for read 1.
   (6/7) From/To (subject) coordinates for read 2.
   (8) Reads orientation (1: same direction, 2: inwards, 3: outwards).
   (9) Estimated insert size.
* Mandatory.
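The orientation codes and the insert size can be related to the subject coordinates as in the sketch below. It shows one plausible reading, not the script's confirmed logic. It assumes that a hit with From greater than To lies on the reverse strand, that "inwards" means the forward read starts at or before the start of the reverse read, and that the insert size is the full span covered by both hits. All function names are hypothetical.

```ruby
# Returns 1 (same direction), 2 (inwards) or 3 (outwards) for a pair of
# hits on the same subject. Assumption: from > to marks a reverse hit.
def orientation(from1, to1, from2, to2)
  fwd1 = from1 <= to1
  fwd2 = from2 <= to2
  return 1 if fwd1 == fwd2

  # Start coordinates of the forward and of the reverse read.
  fwd_from, rev_from = fwd1 ? [from1, from2] : [from2, from1]
  fwd_from <= rev_from ? 2 : 3
end

# Assumption: the insert spans from the leftmost to the rightmost
# coordinate covered by either read.
def insert_size(from1, to1, from2, to2)
  coords = [from1, to1, from2, to2]
  coords.max - coords.min + 1
end

# Checks an observed orientation against a requested --orient value
# (0: no checking, 4: different direction, i.e., 2 or 3).
def matches_orient?(observed, requested)
  case requested
  when 0 then true
  when 4 then [2, 3].include?(observed)
  else observed == requested
  end
end

puts orientation(100, 200, 500, 400)  # forward read left of reverse read
puts insert_size(100, 200, 500, 400)
```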