Paired-end fastq quality control with Skewer

Quality trimming and adapter clipping of paired-end reads is a tricky business. For paired-end alignments to work, the order of the sequences in both files (forward and reverse) needs to be retained. While there are plenty of open-source read trimmers available (think FASTX-toolkit, SeqTK, CutAdapt, etc.), I haven't found many that:

- Retain pairing info
- Run in parallel and are fast
- Are easy to set up and run
- Have good docs

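The first criterion is easy to sanity-check on any trimmer's output. Here is a minimal sketch, not part of any tool mentioned here: check_pairs is a hypothetical helper name, and it assumes uncompressed FASTQ with exactly four lines per record, where the read ID is the first whitespace-delimited token of the header line, optionally ending in /1 or /2. It walks both files in lock step and counts the records whose IDs disagree.

```shell
# check_pairs R1.fastq R2.fastq
# Prints the number of pairing problems found (0 means the two files agree).
# Hypothetical helper: assumes uncompressed FASTQ with 4 lines per record.
check_pairs() {
    awk -v r2="$2" '
        {
            # Read one line from the second file for every line of the first.
            if ((getline mate < r2) <= 0) { bad++; exit }  # second file ran out early
            if (NR % 4 == 1) {                              # header line of each record
                id1 = $1                                    # ID is the first token
                split(mate, m, " "); id2 = m[1]
                sub(/\/[12]$/, "", id1)                     # drop an optional /1 or /2 suffix
                sub(/\/[12]$/, "", id2)
                if (id1 != id2) bad++
            }
        }
        END {
            if ((getline mate < r2) > 0) bad++              # second file has extra lines
            print bad + 0
        }
    ' "$1"
}
```

In bash, gzipped files can be streamed in with process substitution, for example check_pairs <(gzip -dc R1.fastq.gz) <(gzip -dc R2.fastq.gz), where the two file names stand in for your own forward and reverse files. A result of 0 means the files have the same number of lines and every pair of headers carries the same read ID.
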
Then I fell in love with Skewer. It smashed through 14.5 million gzipped read pairs, doing adapter clipping and quality trimming in two minutes on my 8-core workstation. It auto-detects quality encoding so you can safely analyse any Illumina data. Awesome!

$ skewer -t 8 -q 20 SRR634969.sra_1.fastq.gz SRR634969.sra_2.fastq.gz
Parameters used:
-- 3' end adapter sequence (-x):AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
-- paired 3' end adapter sequence (-y):AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
-- maximum error ratio allowed (-r):0.100
-- maximum indel error ratio allowed (-d):0.030
-- e…