Reverse complement a sequence

Starting with a tab-separated file

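Use rev and sed to generate the reverse complement of the sequences in a tab-separated file, with the sequence in column 2. rev reverses each line character by character, and the sed y command swaps each base for its complement.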

cut -f2 file | rev | sed 'y/CGAT/GCTA/'
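As a quick sanity check, the same pipeline can be run on a small made-up input. The two records here are hypothetical and are piped in directly rather than read from a file:

```shell
# Hypothetical input: name<TAB>sequence, one record per line
revcomp=$(printf 'seq1\tAACG\nseq2\tGGTTA\n' | cut -f2 | rev | sed 'y/CGAT/GCTA/')
printf '%s\n' "$revcomp"
# prints:
# CGTT
# TAACC
```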

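You could also write the reverse-complemented sequence in lower case by using sed 'y/CGAT/gcta/' instead. Note that both versions only translate upper-case A, C, G and T; lower-case or ambiguous bases such as N are reversed but otherwise pass through unchanged.
If you need the sequence name, paste the result back onto the original file. The reverse complement is appended as an extra last column.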

cut -f2 file | rev | sed 'y/CGAT/GCTA/' | paste file -
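If only the name and the reverse complement are wanted, the original sequence column can be dropped with one more cut. This is a minimal sketch that assumes the file has exactly two columns (name, sequence); the temporary file and its contents are hypothetical:

```shell
# Hypothetical two-column input written to a temporary file
tsv=$(mktemp)
printf 'seq1\tAACG\nseq2\tGGTTA\n' > "$tsv"

# paste adds the reverse complement as column 3; keep columns 1 and 3 only
named=$(cut -f2 "$tsv" | rev | sed 'y/CGAT/GCTA/' | paste "$tsv" - | cut -f1,3)
printf '%s\n' "$named"
# prints (tab-separated):
# seq1	CGTT
# seq2	TAACC

rm -f "$tsv"
```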

Starting with a FASTA file

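Now starting with a FASTA file. This approach assumes each record takes exactly two lines (a header line followed by a single sequence line), because paste - - joins consecutive pairs of lines with a tab. Avoid tabs and special characters in the sequence name, since a tab in the header would shift the columns that cut selects.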

paste - - < file > tmp
cut -f2 tmp | rev | sed 'y/CGAT/GCTA/' \
| paste tmp - | cut -f1,3 | tr '\t' '\n'
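The pipeline above needs one sequence line per record. If the FASTA file wraps sequences over several lines, one option is to join them first. The following is a sketch using awk, not part of the original recipe; the example records and temporary file names are hypothetical:

```shell
# Hypothetical wrapped FASTA: seq1 is split over two lines
fa=$(mktemp); linear=$(mktemp); tab=$(mktemp)
printf '>seq1\nAACG\nGT\n>seq2\nGGTTA\n' > "$fa"

# Join wrapped sequence lines so each record is header + one sequence line
awk '/^>/ { if (seq != "") print seq; print; seq = ""; next }
     { seq = seq $0 }
     END { if (seq != "") print seq }' "$fa" > "$linear"

# Same steps as above, applied to the two-line-per-record file
paste - - < "$linear" > "$tab"
result=$(cut -f2 "$tab" | rev | sed 'y/CGAT/GCTA/' \
  | paste "$tab" - | cut -f1,3 | tr '\t' '\n')
printf '%s\n' "$result"
# prints:
# >seq1
# ACCGTT
# >seq2
# TAACC

rm -f "$fa" "$linear" "$tab"
```

The output is written with one sequence line per record; re-wrapping to a fixed width, if needed, would be a separate step.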

