SRA toolkit tips and workarounds
The Sequence Read Archive (SRA), formerly known as the Short Read Archive, is the main repository for raw next generation sequencing (NGS) data. Considering the sheer, and still accelerating, rate at which NGS data is generated, the team at NCBI should be congratulated for providing this service to the scientific community. Take a look at the growth of SRA data (http://www.ncbi.nlm.nih.gov/Traces/sra/i/g.png).

However, SRA doesn't directly provide the fastq files that we commonly work with; they prefer the .sra archive format, which requires specialised software (sra-toolkit) to extract. Sra-toolkit has been described as buggy and painful, and I've had my frustrations with it. In this post, I'll share the best sra-toolkit tips that I've found.

Get the right version of the software and configure it

When downloading, make sure you get the newest version from the NCBI website (link). Don't download it from GitHub or from the Ubuntu Software Centre (or apt-get), as it will probably be an older