Posts

Showing posts with the label sra-toolkit

Download SRA data with Aspera command line utility

Aspera Connect is NCBI's recommended data transfer client for large datasets (>1 GB). It uses the FASP protocol; here is a description from the NCBI guide:
"The FASP protocol from Aspera (www.asperasoft.com) uses UDP, eliminating the latency issues seen with TCP, and provides bandwidth up to 1 gigabit per second (Gbps) to transfer data. It has a restart capability if data transfer is interrupted midstream and is well behaved, so if there is other data traffic on your network connections, it will back off in order to avoid starving other protocols. We have seen effective throughput up to 800 megabits per second (Mbps) to a single site.
The fasp protocol uses UDP port 33001-33009 for data transfer and you may need to contact your IT security staff if this port is not open to NCBI through your institutional firewalls.
NCBI is implementing Aspera for two use cases, occasional users who download files for direct use (Aspe…

SRA toolkit tips and workarounds

The Sequence Read Archive (SRA, formerly the Short Read Archive) is the main repository for raw next generation sequencing (NGS) data. Considering the sheer, and still accelerating, rate at which NGS data is generated, the team at NCBI should be congratulated for providing this service to the scientific community. Take a look at the growth of SRA:

SRA, however, doesn't directly provide the fastq files that we commonly work with; it prefers the .sra archive format, which requires specialised software (sra-toolkit) to extract. Sra-toolkit has been described as buggy and painful, and I've had my frustrations with it. In this post, I'll share some of the best sra-toolkit tips that I've found.
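As a rough illustration of the extraction step mentioned above, here is a minimal Python sketch that wraps sra-toolkit's `fastq-dump` command. It assumes `fastq-dump` is on your PATH; the helper names (`build_fastq_dump_cmd`, `extract_fastq`), the file name `SRR000001.sra` and the `fastq` output directory are placeholders of mine, not something from the toolkit or from this post. Treat it as one possible way to script the call rather than the recommended workflow.

```python
import shutil
import subprocess
from pathlib import Path


def build_fastq_dump_cmd(sra_file, outdir=".", paired=True, gzip=True):
    """Build the argument list for a fastq-dump call without running it."""
    cmd = ["fastq-dump", "--outdir", str(outdir)]
    if paired:
        # Write each read of a spot to its own file (e.g. _1 and _2).
        cmd.append("--split-files")
    if gzip:
        # Compress the fastq output.
        cmd.append("--gzip")
    cmd.append(str(sra_file))
    return cmd


def extract_fastq(sra_file, outdir=".", paired=True, gzip=True):
    """Run fastq-dump on a local .sra file (hypothetical helper)."""
    if shutil.which("fastq-dump") is None:
        raise FileNotFoundError("fastq-dump not found on PATH")
    Path(outdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_fastq_dump_cmd(sra_file, outdir, paired, gzip),
        check=True,  # raise if fastq-dump exits with an error
    )


if __name__ == "__main__":
    # Placeholder file name; only prints the command that would be run.
    print(" ".join(build_fastq_dump_cmd("SRR000001.sra", outdir="fastq")))
```

Keeping the command construction separate from the execution makes it easy to check what would be run before launching a long extraction.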

Get the right version of the software and configure it

When downloading, make sure you download the newest version from the NCBI website (link). Don't download it from GitHub or from Ubuntu software centre (or apt-get), as it will probably be an older version. In the binary directory (looks like /path/to/sratoolkit.2.4.3-ubuntu64/bin…