Mass download from Google Drive using R

Google Drive is great for sharing documents and other small files, but it's definitely not suited to moving many large files around. For example, I just received 170 fastq files of about 200 MB each. If you use the browser to download the whole folder, the web app will zip the contents for you, which takes a LOOONG time. Alternatively, you can download each and every one of those files one by one, which is tedious and prone to human error.

You can ask your collaborators to transfer the data a different way, but there aren't many user-friendly and economical options. Your biologist collaborators probably won't be able to use rsync to get the data to you safely, and fast, convenient services for moving large files around, like Hightail, cost a lot of money.

A good solution to this problem is the R package googledrive, which enables command-line automation of tasks that would take a long time to do manually. The package vignette gives a good overview of the main commands but lacks the most important application for me: bulk downloads.
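If you haven't used the package before, install it and authenticate first. A quick setup sketch; drive_auth() opens a browser window on first use so you can grant the package access to your Drive account:

install.packages("googledrive")
library(googledrive)

# authenticate with your Google account (opens a browser on first use)
drive_auth()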

Let's say you have a folder of data files and you want to download them all. Try the following:

library(googledrive)

# list every file in the target Drive folder
myfiles <- drive_ls(path = "~/myfoldername")

# download each file by its Drive id into the current working directory;
# as_id() marks the strings as file ids, otherwise drive_download would
# treat them as file names/paths
sapply(myfiles$id, function(id) drive_download(as_id(id), overwrite = TRUE))

This downloads all of those files into the current working directory, and in my experience the transfer speeds are much faster than in the browser.
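If you want a bit more control, for example grabbing only the fastq files and saving them under a dedicated directory, something along these lines should work; note that the folder name, the output directory and the file pattern here are placeholders for your own:

# list only files whose names match a regular expression
fastqs <- drive_ls(path = "~/myfoldername", pattern = "\\.fastq")

# download each match into data/, keeping its Drive file name
dir.create("data", showWarnings = FALSE)
mapply(function(id, name) {
  drive_download(as_id(id), path = file.path("data", name), overwrite = TRUE)
}, fastqs$id, fastqs$name)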

Simple!

