2016 wrap-up

What a rollercoaster.

On the good side, we've seen advances in sequencing methods, including improvements in Nanopore sequencing and single cell methods becoming more common. We also saw 10x Genomics come to the party with an emulsion PCR approach that can be applied to produce synthetic long reads or single cell barcoding. There were major results from larger cohort studies, exemplified by ExAC, which is revealing more about human genetic variation, while Blueprint and GTEx are revealing more about determinants of gene regulation. The #openaccess movement is growing rapidly in the bioinformatics community, along with the popularity of preprints, which I hope are adopted more widely in the biomedical sciences.

On the not so good side, Illumina has made no new instrument announcements nor any substantial updates to existing sequencing systems. There were a few papers (example) describing the methylation EPIC array announced in 2015. Prices for Illumina reagents continue to increase, especially in Australian dollars, which is starting to erode some of the goodwill among researchers.

Closer to home, I had quite a mixed year in science.

You'll recall that we announced Digital Expression Explorer in October of 2015. It was the first real attempt at mining the entire contents of transcriptome sequencing in SRA and making it a useful resource for non-bioinformaticians. While the site received a lot of positive feedback from the community, our submission to Bioinformatics didn't go well. There were a few sticking points with the reviewers, the most important ones being the process of dataset QC and filtering, as well as the use of single reads even when the second read was present. We made major updates to the site to please them, but the reviewers didn't really comprehend that when dealing with >10^6 datasets there will not be a pipeline that works perfectly for 100% of them. The rejection from Bioinformatics was bitter, but I'm not a quitter - there will be a new version landing in mid-2017 that will have more advanced features, including transcript level quantification and customisable gene sets to choose from (Ensembl and RefSeq). While there have been similar efforts like ReCount, Rail-RNA, BioXpress and EBI Expression Atlas, no one has yet released as many curated RNA-seq datasets from as many species as we have.

This year saw my first pre-print, on which I have received quite a bit of feedback, some good and some bad. There might be a revision to the paper in late 2017 and eventually submission to a journal.

There was a small win for our group's paper on small RNA mappers, which was a huge effort. In each round of review, the reviewers requested more analyses and so the published paper is about twice as long as the initial submission. The original submission was much better ;)

We had a smashing success with the publication of the Gene Name Errors paper in Genome Biology. It was really a wake-up call to the field that we should be more careful with the data we publish and should be adopting better data analysis practices. It also shone a light on many of the issues that are emerging in supplementary content. As a whole, scientists rely strongly on supplementary materials, but these are given little, if any, scrutiny.

Throughout the year there were middle authorships in papers related to our group's work on epilepsy, metabolic syndrome, stem cell differentiation and heart failure. These co-authorships reflect my bread-and-butter work: contributing to the generation and analysis of sequencing data in multidisciplinary collaborations.

This blog received its 200,000th visit! Although this achievement is somewhat diminished by huge amounts of bot traffic.
Periodic spikes in web visits are indicative of bot activity

2017: A new beginning for all of us*

Many of us lost someone this year. It could be a favorite musician, an artist or someone close to us. Through loss we realise the fragility of life and the importance of not wasting our time here. 2017 is likely to throw as much adversity at us as 2016 did, so take this chance to take stock and live this year as if it were your last. If 2016 taught me anything, it's that we should be working on stuff that interests people and dumping obscure projects that aren't going anywhere (you'll have to be honest/blunt with collaborators). Reflect on the good things around you this year. I welcomed my newborn daughter in October and saw my 4yo son develop in so many ways. We even moved into a place of our own.

A few weeks ago, our group (El-Osta) moved from Baker IDI to Monash Uni, situated a few doors down on the Alfred Hospital campus. Initially I was unenthused about this, but the change does allow the group to reinvigorate itself and take on new opportunities that would not be possible at Baker IDI. The move killed my only PI project, but we'll be seeking new funds to replace the lost Baker IDI internal grant.

I can give a few clues as to what I'll be working on in 2017: A new version of DEE; a bunch of new gene set resources; new work on the pitfalls of inbred mouse models; MinION data from our lab; single cell RNA-seq data from our lab; and perhaps a paper called "Supplementary data horror stories".

I'm looking forward to making 2017 count. It's a new start, a chance to do better.

*Apologies to Left of Field Record for paraphrasing this record title
