Posts

Showing posts with the label edgeR

Incorporate dee2 data into your R-based RNA-seq workflow

Dee2.io is a portal for accessing gene expression data derived from public RNA-seq datasets. So far there are over 400k available datasets and its growing every day. While there are existing databases of such as Expression Atlas, Recount2 and ARCHS4, dee2.io offers a number of unique benefits. For instance, dee2 includes gene-wise counts fron STAR as well as transcript-wise quantifications from Kallisto. There are a few ways you can access these data. Firstly, there is a nice web interface that is mobile friendly. Secondly, there are data dumps available if you are running a large scale analysis.  But the purpose of this post is to demonstrate the improved R interface in action together with SRAdbv2 and statistics with edgeR and DESeq. The official documentation is available on GitHub.
Getting started This tutorial provides a walkthrough for how to work with dee2 expression data, starting with dataset searches, obtaining the data from dee2.io and then performing a differential analysi…

User friendly RNA-seq differential expression analysis with Degust

Image
There is a need to make bioinformatics tools more user friendly and accessible to a wider audience. We have seen that Galaxy, GEO2RGenevestigator and GenePattern have each developed a huge following in the molecular biology community, and this trend will continue with introduction of new RNA-seq analysis tools. Previously, I posted about differential gene expression analysis of RNA-seq performed by the DEB online tool. In this post, I introduce Degust, an online app to analyse gene expression count data and determine which genes are differentially expressed. Degust was written by David R. Powell (@d_r_powell) and was Supported by Victorian Bioinformatics Consortium, Monash University and VLSCI's Life Sciences Computation Centre.

In this test, I'll be using the azacitidine mRNA-seq data set that I have previously analysed. To make the count matrix, I used featureCounts.

First step in the process is to your RNA-seq count data. It can be done in tab or comma separated formats. …

Data analysis step 5: Differential analysis of RNA-seq

Image
So far in this RNA-seq analysis series of posts, we've done a whole bunch of primary analysis on GSE55125 and now we are at the stage where we can now perform a statistical analysis of the count matrix we generated in the last post and look at the genes expression differences caused by Azacitidine.

For this type of analysis we could load our data into R and perform the analysis ourselves, but for a simple experiment design with 2 sample groups in triplicate without batch effects or sample pairing I want to share with you an easy solution. DEB is a online service provided by the  Interdisciplinary Center for Biotechnology Research (ICBR) University of Florida that will analyse the count matrix for you with either DESeq, edgeR or baySeq. Their Bioinformation paper is also worth a look.

As with all aspects of bioinformatics, format is critical. You need to follow the specified format exactly. Here is what the head of my count matrix looks like:

geneUNTR1UNTR2UNTR3AZA1AZA2AZA3
ENSG000…