Data analysis step 7: Fast MDS plot

September 12, 2014

We continue our series on the analysis of our azacitidine-treated AML3 cell RNA-seq gene expression data set by generating a multidimensional scaling (MDS) plot. This is a useful way of showing variability between samples, especially when the number of samples is large. Trawling the blogs, I found a really quick and easy way to do this in R (thanks Michael Dondrup@BioStars) that works directly on the count matrix: scale the counts, compute the distances between samples, and run classical MDS with cmdscale.


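# Read the count matrix (genes in rows, samples in columns) and scale each sample column
x <- scale(read.table("CountMatrix.xls", row.names=1, header=TRUE))

# Classical MDS on the Euclidean distances between samples (computed once, reused below)
mds <- cmdscale(dist(t(x)))

# Draw an empty plot, then place the sample names at their MDS coordinates
pdf("MDSplot.pdf")
plot(mds, xlab="Coordinate 1", ylab="Coordinate 2", type="n")
text(mds, labels=colnames(x))
dev.off()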

Multidimensional scaling (MDS) plot for public gene expression data (GSE55123).

The closer the labels are together, the more similar the samples are. So it is good to see that the untreated samples are clearly separated from the azacitidine-treated samples. UNTR1 sits a bit further away from the other two untreated replicates, as suggested by the heatmap in the previous post.
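
A two-dimensional MDS plot only shows part of the variation, so it can be worth checking how much the first two coordinates actually capture. The sketch below is one way to do that with cmdscale(..., eig=TRUE). It runs on a simulated count matrix rather than the real data, and the gene names, the AZA sample names and the size of the treatment shift are all made up for illustration.

```r
# Simulated stand-in for the count matrix: 200 genes x 6 samples.
# The AZA* sample names and gene names are hypothetical, for illustration only.
set.seed(1)
counts <- matrix(rpois(200 * 6, lambda = 50), nrow = 200,
                 dimnames = list(paste0("gene", 1:200),
                                 c("UNTR1", "UNTR2", "UNTR3", "AZA1", "AZA2", "AZA3")))

# Shift the first 50 genes in the "treated" samples so the two groups differ
counts[1:50, 4:6] <- counts[1:50, 4:6] + 40

# Same steps as in the post: scale each sample column, then classical MDS
# on the Euclidean distances between samples, this time keeping the eigenvalues
x <- scale(counts)
fit <- cmdscale(dist(t(x)), k = 2, eig = TRUE)

# Share of the positive eigenvalue sum carried by each of the first two coordinates
pos_eig <- fit$eig[fit$eig > 0]
prop <- fit$eig[1:2] / sum(pos_eig)
print(round(prop, 3))

# The coordinates are in fit$points when eig = TRUE
plot(fit$points, xlab = "Coordinate 1", ylab = "Coordinate 2", type = "n")
text(fit$points, labels = colnames(x))
```

If the two proportions add up to only a small fraction, the distances between labels on the plot should be read with more caution.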
