Posts

Showing posts with the label genecloud

Geneclouds: unconventional genetics data visualisation

Image
You have probably seen word clouds before - but have you tried with gene expression data?

I used the following bash script to process a DESeq spreadsheet from a a previous RNA-seq post. The script extracts the gene name and p-value of the genes with differential expression. I used awk to separate the up and down regulated genes into different files. The score used to inform the font size is the exponent of the p-value. So this works best when there are a lot of statistically significant genes p-values. The data looks like this:

AC011899.9:60
CAMK1D:54
GNG4:44
CGNL1:41
APCDD1:37
HSD11B1:33

Now go to the "advanced" tab of the Wordle page and paste in the data. Experiment, with the colours, layouts and be sure to increase the "maximum words" to get a real appreciation for the number of changes in your experiment. Here is an example I made using both the up and down regulated gene sets showing the effect of azacytidine on AML3 cells. The result is pretty amazing.
Scrip…