Geneclouds: unconventional genetics data visualisation
You have probably seen word clouds before, but have you tried making one with gene expression data?
I used the bash script below to process a DESeq spreadsheet from a previous RNA-seq post. The script extracts the gene name and p-value of each differentially expressed gene, and uses awk to separate the up- and down-regulated genes into different files. The score that sets the font size is the magnitude of the p-value's exponent (for example, a p-value of 1e-60 gives a score of 60), so this works best when many genes have strongly significant p-values. The data looks like this:
AC011899.9:60
CAMK1D:54
GNG4:44
CGNL1:41
APCDD1:37
HSD11B1:33
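To make the scoring concrete, here is a minimal sketch that applies the same printf, sed and cut steps to a single p-value. The function name `pval_score` is hypothetical and only used for illustration; it is not part of the original script.

```shell
# Hypothetical helper: print the word-cloud score for one p-value.
# The score is the magnitude of the exponent in scientific notation.
pval_score() {
  # Format the value as e.g. 1.300e-60, swap "e-" for "@",
  # then keep only what follows the "@" (the exponent digits).
  awk -v p="$1" 'BEGIN {printf "%4.3e\n", p}' \
  | sed 's/e-/@/' | cut -d '@' -f2-
}

pval_score 1.3e-60   # prints 60
```

Because the exponent is printed with at least two digits, a modest p-value such as 0.004 comes out as 03 rather than 3.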
Now go to the "advanced" tab of the Wordle page and paste in the data. Experiment with the colours and layouts, and be sure to increase the "maximum words" setting to get a real appreciation for the number of changes in your experiment. Here is an example I made using both the up- and down-regulated gene sets, showing the effect of azacytidine on AML3 cells. The result is pretty amazing.
Script used to process data
# Up-regulated genes: column 6 > 0 (direction of change) and column 8 (p-value) < 0.05.
awk '$6>0 && $8<0.05 {print $1,$8}' DESeq.xls \
| awk '{printf "%4.3e\t%s\n", $2, $1}' \
| sed 's/e-/@/' | cut -d '@' -f2- | awk '{print $2":"$1}' > ups.txt
# Down-regulated genes: same steps, but with column 6 < 0.
awk '$6<0 && $8<0.05 {print $1,$8}' DESeq.xls \
| awk '{printf "%4.3e\t%s\n", $2, $1}' \
| sed 's/e-/@/' | cut -d '@' -f2- | awk '{print $2":"$1}' > dns.txt
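The example cloud mentioned above used both gene sets. One possible way to combine the two files and see how many genes there are (a rough guide for the "maximum words" setting) is sketched below. The function name `merge_scores` and the output file `all.txt` are hypothetical, and the sketch assumes every line has the GENE:SCORE form shown earlier.

```shell
# Hypothetical helper: merge GENE:SCORE files, highest scores first.
# Arguments: one or more files, e.g. ups.txt dns.txt
merge_scores() {
  # Split on ":" and sort numerically, in reverse, on the score field.
  cat "$@" | sort -t ':' -k2,2nr
}

# Example usage, assuming ups.txt and dns.txt exist:
#   merge_scores ups.txt dns.txt > all.txt
#   wc -l < all.txt    # number of genes in the combined list
```

Sorting is optional for the word cloud itself, but it makes it easy to check the top-scoring genes before pasting the list in.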