Pathway analysis: DAVID versus GSEA
Pathway analysis has become a common procedure in bioinformatics, especially in gene expression studies. A survey of recent papers shows that there are many ways to do it. In this post, I'll discuss the differences between two commonly used tools: DAVID and GSEA. The concepts behind the two algorithms are very different.

DAVID determines overlaps between user-supplied gene lists and curated databases, looking for overlaps that are bigger than expected by random chance. The user-supplied gene lists are normally generated by selecting genes that pass a significance threshold. The choice of threshold is largely arbitrary, but it is common to use an FDR-adjusted p<0.05. You can improve the accuracy of the algorithm by providing a background file that contains all genes that were considered/detected in the experiment. The DAVID procedure is similar to that of other tools such as Ingenuity, AmiGO and GeneGO.
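To make the overlap idea concrete, here is a minimal sketch of an over-representation test using the plain hypergeometric distribution. It is not DAVID's exact calculation (DAVID reports a modified Fisher exact p-value, the EASE score), and the function names and gene identifiers below are made up for illustration. The background set plays the role of the background file mentioned above.

```python
from math import comb


def overrepresentation_p(list_hits, list_size, pathway_size, background_size):
    """Upper-tail hypergeometric p-value: the chance of seeing at least
    `list_hits` pathway genes in a random draw of `list_size` genes
    from a background of `background_size` genes."""
    max_hits = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


def enrich(gene_list, pathway, background):
    """Return (overlap size, p-value) for one gene list against one pathway.
    Genes outside the background are ignored, as they could never be detected."""
    background = set(background)
    gene_list = set(gene_list) & background
    pathway = set(pathway) & background
    overlap = gene_list & pathway
    p = overrepresentation_p(
        len(overlap), len(gene_list), len(pathway), len(background)
    )
    return len(overlap), p


# Hypothetical toy data: 20 detected genes, a 5-gene pathway,
# and 4 genes that passed the significance threshold.
background = {f"g{i}" for i in range(20)}
pathway = {"g0", "g1", "g2", "g3", "g4"}
significant = {"g0", "g1", "g2", "g10"}

n_overlap, p_value = enrich(significant, pathway, background)
print(n_overlap, round(p_value, 3))
```

With this toy data, three of the four significant genes fall in the pathway, which is more than you would expect by chance from a 20-gene background. Changing the background set changes the p-value, which is why supplying the genes actually detected in the experiment matters.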