Get the newest Reactome gene sets for pathway analysis
For first-pass pathway analysis, I find Reactome to be the gene set database that biologists understand most easily. For a long time I have used the Reactome gene sets as deposited on the GSEA/MSigDB website. Recently a colleague pointed me to the gene matrix file offered directly on the Reactome website (thanks, Dr Okabe). The latest Reactome gene set matrix (GMT) file can be found at this link:

https://reactome.org/download/current/ReactomePathways.gmt.zip

There are some differences between the two files. Firstly, the file from the Reactome website (accessed 2018-05-09) contains more gene sets. Each line of a GMT file is one gene set, so counting lines counts gene sets:

$ wc -l *gmt
  674 c2.cp.reactome.v6.1.symbols.gmt
 2022 ReactomePathways.gmt
 2696 total

Secondly, more genes are included in one or more gene sets. The genes start at the third tab-separated field of each line, so the unique entries from field 3 onward give the number of distinct genes:

$ cut -f3- c2.cp.reactome.v6.1.symbols.gmt | tr '\t' '\n' | sort -u | wc -l
6025
$ cut -f3- ReactomePathways.gmt | tr '\t' '\n' | sort -u | wc -l
10852

And overall the...