What is a CpG shore and how do I get them all?
CpG shores are the regions immediately flanking CpG islands, extending up to 2 kbp away from them. These regions are interesting because they are variably methylated in cancer and development (see reference).
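The definition above boils down to simple interval arithmetic. The following is a minimal sketch, assuming 1-based inclusive coordinates (as in GFF) and a fixed 2 kbp flank; the function name `island_shores` and its parameters are hypothetical and only for illustration.

```python
def island_shores(start, end, flank=2000, chrom_length=None):
    """Return (upstream, downstream) shore intervals for one CpG island.

    Coordinates are 1-based and inclusive. A shore is None when the island
    sits at a chromosome end and no flanking sequence remains.
    """
    # Upstream shore: up to `flank` bases before the island, not below position 1.
    up_start = max(1, start - flank)
    up_end = start - 1
    # Downstream shore: up to `flank` bases after the island.
    down_start = end + 1
    down_end = end + flank
    if chrom_length is not None:
        # Clamp to the chromosome end when its length is known.
        down_end = min(chrom_length, down_end)
    upstream = (up_start, up_end) if up_start <= up_end else None
    downstream = (down_start, down_end) if down_start <= down_end else None
    return upstream, downstream


# Example: an island at positions 10469 to 11240.
print(island_shores(10469, 11240))
# → ((8469, 10468), (11241, 13240))
```

This sketch treats each island independently, so the shores of closely spaced islands may overlap one another or overlap a neighboring island. Merging or trimming those overlaps is left out here.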
To identify CpG shores, you first need a list of CpG island coordinates. If you have a Linux computer with the EMBOSS tools installed, you can determine CpG island positions with the cpgreport command. (If you don't have EMBOSS installed, you can get CpG island coordinates from the UCSC Table Browser; just be wary of chromosome naming, i.e. "chr1" vs "1".)
cpgreport -score 17 -outfile genome.fa.cpg -outfeat /dev/stdout genome.fa | head
This produces GFF output like the following:
##gff-version 3
##sequence-region 1 1 59999934
#!Date 2015-04-10
#!Type DNA
#!Source-version EMBOSS 6.6.0.0
1	cpgreport	sequence_feature	10469	11240	1299	+	.	ID=1.1
1	cpgreport	sequence_feature	11284	11301	37	+	.	ID=1.2
1	cpgreport	sequence_feature	11343	11347	32	+	.	ID=1.3
1	cpgreport	sequence_feature	11404	11405	17	+	.	ID=1.4
1…
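To work with this output programmatically, the island coordinates first have to be pulled out of the GFF lines. Below is a rough sketch, assuming standard tab-separated GFF3 with nine columns; the function name `parse_islands` and the `add_chr_prefix` option are hypothetical, the latter being one possible way to handle the "chr1" vs "1" naming difference mentioned earlier.

```python
def parse_islands(gff_lines, add_chr_prefix=False):
    """Collect (chrom, start, end) tuples from GFF3 feature lines."""
    islands = []
    for line in gff_lines:
        line = line.rstrip("\n")
        # Skip blank lines and the '##' / '#!' header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip truncated or malformed lines that lack the nine GFF columns.
        if len(fields) < 9:
            continue
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        # Optionally convert Ensembl-style names ("1") to UCSC-style ("chr1").
        if add_chr_prefix and not chrom.startswith("chr"):
            chrom = "chr" + chrom
        islands.append((chrom, start, end))
    return islands


example = [
    "##gff-version 3",
    "1\tcpgreport\tsequence_feature\t10469\t11240\t1299\t+\t.\tID=1.1",
    "1\tcpgreport\tsequence_feature\t11284\t11301\t37\t+\t.\tID=1.2",
]
print(parse_islands(example, add_chr_prefix=True))
# → [('chr1', 10469, 11240), ('chr1', 11284, 11301)]
```

An open file handle for a saved GFF file can be passed in place of the example list, since the function only iterates over lines.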