Extract TSS regions from a GTF file with GTFtools
Since stumbling upon GTFtools recently, I found that it has another interesting use: generating coordinate sets around transcriptional start sites (TSSs). This is really important for ChIP-seq analysis when we want to compare, for example, the strength of enrichment of histone modifications at TSSs with RNA expression. Using GTFtools, it is a one-line command to extract these positions:
gtftools.py -t Homo_sapiens.GRCh38.94.gtf.tss.bed -w 1000 Homo_sapiens.GRCh38.94.gtf
Here "-t" is the output file flag, "-w" is the desired distance to cover around each TSS (in this case +/- 1000 bp), and the last argument is the input GTF file, which needs to be from Ensembl or Gencode (other sources don't work due to differences in formatting).
If I had to do this without GTFtools, it would end up being more complicated, as TSS positions (the start of exon 1) would need to be extracted from the GTF file separately for the top and bottom strands and then merged.
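To give a feel for the manual route, here is a minimal Python sketch. It is not how GTFtools does it internally, and its output is not guaranteed to match the GTFtools BED file. The function name tss_windows and the example lines are made up for illustration. It assumes the GTF contains "transcript" feature lines (as Ensembl and Gencode files do), takes the transcript start as the TSS on the top strand and the transcript end on the bottom strand, and converts from 1-based GTF coordinates to 0-based, half-open BED coordinates.

```python
def tss_windows(gtf_lines, window=1000):
    """Yield (chrom, start, end, strand) BED-style windows around each TSS.

    Hypothetical helper for illustration; assumes 'transcript' features exist.
    """
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript records carry the coordinates we need
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])  # GTF: 1-based, inclusive
        if strand == "+":
            tss = start  # top strand: TSS is the leftmost base
        elif strand == "-":
            tss = end    # bottom strand: TSS is the rightmost base
        else:
            continue     # unstranded records have no defined TSS
        # BED: 0-based start, exclusive end; clamp at the chromosome start
        bed_start = max(0, tss - 1 - window)
        bed_end = tss + window
        yield chrom, bed_start, bed_end, strand


if __name__ == "__main__":
    # Two made-up transcript lines, one per strand
    example = [
        '1\thavana\ttranscript\t11869\t14409\t.\t+\t.\tgene_id "G1";',
        '1\thavana\ttranscript\t14404\t29570\t.\t-\t.\tgene_id "G2";',
    ]
    for chrom, start, end, strand in tss_windows(example):
        print(chrom, start, end, strand, sep="\t")
```

Handling both strands in a single pass avoids the separate extract-and-merge step, but overlapping windows from different transcripts of the same gene are not collapsed here, so the output may still need sorting and de-duplication.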