A selection of useful bash one-liners
Here's a selection of one-liners that you might find useful in data analysis. I'll update this post regularly and hope you can share some of your own gems.
Compress an archive:
    tar cf backup.tar.bz2 --use-compress-program=pbzip2 directory
Sync a folder of data to a backup (the trailing slash on the source copies its contents into /backup/project1/ rather than creating /backup/project1/project1):
    rsync -azhv /scratch/project1/ /backup/project1/
Extract a tar archive:
    tar xvf backup.tar
Head a compressed file:
    bzcat file.bz2 | head
    pbzip2 -dc file.bz2 | head
    zcat file.gz | head
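If you want to try the archive and compressed-file commands without touching real data, here is a minimal sketch that works in a throwaway directory. The file and directory names are made up for illustration, and it uses plain `tar` plus `gzip -dc` (which does the same job as `zcat` on GNU systems) so that it doesn't assume pbzip2 is installed.

```shell
# Work in a temporary directory; all names below are hypothetical
tmpdir=$(mktemp -d)
mkdir "$tmpdir/directory"
seq 1 1000 > "$tmpdir/directory/numbers.txt"

# Archive the directory; -C changes into tmpdir first so stored paths stay relative
tar cf "$tmpdir/backup.tar" -C "$tmpdir" directory

# Extract into a separate location and count the lines of the restored file
mkdir "$tmpdir/restore"
tar xf "$tmpdir/backup.tar" -C "$tmpdir/restore"
restored_lines=$(wc -l < "$tmpdir/restore/directory/numbers.txt" | tr -d ' ')
echo "restored lines: $restored_lines"

# Compress a copy, then look at the first three lines without writing
# a decompressed file to disk
gzip -c "$tmpdir/directory/numbers.txt" > "$tmpdir/numbers.txt.gz"
first_lines=$(gzip -dc "$tmpdir/numbers.txt.gz" | head -n 3)
echo "$first_lines"

# Clean up
rm -rf "$tmpdir"
```

On large files the same pattern is cheap, because once `head` has its lines it closes the pipe and the decompressor stops early instead of running to the end of the file.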
Uniq on one column (field 2):
    awk '!arr[$2]++' file.txt
Explode a file on the first field:
    awk '{print > $1}' file1.txt
Sum a column:
    awk '{ sum+=$1} END {print sum}' file.txt
Put spaces between every three characters:
    sed 's/\(.\{3\}\)/\1 /g' file.txt
Join 2 files based on field 1. Both files need to be properly sorted (use sort -k 1b,1):
    join -1 1 -2 1 file1.txt file2.txt
Join a bunch of files by field 1. Individual files don't need to be sorted but the final output might need to be sor…
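To see what the awk, sed and two-file join one-liners above actually do, here is a small sketch that runs each of them on tiny made-up inputs. All sample values and file names are hypothetical, and everything is written to a temporary directory that is removed at the end.

```shell
# Work in a temporary directory; all data below is made up for illustration
tmpdir=$(mktemp -d)

# Uniq on field 2: keep only the first line seen for each value of field 2
printf 'a x\nb y\nc x\nd z\n' > "$tmpdir/file.txt"
uniq_out=$(awk '!arr[$2]++' "$tmpdir/file.txt")
echo "$uniq_out"        # the line "c x" is dropped because x was already seen

# Explode on field 1: each line goes to a file named after its first field.
# Output files are created in the current directory, so run inside tmpdir.
printf 'chr1 10\nchr2 20\nchr1 30\n' > "$tmpdir/file1.txt"
(cd "$tmpdir" && awk '{print > $1}' file1.txt)
chr1_lines=$(wc -l < "$tmpdir/chr1" | tr -d ' ')
chr2_lines=$(wc -l < "$tmpdir/chr2" | tr -d ' ')
echo "chr1: $chr1_lines lines, chr2: $chr2_lines lines"

# Sum a column
sum_out=$(printf '1\n2\n3.5\n' | awk '{ sum+=$1 } END { print sum }')
echo "$sum_out"

# Put a space after every three characters
spaced_out=$(printf 'ACGTACGT\n' | sed 's/\(.\{3\}\)/\1 /g')
echo "$spaced_out"

# Join two files on field 1, sorting both the way join expects
printf 'b 2\na 1\n' | sort -k 1b,1 > "$tmpdir/left.txt"
printf 'a red\nc blue\nb green\n' | sort -k 1b,1 > "$tmpdir/right.txt"
join_out=$(join -1 1 -2 1 "$tmpdir/left.txt" "$tmpdir/right.txt")
echo "$join_out"        # the key "c" has no partner in left.txt and is dropped

# Clean up
rm -rf "$tmpdir"
```

Note that join only prints keys present in both files, so unmatched lines disappear silently.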