A selection of useful bash one-liners
Here's a selection of one-liners that you might find useful in data analysis. I'll update this post regularly, and I hope you can share some of your own gems.

Compress an archive:
    tar cf backup.tar.bz2 --use-compress-prog=pbzip2 directory

Sync a folder of data to a backup (without a trailing slash on the source, rsync creates project1 inside the destination folder):
    rsync -azhv /scratch/project1 /backup/project1/

Extract a tar archive:
    tar xvf backup.tar

Head a compressed file:
    bzcat file.bz2 | head
    pbzip2 -dc file.bz2 | head
    zcat file.gz | head

Uniq on one column (field 2):
    awk '!arr[$2]++' file.txt

Explode a file on the first field:
    awk '{print > $1}' file1.txt

Sum a column:
    awk '{ sum+=$1} END {print sum}' file.txt

Put spaces between every three characters:
    sed 's/\(.\{3\}\)/\1 /g' file.txt

Join 2 files based on field 1. Both files need to be properly sorted (use sort -k 1b,1):
    join -1 1 -2 1 file1.txt file2.txt

Join a bunch of files by field 1. Individual files don't need to be sorted but the