UNIX Shell/Command-Line Primer
I’ll start with the unintuitive things that took me the longest to learn. It can be hard to tell which commands are standalone programs and which are built into bash… sometimes there is both a program version and a shell version, for example test. It’s very UNIX-y to have even the most basic things be programs, like ls. Some things must be built-in, like fg/bg (foreground and background jobs), for, etc.
You get help with programs using --help or man. You get help with bash built-ins using help (eg, help for). Built-ins take precedence over programs on the $PATH if they have the same name. You can check if a program exists with whereis. help on its own gives a listing of the built-in commands. Some examples of commands that exist as both built-ins and programs:
echo
true/false
[ (sic!)
kill
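To see which version the shell would actually run, command -v (part of POSIX) is handy: in a script it prints a bare name for a built-in and a full path for a program found on the $PATH. A minimal sketch; the variable names are only for illustration, and the path printed for ls will differ between systems:

```shell
# command -v prints how the shell resolves a name:
# a bare name for a built-in, a full path for an external program.
echo_kind=$(command -v echo)   # the built-in wins over /bin/echo
ls_kind=$(command -v ls)       # normally an external program on $PATH

echo "echo -> $echo_kind"
echo "ls   -> $ls_kind"

# In bash specifically, 'type -a echo' lists every match,
# the built-in first and then any programs on $PATH.
```

In an interactive shell the ls line may show an alias instead of a path, since aliases are resolved before anything else.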
There are basically three categories of UNIX shell:
- simple standards-compliant (POSIX) shell: /bin/sh
- bash-compatible: /usr/bin/bash, zsh
- non-bash compatible: ksh, csh, fish, xonsh, etc
For things to run on any UNIX machine (including macOS), stick to POSIX shell… but in reality bash is almost always installed (macOS ships it, though the default shell there has been zsh since Catalina; FreeBSD does not install it by default). The bash/zsh/csh thing is like emacs/vi. You can see which you are running with echo $0.
To make a script on UNIX, you make the file executable (chmod +x thefile.sh), and add a “shebang” line at the top (the #! is called “shebang”):
#!/usr/bin/bash
You can put any program on the shebang line, and the path of the script file will be passed to that program as an argument:
#!/usr/bin/python3
or
#!/usr/bin/parallel
For portability, and to avoid issues like “python3 is installed under /opt, not /usr/bin”, you can do this:
#!/usr/bin/env python3
The “env” will look up the program by name on the $PATH.
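Putting the pieces together, here is a rough sketch that writes a tiny script, marks it executable, and runs it. The temp directory and the hello.sh name are made up for the example, and /bin/sh is used as the interpreter only because it exists on essentially every UNIX system:

```shell
# Create a throwaway script in a temp directory (names are illustrative).
workdir=$(mktemp -d)
script="$workdir/hello.sh"

# The first line names the interpreter that will be started.
cat > "$script" <<'EOF'
#!/bin/sh
echo "running as: $0"
EOF

chmod +x "$script"      # without this you get "Permission denied"
output=$("$script")     # the kernel effectively runs: /bin/sh /path/to/hello.sh
echo "$output"

rm -rf "$workdir"
```

The script prints its own path because the interpreter is handed the file name as its argument, which then shows up as $0.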
To do a for loop in one line you do:
for l in `ls /somedir`; do echo $l; done
In a script, for better legibility:
for l in `ls /somedir`; do
echo $l
done
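One caveat with looping over the output of ls: the shell splits it on whitespace, so a file name containing a space turns into two loop items. A sketch of the same loop using a glob instead, with a temporary directory and made-up file names standing in for /somedir:

```shell
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt"

count=0
for f in "$dir"/*; do        # the glob yields one item per file, spaces and all
    echo "${f##*/}"          # strip the directory part, like ls would show
    count=$((count + 1))
done
echo "items: $count"         # 2, one per file

rm -rf "$dir"
```

Quoting "$dir" and "$f" is what keeps the names intact; the loop variable holds the full path, so the ${f##*/} expansion trims it back to the bare file name.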
If statements:
if [ -e /some/file ]; then echo 'file exists'; fi
if [ -e /some/file ]; then
echo 'file exists'
fi
You’ll see those brackets (single [ or double [[), and it took me forever to learn about them… help [ and help if are not helpful, you need help test.
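The reason help test is the place to look is that [ is just another name for test, with a closing ] required as its last argument. A small sketch showing the two spellings agreeing (the temp file is only there to have something to check, and the variable names are illustrative):

```shell
f=$(mktemp)                       # a file that definitely exists

# Same check written both ways
if test -e "$f"; then with_test=yes; else with_test=no; fi
if [ -e "$f" ]; then with_bracket=yes; else with_bracket=no; fi

rm -f "$f"
if [ -e "$f" ]; then after_rm=yes; else after_rm=no; fi

echo "test: $with_test, [: $with_bracket, after rm: $after_rm"
```

The double-bracket form [[ is a shell keyword in bash and zsh rather than a command, and it is not part of POSIX sh, so scripts meant for /bin/sh should stick to the single bracket.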
Specific Commands
There are a bajillion command line commands. Here are some helpful categories…
For doing csv/tsv/column data munging:
cut - select columns
grep/rg - select rows (by pattern), use '-v' to invert
join - what it says
sort - note the -n flag for numerical. lexical/case etc are a mess
uniq - collapse adjacent duplicate lines (so sort first), use '-c' to count them
sed - search/replace using regex
rg - more complex search/replace using the -o (only matching) and -r (replace) options
awk - simple string substitutions, re-order columns with 'print'
tr - character-level search/replace or filtering (eg, lowercase)
xsv - a whole lot!
Here’s an example of the above; I run things like this all day:
cut -d' ' -f1 ~/.bash_history | sort | uniq -c | sort -n
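To see what each stage of that pipeline contributes, here is the same idea run on a few made-up history lines instead of the real ~/.bash_history (the sample contents and variable names are assumptions for the example):

```shell
# Stand-in for ~/.bash_history (contents are made up)
history_sample='git status
ls -l
git commit -m wip
cd /tmp
git push
ls'

# first word of each line -> group identical words -> count -> order by count
counts=$(printf '%s\n' "$history_sample" | cut -d' ' -f1 | sort | uniq -c | sort -n)
printf '%s\n' "$counts"

# The last line is the most frequent command; awk picks out the name column
most_used=$(printf '%s\n' "$counts" | tail -n 1 | awk '{print $2}')
echo "most used: $most_used"
```

The first sort is there because uniq only merges lines that are next to each other; the second one orders the counted result.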
For doing batch operations:
xargs - superseded by parallel
parallel - like 'for' but in parallel. can also work on piped stdin. I basically never use 'for' loops after learning this command
find - helpful for finding files, but can also run a command for each file found
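A rough sketch of the "run a command for each file" idea using only find and xargs, since parallel is often not installed by default. The temp directory and file names are invented for the example:

```shell
dir=$(mktemp -d)
touch "$dir/a.log" "$dir/b.log" "$dir/notes.txt"

# find can run a command once per match with -exec ... \;
via_exec=$(find "$dir" -name '*.log' -exec basename {} \; | sort)

# xargs builds command lines from stdin; -print0 / -0 keep odd file names intact
via_xargs=$(find "$dir" -name '*.log' -print0 | xargs -0 -n 1 basename | sort)

echo "$via_exec"

# With GNU parallel installed, a roughly equivalent call would be:
#   find "$dir" -name '*.log' | parallel basename
# (not run here, since it depends on parallel being available)

rm -rf "$dir"
```

Both variants run basename once per matching file; the difference is that parallel would run several of those at the same time.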
Other every-day tools:
jq - the JSON swiss army knife. A whole scripting language
http (httpie) - better than curl/wget for debugging (though maybe not for big downloads)
xsv - can do conversions between csv types
rg/ripgrep - faster + better than grep
pv - show progress
screen/tmux - a whole world of its own