Last modified: September 01, 2023
Unix-based systems, such as Solaris, as well as Unix-like operating systems and environments, such as Linux and Cygwin, have a rich set of commands and utilities, probably more than any other operating system I've used. However, there are a few commands that I use almost daily, and intimate knowledge of what they are, and what they do, can make working with these kinds of systems easier and more effective. Most of them, except for "man" and "ls", are stream commands; that is, they can take input from an I/O stream, and create output on another I/O stream. This makes them extremely powerful. (Actually, output from "man" and "ls" can go to a stream; they just don't read from one.)
I'm not going to go into programming and scripting languages here, as that's outside the scope of the story I want to tell. Discussions of Bourne Shell, Perl, and so on are for another day. There are shades of gray here: "awk" can be used as a programming language, but it's included here only in the form of simple command line tasks. I may also include one or two Perl "one-liners" for reference. Oh, and I'm leaving interactive text editors out of the mix, too.
Again, it's important to understand that the most powerful use of these Unix commands is not standalone, but combined together into streams. The concept of pipes and redirection, where commands can feed input to one another, accept input from arbitrary data files, and provide output to new data files, is one of the things that makes Unix stand out (and, yes, I know that other operating systems, such as MS-DOS, have copied these ideas, but the original Unix implementations are still superior). I was going to put a quick guide to redirection here, but decided to instead put examples inline as I go along. For now, please remember that pipe ("|") takes the output of one command and makes it the input of another command, less-than ("<") redirects into a command from a file, and greater-than (">") redirects from a command into a file. Also keep in mind that there are three main input/output streams: "stdin", or standard input, which is stream 0; "stdout", or standard output, stream 1; and "stderr", or standard error output, stream 2. Additional I/O streams can be created as needed. Pipes and redirects do not handle the standard error stream by default, so error messages generated by commands will not get passed to subsequent commands, or put into files, but will be displayed on your console instead.
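For example, stream 2 can be redirected on its own to split a command's normal output from its error messages (Bourne Shell syntax; the file names here are just placeholders of my own):

```shell
# /no/such/dir doesn't exist, so ls produces both normal output and an error.
# Stream 1 (stdout) goes to listing.txt; stream 2 (stderr) goes to errors.txt.
# (The "|| true" just keeps the failed ls from aborting a "set -e" script.)
ls /etc /no/such/dir > listing.txt 2> errors.txt || true

# To throw error messages away instead, redirect stream 2 to /dev/null.
ls /etc 2> /dev/null

# Clean up the demonstration files.
rm -f listing.txt errors.txt
```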
Finally, I've provided parallel examples in cases where there might be syntax differences between Bourne Shell, which is primarily used for writing scripts, and C-Shell, which is used interactively. For example, to redirect a command's standard output and standard error output to the same file, we'd use this in Bourne Shell:
ls > myfile 2>&1
and this in C-Shell:
ls >& myfile
And, now, on with my list of twelve useful Unix commands:
I originally left this off the list; then, when I realized I had used it to double-check the options to all the other commands shown below, I decided it was too important to omit. The usage is pretty simple:
man awk
Gives the manual page for the awk command. It has a cousin named "apropos", which shows man pages that might be relevant to a particular topic; for example, "apropos filesystem" lists man pages that might have relevant information about filesystems.
For all of the commands that follow, and for the man command itself, take time to read the man pages. There are many more interesting options than what I have presented here.
This may seem kind of obvious, but the obvious uses of "ls" can cloud some of the really cool things it can do. In its simplest form, it lists files in the current directory, in alphabetical order. However, it has a lot of useful options; here are just a few:
Some of these are so useful that we've set up some "alias" commands in the global login files, so that all users can have them as part of their interactive sessions:
The options shown above can be combined in many interesting ways:
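As a sketch of what such combinations look like, here are a few flags I reach for often (these particular picks are mine, and "-S" may not exist on very old systems):

```shell
# Long listing of everything (including dot files), newest first.
ls -alt

# Long listing sorted by file size (-S), reversed (-r) so the smallest come first.
ls -lrS /etc

# Recursive listing of an entire directory tree, one section per directory.
ls -R
```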
At first glance, "cat" seems so simple, you'll wonder why I included it. Its sole purpose is to stream files to standard output, and/or accept streams from standard input. However, there are cases where it works better than just the shell "<" and ">" operators; one is that it can provide a sequential stream from multiple files, whereas "<" only works with one file per instance:
cat file1 file2 file3 | grep "awordwearelookingfor"
Also, it has some options to massage the data, such as "-n" to precede each line with its line number.
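For instance, numbering some input on its way through a pipe:

```shell
# "-n" prefixes each line of output with its line number.
printf 'alpha\nbeta\n' | cat -n
```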
The "tail" command shows the last few lines of a file; the default is 10, but this can be changed. The most interesting command line option is "-f", which "follows" a growing file. In other words, it doesn't exit after printing the last few lines, but hangs on to the file waiting for more lines to appear, so they can be printed. This is extremely useful for monitoring log files.
The "-r" flag, which prints lines in reverse order, is interesting, but I'm not sure how useful it is.
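A couple of typical invocations (the log file named in the comment is a common target for "-f", but any growing file works):

```shell
# Show the last 2 lines instead of the default 10.
printf 'one\ntwo\nthree\nfour\n' | tail -2

# Watch a log file grow in real time (runs until interrupted with Ctrl-C):
#   tail -f /var/log/messages
```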
A cousin to "tail", the "head" command, shows the first few lines of a file; the default is 10 lines, but this can be changed. When given multiple files, it precedes them with a header, which can actually be handy when preparing a summary list of a bunch of files. It's interesting to note that early versions of System V Unix left out this command, because they figured that the command "sed 10q" pretty much did the same thing, at least in the single file case.
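A quick sketch of those multi-file headers (the file names are placeholders of my own):

```shell
# Create two small sample files.
printf 'apple\navocado\n' > a.txt
printf 'banana\nblueberry\n' > b.txt

# With more than one file, head prints a "==> name <==" header before each.
head -1 a.txt b.txt

# Clean up.
rm -f a.txt b.txt
```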
Wow, is "grep" useful. I wish there were a handheld "grep" device that could scan through a stack of papers and look for words, like this command does for files. Its job is to scan data looking for patterns, and it does a great job of it. In fact, it's often used as a verb, as in, "I'm going to grep through my old emails for that address."
"grep" has two cousins: "egrep", which has an expanded set of search pattern options, and "fgrep", which does not honor any search pattern metacharacters (the "f" stands for "fixed", not "fast", contrary to popular belief).
Here are a few examples:
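These are my own illustrations of the most common flags (the sample file and patterns are made up for the demonstration):

```shell
# Make a small sample file to search.
printf '# comment\nalpha one\nALPHA two\nbeta three\n' > sample.txt

# Find lines containing a literal string.
grep "alpha" sample.txt

# -i ignores case; -c counts matching lines instead of printing them.
grep -i "alpha" sample.txt
grep -c "beta" sample.txt

# -v inverts the match: here, everything except comment lines.
grep -v "^#" sample.txt

# Clean up.
rm -f sample.txt
```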
Some commands, like "ls", have built-in sorting options. For other cases, the "sort" command is necessary. I have to be honest, it isn't the easiest command to use. Trying to sort on specific columns can be a real chore, and I often have to use trial-and-error to get it right. Also, it may not be obvious, but the default ordering is by ASCII code, and sometimes sorting fields of numbers can be confusing until you remember the "-n" flag. The ordering can be reversed with the "-r" flag, and multiple identical lines can be pruned to just one instance with the "-u" (for "unique") flag.
As but one example, if I am examining a group of directories on a disk, and trying to figure out how much space they're using, I use this command:
du -sk * | sort -r -n
It gives me a list sorted from largest to smallest, in numeric order. The default sort field is column 1, which is where "du" puts the file size, so I don't have to specify it.
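When the field you want isn't the first one, the "-t" and "-k" flags select it; as a sketch, sorting /etc/passwd numerically by its third (user ID) field:

```shell
# -t sets the field delimiter to ":", -k3 sorts on the third field,
# and -n compares the fields as numbers rather than as ASCII strings.
sort -t: -k3 -n /etc/passwd
```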
The "wc", or "word count", command seems primitive but is actually quite useful. I almost always find myself using it with the "-l" option, which makes it count lines instead of words. For example, in our nightly "cleanup" scripts, we only reboot systems if no users are logged in to them, using a Bourne Shell construct like this (simplified from the original):
if [ `who | wc -l` = "0" ]; then
    shutdown -y -g30 -i6 "Nightly Reboot In Progress"
fi
I can't emphasize enough the usefulness of "diff". If you need to find any and all differences between two files, this is the right tool for the job. It will show lines that were altered, lines that were added, and lines that were deleted, between two versions of a file.
The notation takes a little getting used to, but it's enough to remember that lines starting with "<" point to the first file listed on the command line, and lines starting with ">" point to the second.
The combination of flags "-wb" is useful for ignoring any differences caused by "white space" (e.g., combinations of space and tab characters), and the flag "-c" produces a "context diff", where the three lines before and after each difference are shown (this can make slogging through a complex set of changes much easier).
For example, want to see if a file is sorted or not? Try this:
sort file.txt | diff file.txt -
If it outputs anything, that means it isn't sorted. If it is sorted, then you won't see anything. The dash ("-") argument means "read from standard input instead of from a file."
A cousin to "diff", the "cmp" command, does a byte-by-byte file comparison. It's quicker than diff in cases where you only want to find what files are different, and not where they're different, and also the best way to determine if binary files are different ("diff" hates binary data).
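A quick sketch of the exit-status idiom that makes "cmp" handy in scripts (the file names are placeholders):

```shell
# Create two identical files for the demonstration.
printf 'same\n' > f1
printf 'same\n' > f2

# -s suppresses all output; the exit status alone says whether the files match.
if cmp -s f1 f2; then
    echo "files are identical"
else
    echo "files differ"
fi

# Clean up.
rm -f f1 f2
```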
If there's a diamond in the rough, it's probably the "find" command. It will walk down a directory tree, looking for anything that matches the specified criteria. I use this command almost daily. As the saying goes, though, "with great power comes great responsibility": this command can be extremely useful, because it can repeat a command on many files in a directory tree, but it can also be extremely destructive, for the same reason. If you're doing something complex, I strongly advise running it on a small test directory before inflicting it on a large number of files and directories.
This is the only command for which I will explicitly advocate the use of the GNU version in some cases, because they've added some really useful (and obvious) extensions.
Also, this command is best described through example:
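A few typical invocations, with paths and patterns of my own choosing (note the warning in the comments about "-exec" with destructive commands):

```shell
# List every regular file under the current directory modified in the last day.
find . -type f -mtime -1 -print

# Find files by name pattern (quote it so the shell doesn't expand it first).
find . -name '*.txt' -print

# Run a command on each match; the {} is replaced by the file name.
# Be careful: swap "ls -l" for "rm" and this becomes very destructive.
find . -name '*.tmp' -exec ls -l {} \;
```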
sed is known as the "stream editor". It's a non-interactive text editor that can do a surprising number of the things you would normally expect to need an interactive editor for. It's important to remember that it only works on I/O streams, and does not modify files by itself. Again, it's probably best to show it through example:
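Here is a sketch of the classic uses, built around the "s" (substitute) and "d" (delete) commands; the sample text is my own:

```shell
# Substitute the first occurrence of "old" with "new" on every line.
printf 'old hat\nbrand old old\n' | sed 's/old/new/'

# Add "g" to replace every occurrence, not just the first on each line.
printf 'old old\n' | sed 's/old/new/g'

# Delete lines matching a pattern (here, comment lines).
printf '# note\nkeep me\n' | sed '/^#/d'
```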
awk is a pattern scanning language. It, along with sed, was part of the inspiration for the Perl language.
I tend to use it mostly for its ability to split lines into fields (for more complicated things, I usually switch to Perl). By default, it will split on arbitrary whitespace, but that can be changed by specifying a new delimiter with the -F flag. For example, to split out the subnet part of a UB IP address:
echo "128.205.25.5" | awk -F. '{ print $3 }'
If fed multiple lines, like a file, it will loop through all of them, applying the same command. It also uses sed-style patterns, so we can do things like:
awk -F. '/^128\.205\./ { print $3 }' < iplist
Which would print only the subnets for UB addresses, when given an arbitrary list of IP addresses (the backslashes before the periods are necessary because a period is normally a wildcard matching any single character; escaping it makes it match a literal period).
cut is a neat little command, used to split lines into pieces. I often use it instead of awk because it's small and a bit easier to use, with its own set of options. It can split on either single-character delimiters or on fixed column positions, making it ideal for processing fixed-format data.
Here are some simple examples:
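These illustrations are my own; they show both the delimiter style and the column style of splitting:

```shell
# Split on a delimiter: -d sets it, -f picks the fields.
# Here, field 1 of /etc/passwd is the login name.
cut -d: -f1 /etc/passwd

# Fixed column positions: characters 1 through 4 of each line.
printf 'abcdefg\n' | cut -c1-4
```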
This is another of those commands that looks simple on its own, but is powerful when combined with others in a stream.
It was hard to limit this list to twelve commands; I originally had ten, but then kept thinking of others (actually, I did take some liberties by listing similar commands in some cases). Commands such as ln, file, chmod, chown, chgrp, and ps almost made the cut, but I decided to leave them for a future list of either useful advanced user commands or useful system administrator commands.
The final item I want to present is one of the core Unix philosophies, "There's more than one right way to do it". There is no reason why we couldn't use:
cat iplist | grep '^128\.205\.' | cut -d. -f3
Instead of:
awk -F. '/^128\.205\./ { print $3 }' < iplist
It all boils down to personal preference, efficiency, portability, and readability. I tend to favor portability, which is why I often use these commands instead of a language like Perl, because you are almost guaranteed to find these commands on any Unix system you encounter. Your mileage may vary.