Talk Tech to Me: 4 Steps For Tracing Undocumented Linux (or UNIX) Shell Scripts

You’re likely going to come across shell scripts that are not well documented and difficult to trace as a result. Here are four ways to overcome this.

Shell scripting is a core skill for any Linux (or UNIX) administrator, cybersecurity professional, developer or DevOps engineer. As a result, shell scripting comprises the majority of the content tested within Domain 5 of CompTIA Linux+ (XK0-004).

Regardless of your job role, you’re likely going to come across shell scripts that are not well documented and difficult to trace (read and understand) as a result.

While these scripts discourage others from tracing them, you can easily trace even the most difficult shell scripts if you follow good tracing procedures. Moreover, by tracing a shell script, you build your knowledge of specific Linux/UNIX components, as well as scripting in general.

How to Trace an Undocumented Shell Script

To demonstrate this, let’s trace an undocumented shell script named treed. When you execute this shell script, it prints a hierarchical listing of the contents of a directory that you must specify as an argument (much like the MS-DOS tree command, which many Linux/UNIX systems do not include by default).

If you execute it without supplying an argument, it returns usage information as shown below:

$ ./treed
usage: ./treed directory
$

To view a list of files and subdirectories under the classfiles directory, you could run the script with the directory name as an argument:

$ ./treed classfiles
classfiles
|
|__ Miscellaneous
|   |__ mystery
|   |__ letter
|__ Poems
|   |__ Blake
|   |   |__ jerusalem
|   |   |__ tiger
|   |__ Shakespeare
|   |   |__ sonnet5
|   |   |__ sonnet2
|   |   |__ sonnet3
|   |   |__ sonnet4
|   |   |__ sonnet1
|   |__ Yeats
|   |   |__ mooncat
|   |   |__ old
|   |   |__ whitebirds
|   |__ rhyme
|   |__ nursery
|   |__ twister
|__ proposal1
|__ proposal2
$

The classfiles directory shown above contains two files:

  • proposal1
  • proposal2

It also contains two subdirectories:

  • The Miscellaneous subdirectory contains two files: mystery and letter.
  • The Poems subdirectory contains three files (rhyme, nursery and twister) and three subdirectories (Blake, Shakespeare and Yeats) that contain additional files.

Now, let’s examine the contents of the shell script that produced these results:

:
[ $# -eq 0 ] && {
     echo "usage: $0 directory" >&2
     exit 1
}
base=$1
export base
echo "$base"
find $base -print | sed '
     s:^'$base':|:
     s:/\([^/]*\)$:?? \1:
     s:/[^ ?/]*:   |:g
     s:?:_:g
'

4 Shell Script Tracing Strategies

Were you able to trace it easily? If not, you’re definitely not alone! Let’s examine some shell script tracing strategies that we can apply to this shell script.

1. Start with What You Know

After studying for the CompTIA Linux+ certification, you’re well equipped with the following fundamental shell scripting concepts:

  • Standard input/output redirection
  • Positional parameters (shell script command line arguments)
  • if statements as well as their equivalent conditional ANDs and ORs.

Thus, you’ll probably be able to trace the following part of the shell script easily. It tests whether the number of positional parameters ($#) is equal to zero AND (&&), if so, runs the commands inside the braces: a usage line is printed to the screen using standard error (>&2), and exit 1 stops the shell script with a false (non-zero) exit status:

[ $# -eq 0 ] && {
     echo "usage: $0 directory" >&2
     exit 1
}
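For comparison, the same test can be written as an explicit if statement. The sketch below is only an illustration: check_args is a hypothetical function name (it is not part of treed), the script name is hard-coded in the message, and return is used instead of exit so that you can try it in an interactive shell without closing it.

```shell
# Hypothetical helper for illustration only; treed itself uses the
# compact [ ... ] && { ... } form.
check_args() {
    if [ "$#" -eq 0 ]; then
        # No arguments supplied: print the usage line to standard error.
        echo "usage: treed directory" >&2
        return 1    # treed calls "exit 1" at this point
    fi
    return 0
}

check_args classfiles && echo "argument check passed"
check_args 2>/dev/null || echo "argument check failed with status $?"
```

Both forms behave the same way; the && version simply saves a few lines, which is one reason it shows up so often in older scripts.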

You’ve also learned about variables, so you know that $0 stores the shell script name, $1 stores the first positional parameter and that the following code copies the first positional parameter to a new variable. The new variable is called base. It’s made available to other commands run by the shell (export base) and also printed to the screen via standard output using the echo command (the first line you see when executing the shell script):

base=$1
export base
echo "$base"
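The effect of export can be checked with a small experiment. The following sketch assumes that sh is available on your PATH; child_before and child_after are illustrative variable names that do not appear in treed. It starts a child shell before and after the export and asks it whether it can see base.

```shell
unset base                 # start from a clean state for the demo
base=classfiles            # ordinary shell variable, not yet exported

# A child process does not see an unexported variable.
child_before=$(sh -c 'echo "${base:-unset}"')

export base                # now place the variable in the environment
child_after=$(sh -c 'echo "${base:-unset}"')

echo "before export: $child_before"   # before export: unset
echo "after export: $child_after"     # after export: classfiles
```

In treed itself, the shell expands $base on the command line before find and sed start, so the pipeline would most likely behave the same without the export.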

So, by focusing on your existing shell script knowledge, you can trace the code that produces usage information if you don’t supply an argument to the shell script, as well as trace the code that prints the top of the directory listing in the shell script output.

2. Research Specific Command Usage

The next part of the shell script is the hardest to trace, as it leverages two powerful Linux/UNIX commands:

  • find
  • sed (the stream editor)

The find command is familiar to most Linux/UNIX users, so you’ll be able to identify that the following line generates a recursive list of files and subdirectories, starting from the directory supplied as an argument to the shell script (the $base variable). It then sends that recursive list to the sed command for processing via a pipe (|):

find $base -print | sed '
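To see what sed receives, you can run the find half of the pipeline by itself. The sketch below builds a small throwaway tree in a temporary directory; the demo and listing names are assumptions made for this illustration, and mktemp -d is assumed to be available. The output is sorted only to make it predictable, since find does not promise any particular order.

```shell
demo=$(mktemp -d)                      # scratch area, removed at the end
mkdir -p "$demo/classfiles/Poems/Blake"
touch "$demo/classfiles/proposal1" \
      "$demo/classfiles/Poems/Blake/tiger"

# One path per line, each starting with the directory given to find.
listing=$(cd "$demo" && find classfiles -print | LC_ALL=C sort)
echo "$listing"

rm -rf "$demo"
```

Every line begins with the starting directory, followed by slash-separated names. That is the raw material the sed statements go on to rewrite.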

Where it often gets difficult to trace is in the single-quoted, multi-line argument that follows the sed command:

find $base -print | sed '
     s:^'$base':|:
     s:/\([^/]*\)$:?? \1:
     s:/[^ ?/]*:   |:g
     s:?:_:g
'

This isn’t about knowing shell scripting per se, but about understanding the complex usage of a command within the shell script (there are hundreds of such tools). Of course, you’ll need to spend some time looking through the manual page (man sed) or some examples online.

While it may take you some time to learn how sed is used here, you’ll also be learning a powerful tool that you can use in other shell scripts for processing data using search-and-replace statements.

This is reflected in the old Linux/UNIX saying: “He said. She said. We all said, ‘Use sed.’”

After researching sed usage, you’ll learn that s stands for substitute (search and replace), and that the first character after s is the delimiter that separates the parts of each statement (: in this case).
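The choice of delimiter matters here because the patterns contain slashes. As a small illustration (the sample path /usr/local/bin and the variable names are made up for this sketch), the two commands below perform the same replacement; the colon-delimited form avoids escaping every slash.

```shell
path=/usr/local/bin

# Default slash delimiter: every literal slash needs a backslash.
with_slash=$(echo "$path" | sed 's/\/usr\/local/\/opt/')

# Colon delimiter, as used in treed: slashes are ordinary characters.
with_colon=$(echo "$path" | sed 's:/usr/local:/opt:')

echo "$with_slash"    # /opt/bin
echo "$with_colon"    # /opt/bin
```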

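The first sed command matches the value of $base at the start of each line and replaces it with a pipe (|) character. The single quotes on either side of $base close and reopen the quoted sed script, so the shell expands the variable before sed runs: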

            s:^'$base':|:
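The quoting can be checked by letting the shell build the statement and printing it before sed sees it. In this sketch, stmt and first are hypothetical variables introduced only for the demonstration, and base is set by hand rather than taken from $1.

```shell
base=classfiles

# The first quoted string ends after ^, the shell expands $base,
# and a second quoted string supplies the rest of the statement.
stmt='s:^'$base':|:'
echo "$stmt"                                   # s:^classfiles:|:

# Apply the statement to two sample lines of find output.
first=$(printf '%s\n' classfiles classfiles/Poems | sed "$stmt")
echo "$first"
```

Because $base is pasted into the pattern as-is, a directory name that contains spaces, a colon or regular expression characters may not behave as intended.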

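The second sed command matches a slash (/) followed by any number of characters that are not a slash ([^/]*), up to the end of the line ($). This is the final file or directory name in the path, and the \( and \) around it capture the name so that it can be reused as \1.

Next, it replaces the slash with ?? and a space in front of this name, as a placeholder that the third sed command will leave alone: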

            s:/\([^/]*\)$:?? \1:
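You can try this statement on a single line to confirm that only the last component changes. The input below is a hand-written sample of what a line looks like after the first sed command; line and marked are illustrative variable names.

```shell
line='|/Poems/Shakespeare/sonnet1'

# Only the last /name is rewritten; \1 reinserts the captured name.
marked=$(echo "$line" | sed 's:/\([^/]*\)$:?? \1:')
echo "$marked"    # |/Poems/Shakespeare?? sonnet1
```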

The third sed command matches each remaining path component: a / followed by any characters up to the next space, ? or / character. It replaces every such match on the line (g) with three spaces and a pipe (|). This turns each parent directory name on the line into one level of indentation, leaving only the final name:

            s:/[^ ?/]*:   |:g
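The same kind of single-line check works here. The input is a hand-written sample of a line as it would look after the second sed command, and the variable names are again only illustrative.

```shell
line='|/Poems/Yeats?? old'

# Each /name becomes "   |"; matching stops at the ? placeholder.
indented=$(echo "$line" | sed 's:/[^ ?/]*:   |:g')
echo "$indented"    # |   |   |?? old
```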

And the fourth sed command replaces all ? characters with underscore (_) characters, globally throughout each line:

            s:?:_:g
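Note that sed uses basic regular expressions by default, where ? is an ordinary character rather than a quantifier, so no escaping is needed. A quick check on a hand-written sample line (the variable names are illustrative):

```shell
line='|   |   |?? old'

# Every literal ? on the line becomes an underscore.
final=$(echo "$line" | sed 's:?:_:g')
echo "$final"    # |   |   |__ old
```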

3. Verify Your Trace Using Script Output/Functionality

After tracing enough of your script, make sure you correlate it to the actual functionality of the script, which can often be achieved by examining the output generated by the script, or the tasks that the script performs.

In our example, after printing the $base variable to the screen, the script uses sed to modify the output of a recursive find listing, replacing the parent directories at the beginning of each path with spaces, pipe symbols and underscore characters (via a ?? placeholder). You can easily verify your trace logic by examining the classfiles directory output generated by the script.
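One way to check a trace like this is to feed a single path through the four sed statements one at a time and compare the last result with the script output. The sketch below does that for one path from the classfiles example; the step1 to step4 variables are assumptions made for this illustration, and base is set by hand.

```shell
base=classfiles
path=classfiles/Poems/Blake/tiger

# Apply the four statements from treed one after another.
step1=$(echo "$path"  | sed 's:^'$base':|:')
step2=$(echo "$step1" | sed 's:/\([^/]*\)$:?? \1:')
step3=$(echo "$step2" | sed 's:/[^ ?/]*:   |:g')
step4=$(echo "$step3" | sed 's:?:_:g')

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
```

The last line printed should match the tiger entry in the treed output for classfiles.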

Furthermore, certain things you didn’t know could easily be deduced by examining the output of a script. For example, if you didn’t know what $0 represented, you could run the script without arguments to see the message generated and compare it to the script contents to see that it represents the script path.
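This kind of experiment is easy to reproduce with a stand-in script. The sketch below writes a two-line script called show0 (a hypothetical name, as are the variable names) to a temporary directory and runs it in two different ways, assuming sh and mktemp -d are available.

```shell
demo=$(mktemp -d)

# A two-line stand-in script that only prints its usage message.
printf '%s\n' ':' 'echo "usage: $0 directory"' > "$demo/show0"

by_full_path=$(sh "$demo/show0")
by_relative_path=$(cd "$demo" && sh ./show0)

echo "$by_full_path"
echo "$by_relative_path"    # usage: ./show0 directory

rm -rf "$demo"
```

The text printed in place of $0 changes with the way the script is invoked, which suggests that it holds the path used to run the script.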

4. Identify the Purpose of Remaining Script Syntax

Sometimes shell scripts contain syntax with associated functionality that isn’t immediately clear when tracing the shell script. In our example, the : at the beginning of the shell script serves no visible function. This : is functionally the same as the /bin/true command in that it generates a true exit status, and nothing more.
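You can confirm this behavior at the prompt. The short sketch below checks the exit status of : and then uses it as an always-true loop condition, which is another place you may come across it; status and count are illustrative variable names.

```shell
# The : command does nothing and reports success.
:
status=$?
echo "exit status of ':' is $status"    # exit status of ':' is 0

# Because it always succeeds, it can serve as an endless loop condition.
count=0
while :; do
    count=$((count + 1))
    [ "$count" -ge 3 ] && break
done
echo "loop ran $count times"            # loop ran 3 times
```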

Normally a shell script starts with a hashpling/shebang such as #!/bin/bash to tell the system to run all remaining lines using the Bash shell. However, starting a script with a : was common practice on old UNIX systems. It was a way to identify scripts that could safely execute in any shell on the system.

Another example of this is using standard output redirection following a code block (e.g., an if statement block). This syntax forces the standard output of all commands in the code block to a single file.
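As a minimal sketch of this syntax (the log and contents variable names are assumptions for the illustration, and mktemp is assumed to be available), the following writes the output of an if block and of a brace group to the same temporary file:

```shell
log=$(mktemp)                 # scratch file for the demonstration

# One redirection after fi captures the output of every command in the block.
if [ -w "$log" ]; then
    echo "first line"
    echo "second line"
fi > "$log"

# The same idea works for a brace group; >> appends instead of overwriting.
{
    echo "third line"
    echo "fourth line"
} >> "$log"

contents=$(cat "$log")
echo "$contents"
rm -f "$log"
```

A single redirection after the block replaces a separate redirection on each echo, which is why the syntax can look unfamiliar at first glance.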

Regardless, the purpose of any remaining script syntax can easily be found by searching the internet and then committed to memory for future use!

