I have begun the process of moving all my open-source projects to code.google.com.
Since I am also taking the opportunity to clean up and update the projects, this will
undoubtedly take a while.
The first project to move over is Hocuspocus, a simple webserver for remote-controlling Linux machines, which I run on most of my systems.
The website lives here:
http://muth.org/Robert/Hocuspocus/
The code can be found here:
http://code.google.com/p/hocuspocus/
Saturday, September 1, 2012
Friday, August 3, 2012
Better Bash Scripting in 15 Minutes
The tips and tricks below originally appeared as one of Google's "Testing on the Toilet" (TOTT) episodes.
This is a revised and augmented version.
Safer Scripting
I start every bash script with the following prolog:
#!/bin/bash
set -o nounset
set -o errexit

This will take care of two very common errors:
- Referencing undefined variables (which default to "") -- see the sketch below
- Ignoring failing commands
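For the first kind of error, a variable that may legitimately be unset can be read with a default value so that "nounset" does not abort the script. A minimal sketch (OPTIONAL_ARG and opt are made-up names used only for illustration):

# ${OPTIONAL_ARG:-} substitutes "" if OPTIONAL_ARG is unset; referencing
# it directly would abort the script under "set -o nounset"
readonly opt="${OPTIONAL_ARG:-}"
if [[ -z "${opt}" ]] ; then
    echo "OPTIONAL_ARG not set, falling back to defaults"
fi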
If a failing command is to be tolerated use this idiom:
if ! <possible failing command> ; then
echo "failure ignored"
fi
Note that some Linux commands have options which, as a side effect, suppress some failures, e.g.
“mkdir -p” and “rm -f”.
Also note that the “errexit” mode, while a valuable first line of defense, does not catch all failures, i.e. under certain circumstances failing commands will go undetected.
(For more info, have a look at this thread.)
A reader suggested the additional use of "set -o pipefail".
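One possible prolog combining all of these (a sketch; enable "pipefail" only if your pipelines are meant to fail when any stage fails):

#!/bin/bash
# abort on unset variables, on failing commands,
# and on a failure in any stage of a pipeline
set -o nounset
set -o errexit
set -o pipefail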
Functions
Bash lets you define functions which behave like other commands -- use them liberally; it will give your bash scripts a much needed boost in readability:

ExtractBashComments() {
    egrep "^#"
}

cat myscript.sh | ExtractBashComments | wc
comments=$(ExtractBashComments < myscript.sh)
Some more instructive examples:
SumLines() {  # iterating over stdin - similar to awk
    local sum=0
    local line=""
    while read line ; do
        sum=$((${sum} + ${line}))
    done
    echo ${sum}
}

SumLines < data_one_number_per_line.txt
log() {  # classic logger
    local prefix="[$(date +%Y/%m/%d\ %H:%M:%S)]: "
    echo "${prefix} $@" >&2
}
log "INFO" "a message"
Try moving all bash code into functions, leaving only global variable/constant definitions and a call to “main” at the top level.
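A minimal sketch of that layout (the function and constant names are made up for illustration):

#!/bin/bash
set -o nounset
set -o errexit

readonly MAX_RETRIES=3   # global constant

usage() {
    echo "usage: $0 <input-file>" >&2
    exit 1
}

main() {
    [[ $# -ge 1 ]] || usage
    local input="$1"
    echo "processing ${input} (up to ${MAX_RETRIES} retries)"
}

main "$@"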
Variable Annotations
Bash allows for a limited form of variable annotations. The most important ones are:
- local (for local variables inside a function)
- readonly (for read-only variables)
# a useful idiom: DEFAULT_VAL can be overwritten
# with an environment variable of the same name
readonly DEFAULT_VAL=${DEFAULT_VAL:-7}

myfunc() {
    # initialize a local variable with the global default
    local some_var=${DEFAULT_VAL}
    ...
}

Note that it is possible to make a variable read-only that wasn't before:

x=5
x=6
readonly x
x=7   # failure
Strive to annotate almost all variables in a bash script with either local or readonly.
Favor $() over backticks (`)
Backticks are hard to read and in some fonts easily confused with single quotes. $() also permits nesting without the quoting headaches.
# both commands below print out: A-B-C-D
echo "A-`echo B-\`echo C-\\\`echo D\\\`\``"
echo "A-$(echo B-$(echo C-$(echo D)))"
Favor [[]] (double brackets) over []
[[]] avoids problems like unexpected pathname expansion, offers some syntactical improvements, and adds new functionality:
Operator   Meaning
||         logical or (double brackets only)
&&         logical and (double brackets only)
<          string comparison (no escaping necessary within double brackets)
-lt        numerical comparison
=          string matching with globbing
==         string matching with globbing (double brackets only, see below)
=~         string matching with regular expressions (double brackets only, see below)
-n         string is non-empty
-z         string is empty
-eq        numerical equality
-ne        numerical inequality
single bracket:
[ "${name}" \> "a" -o "${name}" \< "m" ]

double brackets:
[[ "${name}" > "a" && "${name}" < "m" ]]
Regular Expressions/Globbing
These new capabilities within double brackets are best illustrated via examples:
t="abc123"Note, that starting with bash version 3.2 the regular or globbing expression
[[ "$t" == abc* ]] # true (globbing)
[[ "$t" == "abc*" ]] # false (literal matching)
[[ "$t" =~ [abc]+[123]+ ]] # true (regular expression)
[[ "$t" =~ "abc*" ]] # false (literal matching)
must not be quoted. If your expression contains whitespace you can store it in a variable:
r="a b+"
[[ "a bbb" =~ $r ]] # true
Globbing-based string matching is also available via the case statement:
case $t in
abc*) <action> ;;
esac
String Manipulation
Bash has a number of (underappreciated) ways to manipulate strings.

Basics
f="path1/path2/file.ext"
len="${#f}" # = 20 (string length)
# slicing: ${<var>:<start>} or ${<var>:<start>:<length>}
slice1="${f:6}" # = "path2/file.ext"
slice2="${f:6:5}" # = "path2"
slice3="${f: -8}"           # = "file.ext" (note: space before "-")
pos=6
len=5
slice4="${f:${pos}:${len}}" # = "path2"
Substitution (with globbing)
f="path1/path2/file.ext"
single_subst="${f/path?/x}" # = "x/path2/file.ext"
global_subst="${f//path?/x}" # = "x/x/file.ext"
# string splitting
readonly DIR_SEP="/"
array=(${f//${DIR_SEP}/ })
second_dir="${array[1]}" # = path2
Deletion at beginning/end (with globbing)
f="path1/path2/file.ext"
# deletion at string beginning
extension="${f#*.}"         # = "ext"
# greedy deletion at string beginning
filename="${f##*/}" # = "file.ext"
# deletion at string end
dirname="${f%/*}" # = "path1/path2"
# greedy deletion at end
root="${f%%/*}" # = "path1"
Avoiding Temporary Files
Some commands expect filenames as parameters, so straightforward pipelining does not work. This is where the <() operator comes in handy, as it takes a command and transforms it into something which can be used as a filename:
# download and diff two webpages
diff <(wget -O - url1) <(wget -O - url2)

Also useful are "here documents", which allow arbitrary multi-line strings to be passed in on stdin. The two occurrences of 'MARKER' bracket the document.
'MARKER' can be any text.
# MARKER is an arbitrary string
command << MARKER
...
${var}
$(cmd)
...
MARKER
If parameter substitution is undesirable, simply put quotes around the first occurrence of MARKER:
command << 'MARKER'
...
no substitution is happening here.
$ (dollar sign) is passed through verbatim.
...
MARKER
Built-In Variables
For reference
$0   name of the script
$n   positional parameters to script/function
$$   PID of the script
$!   PID of the last command executed (and run in the background)
$?   exit status of the last command (${PIPESTATUS} for pipelined commands)
$#   number of parameters to script/function
$@   all parameters to script/function (sees arguments as separate words)
$*   all parameters to script/function (sees arguments as single word)
Note
$* is rarely the right choice. $@ handles an empty parameter list and white-space within parameters correctly.
$@ should usually be quoted, like so: "$@"
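The difference is easiest to see when forwarding arguments to another command. A small sketch (grep_all is a made-up wrapper used only for illustration):

grep_all() {
    # "$@" passes each original argument through as a separate word,
    # so a pattern containing a space survives intact
    grep "$@"
}

grep_all "hello world" myscript.sh
# With "$*" the pattern and the filename would be joined into one word;
# with an unquoted $@ the pattern would be split at the space.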
Debugging
To perform a syntax check/dry run of your bash script run:

bash -n myscript.sh
To produce a trace of every command executed run:
bash -v myscript.sh
To produce a trace of the expanded command use:
bash -x myscript.sh
-v and -x can also be made permanent by adding
set -o verbose and set -o xtrace to the script prolog.
This might be useful if the script is run on a remote machine, e.g.
a build-bot and you are logging the output for remote inspection.
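A sketch of a prolog for such a script is below; PS4 and LINENO are standard bash variables, and the prefix format is only an illustration:

#!/bin/bash
set -o nounset
set -o errexit
# PS4 sets the prefix of each xtrace line; with ${LINENO} in it,
# every expanded command in the log is tagged with its line number
PS4='+ ${LINENO}: '
set -o xtrace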
Signs you should not be using a bash script
- your script is longer than a few hundred lines of code
- you need data structures beyond simple arrays
- you have a hard time working around quoting issues
- you do a lot of string manipulation
- you do not have much need for invoking other programs or pipe-lining them
- you worry about performance
Instead consider scripting languages like Python or Ruby.
References
- Advanced Bash-Scripting Guide: http://tldp.org/LDP/abs/html/
- Bash Reference Manual
Thanks to Peter Brinkmann and Kim Hazelwood for their feedback on drafts of this post.
Tuesday, July 31, 2012
XaoS Port to Native Client
Work is underway for an update to Robby Roto and the underlying MAME engine, which can likewise be found in the Chrome Web Store. Stay tuned!