Simpleton Digest

Trials and tribulations of a professional computer nerd


Building a better @macro

This is a follow-up to a previous post about creating a sort of macro in bash. If you’re interested, please take a look at that post for a better introduction to what I’m doing.
After I wrote my last post where I describe how to create a macro system in bash there was one piece that I wasn’t very happy with. That piece is the use of the DEBUG trap. It worked ok, but it’s wasteful since it runs before every simple command. Further, it is wide open to colliding with someone else’s use of it.

Happily, I have found a better way. It also gives me an excuse to go into more detail as to how exactly the @macro alias is constructed. Let me introduce to you the new and improved @macro:

alias @macro='MCMD="${BASH_COMMAND#*#\'\''}" eval '\''eval "$(eval "$MCMD")" #'\'
# or, if you just want to paste it in without using an alias
MCMD="${BASH_COMMAND#*#\'}" eval 'eval "$(eval "$MCMD")" #'

Ok, so obviously there is a decent amount going on here so let’s start from building blocks and work our way up. Before we get started let’s go over what our goal is.

What we want is a @macro command that executes the command that follows it, then captures the output of that command, and then executes that output in the context of the current function. This can be used, for example, to generate argument parsing code or (with some extra code) an @assert macro.

$ # Basic @macro usage
$ echo -e 'if [ "$VAR" == hello ]\nthen\n    echo world\nfi'
if [ "$VAR" == hello ]
then
    echo world
fi
$ VAR=no
$ @macro echo -e 'if [ "$VAR" == hello ]\nthen\n    echo world\nfi'
$ VAR=hello
$ @macro echo -e 'if [ "$VAR" == hello ]\nthen\n    echo world\nfi'
world

Ok, on to building @macro.

BASH_COMMAND Revisited

$ # This code prints itself
$ MCMD="${BASH_COMMAND}" eval 'echo "$MCMD"'
MCMD="${BASH_COMMAND}" eval 'echo "$MCMD"'

This is the core trick of the alias and replaces the use of the DEBUG trap. If you read the part in the previous post about the DEBUG trap, you’ll recognize the $BASH_COMMAND variable. In that post I said that it was set to the text of the command before each call to the DEBUG trap. Well, it turns out it’s set before the execution of every command, DEBUG trap or no (e.g. run 'echo "this command = $BASH_COMMAND"' to see what I’m talking about). By assigning it to a variable (just for that line) we capture BASH_COMMAND at the outermost scope of the command, which will contain the entire command.

What about the eval? The reason we need an eval with its argument in single quotes is to prevent the evaluation of $MCMD until after it’s been set. Normally, variable expansion happens before the command is actually executed:

$ VAR=bad
$ VAR=good echo "$VAR"
bad

However, if you stick your expression in a single quoted eval argument, then the shell expansion won’t happen until the eval is actually executed, which is after the variables are set.

$ VAR=bad
$ VAR=good eval 'echo "$VAR"'
good

Magic Comment

In the previous @macro snippet which printed itself when executed there was a slight bug that you may have noticed if you played around with it.

$ MCMD="${BASH_COMMAND}" eval 'echo "$MCMD"' 1 2
MCMD="${BASH_COMMAND}" eval 'echo "$MCMD"' 1 2 1 2
$ # Uh oh, arguments are printed multiple times

Looking at the code it’s easy to see why this is happening. First the arguments are being echo’d because they are part of $BASH_COMMAND, but then they are printed again because they are being passed as additional arguments to echo. To solve this we use the same trick developed in the original post and append a comment (‘#’) at the end of the eval’d command.

$ MCMD="${BASH_COMMAND}" eval 'echo "$MCMD" #' 1 2
MCMD="${BASH_COMMAND}" eval 'echo "$MCMD" #' 1 2
$ # Hurray!

One thing to mention about this is that a comment in an eval works differently than a normal comment in one important way: it only applies to the string being eval’d, not to the rest of the line.

$ echo hello # blah; echo world
hello
$ eval 'echo hello #' blah; echo world
hello
world

This means we can safely add a comment to our @macro alias while still allowing macros to be on the same line as other commands.

Stripping the cruft

The next problem we’re faced with is that, in fact, we don’t actually care about most of the BASH_COMMAND. All we really want is everything after @macro, which is going to be the macro function and its arguments. In the last post I used the length of the expanded macro definition to strip the front from BASH_COMMAND. This turned out to have a fatal flaw (the specifics of which I will leave as an exercise to the reader but which involves adding variable definitions to the front of a @macro call).

The new method I came up with has the dual advantages of being both safer and simpler.

$ MCMD="${BASH_COMMAND#*#\'}" eval 'echo "$MCMD" #' 1 2
 1 2

That’s it. This uses the ${VAR#PATTERN} syntax which, when evaluated, expands to the value of VAR with the shortest prefix that matches PATTERN removed. Let’s see some examples.

$ VAR="abcdcdef"
$ echo "${VAR#abc}"
dcdef
$ echo "${VAR#*e}"
f
$ echo "${VAR#*cd}"
cdef

By using a pattern of *#' (the ' must be escaped as \' in the actual pattern) we strip off everything up to the comment symbol followed by the single quote, which marks the end of the @macro alias. We also could have used ${VAR/*#\'} or ${VAR##*#\'} instead, both of which will match the longest prefix instead of the shortest, but that would open us up to bugs in situations like "@macro echo '#'" where the #' pattern occurs in the arguments.
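
To make the shortest-versus-longest difference concrete, here is a small sketch. The cmd string is a made-up stand-in for what $BASH_COMMAND might hold (it is not captured from a real run), with a #' hiding in the arguments.

```bash
# Hypothetical stand-in for $BASH_COMMAND; the argument '#' also contains #'
cmd="eval 'echo \"\$MCMD\" #' echo '#' done"

shortest="${cmd#*#\'}"     # strip through the FIRST #'
longest="${cmd##*#\'}"     # strip through the LAST #'

echo "shortest:$shortest"  # shortest: echo '#' done
echo "longest:$longest"    # longest: done
```

With the shortest match the whole macro call survives; with the longest match the echo and its argument are eaten along with the alias.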

The problem with word splitting (or how I learned to stop worrying and love eval)

We’re very close now. The only remaining piece is how to get from having a string that holds our macro command ($MCMD) to evaluating its output. First off, because $MCMD is a string and not an array we have to deal with bash’s annoying word splitting rules.

If you’ve worked with bash at all you know that when you type a command into bash it will split your command into a sequence of “words” based on things like whitespace.

$ echo a     b
a b

You may also know that if you have a variable reference in a command, and it’s not in double quotes, it will also be word split.

$ A='a    b'
$ echo $A
a b
$ echo "$A"
a    b

Ok, so far so good. Here is the fun part though: the rules used to word split in these two situations are completely different. Specifically word splitting of variables is much more limited than when evaluating arguments of a command. This means that all kinds of fanciness that is allowed in command arguments completely breaks if you store the arguments in a variable first.

$ echo 'a    b' c
a    b c
$ # echo treats it as 2 arguments
$ A="'a    b' c"
$ echo $A
'a b' c
$ # echo treats it as 3 separate arguments, blindly separating by whitespace

Not only that, but if $A contained any variable references they wouldn’t be expanded into their values. When taken as a whole, the different splitting rules actually make a certain amount of sense, but in this context they can be quite frustrating.
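
Here is a quick sketch of both effects side by side. The count_args function is a hypothetical helper I’m adding just for illustration; it prints how many arguments it received.

```bash
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo "$#"; }

count_args 'a    b' c      # typed directly, the quotes group words: prints 2

A="'a    b' c"
count_args $A              # from a variable, split on whitespace only: prints 3

NAME=world
B='hello $NAME'
echo $B                    # prints: hello $NAME  (no second round of expansion)
```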

As it turns out, the only way to evaluate a string as if it was a normal command is to use eval. For our macros, this applies to the contents of $MCMD, but it also applies to the code echo’d back by the macro command. This means we need to add two more eval statements, one for each.

$ MCMD="${BASH_COMMAND#*#\'}" eval 'eval "$(eval "$MCMD")" #' echo 'echo hello'
hello

First we call eval "$MCMD" which evaluates the macro command, then we use $() to take its output and pass that output as the argument to the second eval.
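
If the nesting is hard to follow, the two steps can be pulled apart. This is only a sketch: gen is a hypothetical macro function, and MCMD is set by hand here instead of being captured from $BASH_COMMAND.

```bash
# gen is a hypothetical macro function: it prints code rather than running it.
gen() { echo 'echo "VAR is $VAR"'; }

MCMD=' gen'                 # roughly what is left after stripping the alias text
VAR=hello

code="$(eval "$MCMD")"      # inner eval: run the macro command, capture its output
echo "generated: $code"     # generated: echo "VAR is $VAR"
eval "$code"                # outer eval: run the generated code here, prints: VAR is hello
```

Because the outer eval runs in the current context, the generated code sees VAR exactly as the surrounding code does.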

Wrapping up

All that’s left is wrapping it up in an alias and trying it out. In this example we create a fairly useless ‘add’ macro command which takes two numbers and then prints ‘$[ NUM1 + NUM2 ]’ ($[ ] is the old, deprecated spelling of $(( )) arithmetic expansion; both work here).

$ alias @macro='MCMD="${BASH_COMMAND#*#\'\''}" eval '\''eval "$(eval "$MCMD")" #'\'
$ add () { echo "echo \$[ $1 + $2 ]"; }
$ @macro add 1 2
3

As before, all of this can be found in my lib.sh project in a slightly more robust/featureful form. The (possibly historical) version of macro.sh that corresponds to what is talked about in this post can be found here.


Implementing macros in bash

Note: I’ve written a follow up to this post here that talks about a better implementation and goes into some more detail. It’s still worth it to read this post to get an introduction to what I’m doing.

This work is part of a little project I’ve started called lib.sh. My hope with this project is to develop a set of libraries for bash so that one can develop medium size programs in bash without abandoning good development practices. It grew out of a desire to be able to implement new syntax in bash without resorting to source code translation. What I’ve ended up doing is creating a small macro system for bash. The approach I develop here seems to work very well though it still does have its limitations.

One of the earliest features I wanted for lib.sh was to have a simple, lightweight argument parsing system.  Typically in bash when you have a function or a script the “right” thing to do is something like:

myfunc()
{
  if [ $# -ne 2 ]
  then
    echo 1>&2 "Usage: $FUNCNAME arg1 arg2"
    return 1
  fi
  local arg1="$1" arg2="$2"
}

This will accomplish a few things:

  1. The number of arguments is checked, and an informative “usage” message is printed.
  2. The arguments are assigned to named variables instead of $1, $2 …
  3. The arguments are held in local variables so that they won’t overwrite other variables, or be overwritten by subsequent function calls.

Still, it’s a lot of code, and it gets very tedious to do this for every function you write (You are using functions, right?). Wouldn’t life be wonderful if you could write something like this:

myfunc()
{
   @args arg1 arg2
}

The big hurdle to accomplishing this is that to get access to per-function state like $# and $FUNCNAME, and to create local variables, we need to execute code inside the myfunc function, and a helper function called from myfunc has pretty much no way of doing this.

I’m going to spare you the details of the many false starts I had and go straight into the solution I found.  It relies on a trick that combines three separate features of bash: the DEBUG trap, aliases, and eval.

Eval

Our first stop is, of course, “eval”. As with most versions of eval, bash’s takes a string and executes it as a bash command.  This is how we’re going to execute in the calling function’s context.  Allowing for a slightly uglier syntax we can do something like this:

myfunc()
{
  eval "$(args arg1 arg2)"
}

We would then have the “args” function generate the desired argument parsing code and echo it to standard out, which would then be executed by eval.  Easy!

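args ()
{
    # args NAME...
    # Prints code that copies the caller's positional parameters into
    # local variables called NAME... and checks the argument count.
    local i=1 argv=( "$@" )

    echo -n "local argv=( \"\$@\" )"
    while [ "$#" -ne 0 ]
    do
        # Braces keep parameters past $9 working (${10}, not $1 followed by 0)
        echo -n " $1=\"\${$i}\""
        let ++i
        shift
    done
    echo ';'
    cat <<END
if [ \$# -ne $(( i - 1 )) ]
then
    echo 1>&2 "Usage: \$FUNCNAME ${argv[*]}"
    return 1
fi
END
}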
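
To see the eval-in-the-caller idea in isolation, here is a stripped-down sketch. mini_args is a hypothetical cousin of args that I’m using only for illustration: it generates the local assignments and skips the usage check.

```bash
# mini_args is a hypothetical, stripped-down cousin of args:
# it only generates the local assignments, with no usage check.
mini_args() {
    local i=1 name
    for name in "$@"; do
        echo "local $name=\"\${$i}\""
        i=$(( i + 1 ))
    done
}

myfunc() {
    # The generated code runs inside myfunc, so "local" and ${1} refer to myfunc.
    eval "$(mini_args first second)"
    echo "first=$first second=$second"
}

mini_args first second          # prints the two generated "local ..." lines
myfunc hello 'big world'        # prints: first=hello second=big world
echo "after the call: first=${first-unset}"   # prints: after the call: first=unset
```

The last line is the point of the exercise: the variables were created as locals of myfunc, so nothing leaks out once it returns.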

If you’re the type of uncaring programmer who could put up with that sort of syntactic monstrosity (kidding!), you can stop now. However if you are, like me, uncompromising when it comes to such things, we must push forward to find ways of improvement.

The @macro Alias

The first, and most obvious, step for improving bash syntax is using aliases.  A bash alias is a very limited sort of pre-processor macro.  For example:

$ alias hw='echo "Hello World!"'
$ hw
Hello World!
$

Its two big limitations are:

  • It can only do basic search and replace (no arguments like the C pre-processor)
  • It is only applied on the command name, not any of the arguments (this is not precisely true but close enough for what we’re doing).
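
A small sketch of both limitations, assuming it is run as a script (the shopt line is what makes bash expand aliases outside of an interactive prompt):

```bash
shopt -s expand_aliases        # scripts need this; interactive shells have it on

alias hw='echo "Hello World!"'

# The alias text replaces the first word; the rest of the line is left in place.
hw and anything after it       # prints: Hello World! and anything after it

# In argument position the word is not touched.
echo hw                        # prints: hw
```

The first case is exactly what @macro leans on: whatever follows the alias on the line ends up as trailing words of the expanded command.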

Using aliases we could imagine doing something like:

alias @macro='eval "$($MACRO_COMMAND)"'

The only remaining problem is how do we take

@macro args arg1 arg2

and turn it into

MACRO_COMMAND="args arg1 arg2" @macro

The DEBUG Trap

The answer is the most obscure part of bash we’re using for this trick: the DEBUG trap. The DEBUG trap is a special quasi-signal provided by bash that allows debugger-like functionality.  For those that don’t know, “trap” is a command in bash that allows you to have a piece of bash code run whenever a particular signal is raised (like Kill or Interrupt).  In addition to the standard signals, it can also take a set of quasi-signals which are raised when various things happen inside the bash program.

The DEBUG trap is executed before every single command (all of em).  To get a sense of how this works you can pull up your bash prompt and type in

trap 'echo "Running $BASH_COMMAND"' DEBUG

Then start typing random stuff into your shell prompt.  Before each command you run, bash will run the code string you passed to trap above, with BASH_COMMAND set to the (mostly) exact text of your command.  You may already begin to see how this might be used to accomplish our goal.

There are a couple of little things to note about the DEBUG trap:

  1. All aliases are expanded before the trap is triggered, and
  2. All comments are removed from the end of the line.
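
Both points can be checked with a short sketch. The seen array and the nop alias are names I made up for this example; the trap just records what it is handed.

```bash
shopt -s expand_aliases
alias nop='true quietly'

seen=()                                  # hypothetical log of what the trap saw
trap 'seen+=("$BASH_COMMAND")' DEBUG

nop please    # this comment never reaches the trap

trap - DEBUG                             # remove the trap again
printf '%s\n' "${seen[@]}"
```

The first line printed is "true quietly please": the alias has already been expanded and the comment is gone. The trap also fires for the "trap - DEBUG" command itself, since that is still a command run while the trap is set.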

The basic idea is that we’re going to inspect each command before it’s run, and if it’s a @macro command, we’re going to pull the arguments out of BASH_COMMAND and stick them in MACRO_COMMAND.  To make this a bit easier we’re going to put this into a function and then call the function from the trap:

MACRO_ALIAS='eval "$($MACRO_COMMAND)"'

MACRO_ALIAS_LENGTH="${#MACRO_ALIAS}"
alias @macro="$MACRO_ALIAS"

filter_macro_command()
{
  local command="$1"
  # Check if the start of command is the same as $MACRO_ALIAS
  if [ "${command:0:$MACRO_ALIAS_LENGTH}" == "$MACRO_ALIAS" ]
  then
    # If so, assign MACRO_COMMAND to everything after it.
    MACRO_COMMAND="${command:$MACRO_ALIAS_LENGTH}"
  fi
}

trap 'filter_macro_command "$BASH_COMMAND"' DEBUG

There are a couple of things to note here.  One is that instead of checking for lines that begin with “@macro” we have to check for what @macro expands to.  This is because aliases are expanded before DEBUG is triggered.  By storing the expanded value in a variable and then using bash’s substring expansion (${VAR:START:LENGTH}) it’s not too hard to do this.  The code will work like this:

@macro args arg1 arg2

# Expanding alias
eval "$($MACRO_COMMAND)" args arg1 arg2

# Call DEBUG trap, match command and set MACRO_COMMAND=" args arg1 arg2"
eval "$( args arg1 arg2 )" args arg1 arg2

So, this is pretty close.  It mostly looks like what we had in the Eval section except we still have those annoying parameters sticking off the end of the command.  Luckily, eval comes to the rescue. eval joins all of its arguments into one string before running it, so by ending its first argument with a ‘#’ we start a comment that swallows the leftover parameters, and bash ignores the rest of the command.

MACRO_ALIAS='eval "$($MACRO_COMMAND) #"'
@macro args arg1 arg2

# Expanding alias
eval "$($MACRO_COMMAND) #" args arg1 arg2

# Call DEBUG trap, match command and set MACRO_COMMAND=" args arg1 arg2"
eval "$( args arg1 arg2 ) #" args arg1 arg2
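
One detail worth checking is that the trailing ‘#’ only affects the last line of the generated code. A minimal sketch, with a made-up two-line string standing in for a macro function’s output:

```bash
# Hypothetical two-line output from a macro function.
generated=$'echo first line\necho second line'

# eval joins its arguments with spaces, so the string it actually runs is:
#   echo first line
#   echo second line # leftover arguments
eval "$generated #" leftover arguments
```

Both echo commands run, and only the leftover words after the ‘#’ are thrown away.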

Finally

Last but not least, we add a special alias for @args to give us that added layer of convenience.

alias @args='@macro args'
myfunc()
{
  @args arg1 arg2
}

In fact, we can go further and use @macro in a generic way, giving us a way to turn any function into a macro which will have its output evaluated inside the context of the calling function. To support this we can just create a simple function called macroify which, when given the name of a function, generates an appropriately named “macro alias” (which is just @ + name of the function).

macroify()
{
    @macro args funcname
    alias "@$funcname=@macro $funcname"
}

macroify args
# We now have @args

All of this is done, in a much more fleshed out manner, in macro.sh in the lib.sh project (and args.sh for the @args implementation).