Custom Bash Argument Parser

In this post we will implement a custom argument parser for Bash. You can see the final product here. Some features that I wanted to have are:

  • Parsing command-line arguments in UNIX (dash), GNU (double-dash), and BSD (no dash) styles
  • Accepting zero or more values for an argument
  • Required arguments
  • Required values for an argument

First, let's go over some Bash syntax so we can understand how this script works. Bash gives us a few tools for inspecting the arguments passed to a script. $# holds the number of arguments passed; we will use it to keep looping until every argument has been parsed. The positional parameters $1, $2, $3, etc. hold the first, second, third, etc. argument respectively. The last tool we will use is the shift builtin, which moves every argument down one position: $3 becomes $2, $2 becomes $1, and the old $1 is discarded.
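
As a quick illustration of these three tools, here is a minimal sketch. The function name demo_positional and its variable names are made up for this example; calling a function with arguments stands in for running a script with them.

```bash
#!/usr/bin/env bash
# demo_positional is a hypothetical helper, not part of the parser itself.
demo_positional() {
    count_before=$#      # number of arguments passed
    first="$1"           # the first argument
    shift                # discard $1; $2 becomes $1, $3 becomes $2, ...
    count_after=$#       # one fewer than before
    new_first="$1"       # what used to be $2
}

demo_positional -a one two
echo "before=$count_before first=$first after=$count_after new_first=$new_first"
# prints: before=3 first=-a after=2 new_first=one
```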

The first thing we need to do is iterate over each argument passed. We want the first argument along with the one after it, so we can tell whether that next one is a value or another argument. We do this with $1 and shift:

# while number of args is more than 0
while [ $# -gt 0 ]; do
    # get the first arg
    opt="$1"
    # shift out the first arg
    shift;
    # get the new current arg
    current_arg="$1"; 

Now we have to see whether the first argument is in BSD, UNIX, or GNU style. To make this easy, I want to create an array that stores each parsed argument. If the argument is in UNIX or BSD style and bundles more than one argument (for example -abc or abc), we want to split it up and process each one separately. A bundled UNIX style argument has one leading dash and is more than two characters long. We test for this with:

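# one leading dash (not two) and longer than two characters
# -gt is the numeric test; inside [ ], > would be a redirection
[[ "$opt" =~ ^-[^-] ]] && [ ${#opt} -gt 2 ]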

We also want to see whether the argument is in BSD style so we can convert it to UNIX style for easier processing later. We do this with:

[[ ! "$opt" =~ ^-{1,2}.* ]]     
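
To make the three styles concrete, here is a small sketch that labels a word by its leading dashes. The function classify_style is hypothetical and exists only for illustration. Note that a word with no dash is labelled BSD here, so a plain value would look exactly the same as a BSD style argument.

```bash
#!/usr/bin/env bash
# classify_style is a hypothetical helper: it only inspects leading dashes.
classify_style() {
    local opt="$1"
    if [[ "$opt" =~ ^-- ]]; then
        echo "GNU"       # two leading dashes
    elif [[ "$opt" =~ ^- ]]; then
        echo "UNIX"      # exactly one leading dash
    else
        echo "BSD"       # no leading dash
    fi
}

for word in --apple -abc abc; do
    printf '%s -> %s\n' "$word" "$(classify_style "$word")"
done
```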

Once we know the argument is either a bundled UNIX style argument or a BSD style argument, we need to loop through each character and split it into proper UNIX style format, with one argument per index in the array. First, though, we need to check whether the 'parent' argument is in UNIX style, because if it is we must skip the leading dash. This normalizes our logic and keeps it DRY. We will loop over each character of the parent argument with an increment variable, so this check decides whether that variable starts at 0 or at 1. Putting it all together so far for this section we have:

# if UNIX style args or if BSD style args - parse out args and put each in UNIX format 
arg_array=()
if [[ "$opt" =~ ^-[^-] ]] && [ ${#opt} -gt 2 ] || [[ ! "$opt" =~ ^-{1,2}.* ]]; then
    i=0;

    # if a dash does exist then we need to skip it
    if [[ "$opt" =~ ^-{1}.* ]]; then
        i=1;
    fi  

Now we need to actually iterate through each character in the parent argument to convert it into UNIX format. To get the number of characters in a string we use ${#opt}. We loop through the characters using the increment variable, and ${opt:$i:1} extracts the single character at index i. We have to remember to prepend a dash to each one. Finally, if the parent argument is GNU style, or UNIX style with only one 'child' argument, we just add it to the array without any splitting. The entire code for this is:

# if UNIX style args or if BSD style args - parse out args and put each in UNIX format
arg_array=()

if [[ "$opt" =~ ^-[^-] ]] && [ ${#opt} -gt 2 ] || [[ ! "$opt" =~ ^-{1,2}.* ]]; then
    i=0;

    # if a dash does exist then we need to skip it
    if [[ "$opt" =~ ^-{1}.* ]]; then
        i=1;
    fi

    # for each char in opt, add it to the array with a prepended '-'
    while (( $i < ${#opt} )); do 

        arg_array[$i]="-${opt:$i:1}"; 
        (( i++ ));

    done
else
# else only one arg so add to one index array
    arg_array=("$opt");
fi  
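
To try the splitting step on its own, the logic can be wrapped in a function. This is a sketch under a few assumptions: the name split_opt is hypothetical, the single-dash test uses ^-[^-] so that GNU style arguments are never split, and elements are appended with += so the array has no gap at index 0.

```bash
#!/usr/bin/env bash
# split_opt is a hypothetical wrapper; it fills the global arg_array.
split_opt() {
    local opt="$1" i=0
    arg_array=()
    # bundled UNIX style (-abc) or BSD style (abc): split into -a -b -c
    if [[ "$opt" =~ ^-[^-] ]] && [ ${#opt} -gt 2 ] || [[ ! "$opt" =~ ^- ]]; then
        # skip the leading dash of a UNIX style argument
        if [[ "$opt" =~ ^- ]]; then
            i=1
        fi
        while (( i < ${#opt} )); do
            arg_array+=("-${opt:$i:1}")
            i=$(( i + 1 ))
        done
    else
        # GNU style, or a single UNIX style argument: keep as is
        arg_array=("$opt")
    fi
}

split_opt -abc;    printf '%s\n' "${arg_array[*]}"   # -a -b -c
split_opt xvf;     printf '%s\n' "${arg_array[*]}"   # -x -v -f
split_opt --apple; printf '%s\n' "${arg_array[*]}"   # --apple
split_opt -a;      printf '%s\n' "${arg_array[*]}"   # -a
```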

We now have everything parsed correctly for processing, so we need to go through each element of our newly created array, save its value if one is present, and/or set internal flags so our program knows the user passed a particular argument. To iterate through each element in an array we use "${arg_array[@]}". Inside the loop, we use a case statement to check for valid arguments and decide what to do with them. The first example is a standard argument with one optional value. To handle it, we check whether the next argument passed starts with a dash. This of course will not allow you to pass a BSD style argument right after a UNIX/GNU style one, but you usually can't mix these styles in a single command anyway. We already used a regex to test whether a string starts with a dash, so if the next argument does not, we know it is a value and we save it to the APPLE variable. We also use "-a"|"--apple" in the case pattern to accept both UNIX and GNU style arguments. Remember that we converted all BSD style arguments to UNIX style, which makes things easier here.

# for each arg in arg array see if valid arg
for i in "${arg_array[@]}"; do

    # check each arg for validity
    case "$i" in
        "-a"|"--apple" )

            #  if the new current arg has no dash then is a value so save it
            if [[ ! "$current_arg" =~ ^-{1,2}.* ]]; then
                APPLE="$current_arg";
            fi;;
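
Here is a self-contained sketch of the optional-value pattern that can be run on its own. The function parse_apple is hypothetical, and it adds one check that is an assumption on my part: it only looks ahead when another argument actually exists.

```bash
#!/usr/bin/env bash
# parse_apple is a hypothetical wrapper around the "-a" case only.
parse_apple() {
    APPLE=""
    while [ $# -gt 0 ]; do
        local opt="$1"
        shift
        local current_arg="${1-}"
        case "$opt" in
            "-a"|"--apple" )
                # a next word with no leading dash is the optional value
                if [ $# -gt 0 ] && [[ ! "$current_arg" =~ ^- ]]; then
                    APPLE="$current_arg"
                    shift    # the value is consumed
                fi;;
        esac
    done
}

parse_apple -a red;    echo "a is $APPLE"   # a is red
parse_apple -a --kiwi; echo "a is $APPLE"   # a is (empty: --kiwi is not a value)
```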

The next argument type we will parse is an argument with a required value. We simply check whether the next argument starts with a dash. If it does, the value is missing, so we print an error and exit the program. If it does not, we save that value and move on.

"-b"|"--banana" ) 

    #  if the new current arg has a dash then we are missing required arg value so exit
    if [[ "$current_arg" =~ ^-{1,2}.* ]]; then
        echo "WARNING: $opt requires a value passed with it.";
        exit 1;

    else
    # else the required arg value is there so get it
        BANANA="$current_arg";
    fi;;
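
The required-value check can be exercised in isolation with a sketch like the one below. The function parse_banana is hypothetical. It returns 1 instead of calling exit so it can be tested without ending the shell, and it also treats "no next argument at all" as a missing value, which is an assumption beyond the snippet above.

```bash
#!/usr/bin/env bash
# parse_banana is a hypothetical wrapper around the "-b" case only.
parse_banana() {
    BANANA=""
    while [ $# -gt 0 ]; do
        local opt="$1"
        shift
        case "$opt" in
            "-b"|"--banana" )
                # no next word, or the next word is another argument
                if [ $# -eq 0 ] || [[ "$1" =~ ^- ]]; then
                    echo "ERROR: $opt requires a value." >&2
                    return 1
                fi
                BANANA="$1"
                shift;;
        esac
    done
}

parse_banana -b yellow && echo "b is $BANANA"      # b is yellow
parse_banana -b -c || echo "missing value detected"
```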

The next case handles a required argument. The code is the same as for an optional argument, but after we have finished processing all arguments we check whether the CHERRY variable has a value. If the user passes -c without a value, we still need a way to know the cherry argument was given, so I set a default value of 'true'.

"-c"|"--cherry" )
    CHERRY="true";
    #  if the new current arg has no dash then is a value so save it
    if [[ ! "$current_arg" =~ ^-{1,2}.* ]]; then
        CHERRY="$current_arg";
    fi;;

Our next argument allows zero or more values. Once again we use the regex to check for a leading dash; if there is none, we know it's a value. Instead of checking just once with an if statement, though, we loop over the following arguments to see if there are more values to save. We do this with a while loop, shifting out the current argument before checking the next one. Each additional value is appended to the values already collected for the -f flag.

"-f"|"--fig" )

    # while the new current arg is not the next arg, parse next value
    while [[ ! "$current_arg" =~ ^-{1,2}.* ]] && [[ $# -gt 0 ]]; do
        FIG="$FIG $current_arg";
        shift;
        current_arg="$1";
    done;;  
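
Below is a runnable sketch of the zero-or-more pattern. The function parse_fig is hypothetical, and it makes one deliberate change: values are collected in an array rather than a space-separated string, so a value that contains spaces stays in one piece.

```bash
#!/usr/bin/env bash
# parse_fig is a hypothetical wrapper around the "-f" case only.
parse_fig() {
    FIG=()
    while [ $# -gt 0 ]; do
        local opt="$1"
        shift
        case "$opt" in
            "-f"|"--fig" )
                # keep consuming words until the next argument or the end
                while [ $# -gt 0 ] && [[ ! "$1" =~ ^- ]]; do
                    FIG+=("$1")
                    shift
                done;;
        esac
    done
}

parse_fig -f a "b c" -g
printf '%d values\n' "${#FIG[@]}"   # 2 values

parse_fig -f -g
printf '%d values\n' "${#FIG[@]}"   # 0 values
```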

The last case logic simply checks whether an argument was passed and sets a Boolean flag if it was. An empty variable is treated as false, so all we need to do here is assign a value (true) to set the flag for later processing. We are now done with the case logic, so we need to shift out the next argument if it was a value, since we are done with it.

"-g"|"--grape" )
    GRAPE="true";;
"-k"|"--kiwi" )
    KIWI="true";;

The last thing we do is check to see if the required argument cherry was given and then display all the values we collected:

if [[ -z "$CHERRY" ]]; then
    echo "-c is a required argument";
    exit 1;
fi

echo "a is $APPLE";
echo "b is $BANANA";
echo "c is $CHERRY";
echo "f is $FIG";
echo "g is $GRAPE";
echo "k is $KIWI";  
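
The required-argument pattern, meaning the 'true' default in the cherry case plus the check after the loop, can be sketched as a single function. The name parse_cherry is hypothetical, and it returns 1 rather than calling exit so that it is easy to test.

```bash
#!/usr/bin/env bash
# parse_cherry is a hypothetical wrapper around the "-c" case only.
parse_cherry() {
    CHERRY=""
    while [ $# -gt 0 ]; do
        local opt="$1"
        shift
        case "$opt" in
            "-c"|"--cherry" )
                CHERRY="true"    # records that -c was given at all
                if [ $# -gt 0 ] && [[ ! "$1" =~ ^- ]]; then
                    CHERRY="$1"  # an explicit value replaces the default
                    shift
                fi;;
        esac
    done
    # after all arguments are processed, enforce the requirement
    if [[ -z "$CHERRY" ]]; then
        echo "-c is a required argument" >&2
        return 1
    fi
}

parse_cherry -c     && echo "c is $CHERRY"   # c is true
parse_cherry -c red && echo "c is $CHERRY"   # c is red
parse_cherry -g     || echo "parse failed"
```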

The full code for this can be viewed on my GitHub here. I've also added a help flag to show argument examples.
