Loading Data Into Bash Variables

Tuesday, July 7th, 2009

“Go away or I will replace you with a very small shell script.” For those unfamiliar with Unix and Linux environments, bash is the command-line shell that is standard on many distributions. These examples grew out of the challenges of automating EC2 processes, but the basic principles can be applied more generally. My goal is simply to collect the options I need most often in a single place. As I continue to automate routine tasks, I might just be able to replace myself, or others, with a series of scripts!

The Gaps

The Solutions

Loading data from a URL

With Amazon EC2, many startup values are available via an HTTP request to the internal address instance-data.ec2.internal. If you want to know more about these values, the Developer Guide is a good resource.

#!/bin/bash
MY_INSTANCE_ID=$(wget -q -O - http://instance-data.ec2.internal/latest/meta-data/instance-id)
echo "$MY_INSTANCE_ID"

This script grabs the instance id, stores it in a variable, and prints it back out. It captures the output of wget through command substitution, running wget in quiet mode (-q) with the output file set to standard output (-O -). The second dash is what sends the data to standard output, so don’t forget it! Now, anywhere in our script that we want the instance id for string comparison, logging or whatever, we have it!
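
If the request fails (for example, when the script runs somewhere other than EC2), the variable simply ends up empty. Here is a minimal sketch of one way to guard against that. The function name fetch_or_default and the fallback value are my own illustrations, not anything provided by EC2 or wget.

```shell
#!/bin/bash
# Hypothetical helper: print the body returned by URL, or DEFAULT if the request fails.
fetch_or_default() {
	local url=$1 default=$2 value
	# -T 2 sets a two-second timeout and -t 1 allows a single attempt,
	# so a failed request returns quickly instead of retrying.
	if value=$(wget -q -T 2 -t 1 -O - "$url" 2>/dev/null) && [ -n "$value" ]; then
		printf '%s\n' "$value"
	else
		printf '%s\n' "$default"
	fi
}
```

A script could then call it as MY_INSTANCE_ID=$(fetch_or_default http://instance-data.ec2.internal/latest/meta-data/instance-id unknown) and test for the value "unknown" before going any further.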

Loading data from a file

What if the data we want to load is in a file on disk? This method is not good for processing giant Apache access logs, but for smaller text files it will work just fine.

#!/bin/bash
# mapfile (bash 4 and later) reads one line of the file into each array element
mapfile -t FILE_DATA < file_data.txt
for ((I = 0; I < ${#FILE_DATA[@]}; I++))
	do
		echo "$I" "${FILE_DATA[$I]}"
	done

What’s going on? The file’s contents are loaded into a bash array called FILE_DATA, one line per element. The script then loops over the array indexes with a for loop and, within the loop, prints the current index followed by the line stored at that index. This is roughly equivalent to running cat -n file_data.txt from the shell directly (except that cat -n starts counting at 1 rather than 0), but it gives us the flexibility to do further processing with the string contained in the variable.
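
For files too large to hold comfortably in an array, a common alternative is to read one line at a time. The sketch below is only an illustration of that idea, and the function name number_lines is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print each line of a file prefixed with its line number.
number_lines() {
	local file=$1 line n=0
	# IFS= preserves leading and trailing whitespace; -r keeps backslashes literal.
	# The [ -n "$line" ] test picks up a final line that lacks a trailing newline.
	while IFS= read -r line || [ -n "$line" ]; do
		n=$((n + 1))
		printf '%d %s\n' "$n" "$line"
	done < "$file"
}
```

Because only the current line is held in memory, this approach scales to big logs, at the cost of not being able to jump to an arbitrary line by index.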

Loading data from the user

Obviously this is not ideal for a process that runs from a cron job. However, when a script is run by a user, they often need to tweak something about the way it runs that can’t be detected automatically. In this case, you’ll want the user to key the data directly into your script.

#!/bin/bash
read -r -p "Enter Something: " VARIABLE
echo "$VARIABLE"

This example uses read with the optional prompt (-p) flag. This causes the text in the quotes to be displayed in the user’s terminal window before the input is read. Bash writes the prompt to standard error, and only when input is coming from a terminal.
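
Often you want a sensible default when the user just presses Enter. The following is a small sketch of that pattern. The function name ask_with_default and the way the default is shown in brackets are my own choices for illustration.

```shell
#!/bin/bash
# Hypothetical helper: prompt for a value and fall back to DEFAULT on empty input.
ask_with_default() {
	local prompt=$1 default=$2 answer
	# The prompt goes to standard error, so it stays visible even when the
	# function's output is captured with $( ).
	read -r -p "$prompt [$default]: " answer
	# ${answer:-$default} expands to the default when answer is empty or unset.
	printf '%s\n' "${answer:-$default}"
}
```

A script would use it as NAME=$(ask_with_default "Enter name" guest), and because the answer is read from standard input, the same function also works when input is piped in.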

Loading data from the command line

A step further is to let the user pass in data on the command line at run time. This of course can also be automated if needed. The following example uses getopts to parse the parameters that were passed in.

#!/bin/bash
OPT_A=0
OPT_B='Undefined'
while getopts ":ab:" OPTION
do
	case $OPTION in
	a ) OPT_A=1 ;;
	b ) OPT_B=$OPTARG ;;
	esac
done
shift $(($OPTIND - 1))
echo $OPT_A $OPT_B

The example script takes two different parameters: a flag “-a” that stands alone and a flag “-b” that expects data. Default values are provided for each, which has the effect of making all flags optional. Using the flag -a would likely toggle a specific behavior within your script, perhaps loading a specific configuration file instead of the default one. In the option string passed to getopts, a colon “:” after a letter means that flag takes a value, while the leading colon tells getopts to suppress its own error messages. If you wanted to collect data for every flag, you would simply add a colon after each one, ‘a’ in this example, and update the case statement to reflect your expectation of data being present in $OPTARG. See the modified script below for clarification.

#!/bin/bash
OPT_A=0
OPT_B='Undefined'
while getopts ":a:b:" OPTION
do
	case $OPTION in
	a ) OPT_A=$OPTARG ;;
	b ) OPT_B=$OPTARG ;;
	esac
done
shift $(($OPTIND - 1))
echo $OPT_A $OPT_B
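
Because the option string starts with a colon, getopts stays silent about bad input and leaves the reporting to the script. Here is a sketch of how the case statement might handle that, wrapped in a function I have named parse_args purely for illustration. The error messages are likewise my own wording.

```shell
#!/bin/bash
# Hypothetical wrapper around the getopts loop, with error handling added.
parse_args() {
	# OPTIND is made local and reset so the function can be called repeatedly.
	local OPTION OPTIND=1 OPT_A=0 OPT_B='Undefined'
	while getopts ":ab:" OPTION; do
		case $OPTION in
			a ) OPT_A=1 ;;
			b ) OPT_B=$OPTARG ;;
			# With a leading colon, a missing value yields ":" and an
			# unknown flag yields "?"; OPTARG holds the offending letter.
			: ) echo "Option -$OPTARG requires a value" >&2; return 1 ;;
			\? ) echo "Unknown option -$OPTARG" >&2; return 1 ;;
		esac
	done
	# Drop the parsed flags so only the remaining arguments are left.
	shift $((OPTIND - 1))
	echo "$OPT_A $OPT_B" "$@"
}
```

Calling parse_args -a -b foo rest prints “1 foo rest”, while parse_args -b (with no value) prints an error to standard error and returns a non-zero status.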

But wait… there’s more!

There’s also a simpler way to pass data in: bash stores each argument from the command line in the positional parameters $1, $2, $3, $4 and so on.

#!/bin/bash
echo $2 $1

The script above, when run as “./test_script hello world”, will output “world hello”. This method can be handy for scripting quick tasks where you always use the same series of parameters. For example, always adding the flags “-la” to “ls”, as demonstrated below.

#!/bin/bash
ls -la $1
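
One thing to watch with positional parameters is a path that contains spaces, which an unquoted $1 would split into several words. The sketch below quotes the parameter and supplies a default. The function name list_dir is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: long-list a directory, defaulting to the current one.
list_dir() {
	# ${1:-.} falls back to "." when no argument is given or it is empty.
	local dir=${1:-.}
	# Quoting keeps a path with spaces as a single argument; -- stops
	# ls from treating a path that begins with a dash as an option.
	ls -la -- "$dir"
}
```

Run with no arguments it lists the current directory, just as the unquoted version does, but list_dir "My Documents" now works as expected too.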

Script Configuration

So now that we can get different bits of data from all these different sources, what if all my scripts leverage the same data? Can’t I just have it as a single configuration file that I edit once? YES! This next example does just that. While it doesn’t technically load data into a variable, it does allow you to encapsulate your code, including a file full of variable assignments, into logical chunks. In my case, I was looking to avoid editing multiple scripts when configuration changes were needed.

First I created my configuration file, my_script.cfg, in the same directory from which I run the example script below.

# Comments are allowed
OPT_1='Ubuntu'
OPT_2='Linux'
OPT_3='64bit'

Now the script that uses the configuration file above.

#!/bin/bash
OPT_1='RedHat'
OPT_2='Linux'
OPT_3='i386'
if [ -f my_script.cfg ]; then
	. my_script.cfg
fi
echo $OPT_1 $OPT_2 $OPT_3

Dissecting the script, you’ll see that I first set some default values. Next, the code checks for the existence of the configuration file. If found, it is included with the dot (.) command, which overrides the defaults. It’s important to note that the file is sourced, meaning bash executes it in the current shell, so you can actually run code within the configuration file. An EC2 instance might, for example, place all of the calls to instance-data.ec2.internal for metadata into a configuration file that’s simply included by the scripts that use that information.
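
To make the point about running code concrete, here is a self-contained sketch that writes a throwaway configuration file and then sources it. The temporary file and the use of uname -m are assumptions chosen so the example runs anywhere; an EC2 script would put its wget calls there instead.

```shell
#!/bin/bash
# Create a temporary configuration file for demonstration purposes only.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
# A plain assignment overrides the default
OPT_1='Ubuntu'
# A command substitution is executed at the moment the file is sourced
OPT_3=$(uname -m)
EOF

# Defaults, used for anything the configuration file does not set.
OPT_1='RedHat'
OPT_2='Linux'
OPT_3='i386'
if [ -f "$CFG" ]; then
	. "$CFG"
fi
echo "$OPT_1 $OPT_2 $OPT_3"
rm -f "$CFG"
```

OPT_2 keeps its default because the file never sets it, while OPT_3 ends up holding whatever uname -m reports on the machine running the script. The flip side of this flexibility is that a sourced file can run any command, so only include configuration files you trust.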

That’s it! Hope you find this resource helpful!

And for anyone looking to put those around you on alert, buy the t-shirt from Think Geek.

© 1998-2008 AF-Design, All rights reserved.