Loading Data Into Bash Variables
Tuesday, July 7th, 2009
For those unfamiliar with Unix and Linux environments, bash is the command line shell that is standard on many distributions. These examples grew out of challenges attempting to automate EC2 processes. These basic principles of course can be applied more generally as needed. My goal is to simply provide the options I need most often in a single place. As I continue to automate routine tasks, I might just be able to replace myself, or others, with a series of scripts!
The Gaps
- Loading data from a URL
- Loading data from a file
- Loading data from the user
- Loading data from the command line
- Script Configuration
The Solutions
Loading data from a URL
With Amazon EC2, many startup values are available via an HTTP request to the internal address instance-data.ec2.internal. If you want to know more about these values, the Developer Guide is a good resource.
    #!/bin/bash
    # Fetch the instance id from the EC2 metadata service and keep it in a variable
    MY_INSTANCE_ID=$(wget -q -O - http://instance-data.ec2.internal/latest/meta-data/instance-id)
    echo "$MY_INSTANCE_ID"
This script grabs the instance id, stores it in a variable, and prints it back out. The value comes from command substitution: bash runs wget and captures whatever it writes to standard output. The quiet flag (-q) suppresses wget’s progress messages, and -O - sets the output file to standard output. The dash after -O is what sends the data to standard output, so don’t forget it! Now, anywhere in our script that we want the instance id for string comparison, logging, or whatever, we have it!
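If the request fails, for example when the script runs somewhere other than EC2, the variable simply ends up empty. Here’s a minimal sketch of one way to guard against that. The function name fetch_or_default, the two-second timeout, and the single retry are my own assumptions for illustration, not something EC2 requires.

```shell
#!/bin/bash
# fetch_or_default URL DEFAULT
# Hypothetical helper (not from the original script): prints the body fetched
# from URL, or DEFAULT when wget fails or returns nothing.
fetch_or_default() {
  # -T 2 limits each network wait to 2 seconds, -t 1 disables retries
  if VALUE=$(wget -q -T 2 -t 1 -O - "$1" 2>/dev/null) && [ -n "$VALUE" ]
  then
    printf '%s\n' "$VALUE"
  else
    printf '%s\n' "$2"
  fi
}
```

Calling it as MY_INSTANCE_ID=$(fetch_or_default http://instance-data.ec2.internal/latest/meta-data/instance-id unknown) would leave the string “unknown” in the variable whenever the lookup fails, which is easier to spot in a log than an empty value.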
Loading data from a file
What if the data we want to load is in a file on the disk? This method is not good for processing giant Apache access logs, but with smaller text files it will work just fine.
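    #!/bin/bash
    # Read file_data.txt into an array, one element per line (mapfile needs bash 4 or newer)
    mapfile -t FILE_DATA < file_data.txt
    # Walk over the array indices and print each index followed by its line
    for I in "${!FILE_DATA[@]}"
    do
      echo "$I ${FILE_DATA[$I]}"
    done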
What’s going on? The file’s contents are loaded into a bash array called FILE_DATA. The script then loops over each element in the array using a for loop. Within the loop, we simply print the current index and then the line we loaded. This is roughly equivalent to running cat -n file_data.txt from the shell directly (although the array index starts at 0 rather than 1), but it gives us the flexibility to do further processing with the string contained in the variable.
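For files that are too big to hold in an array, a common alternative is to read one line at a time. The sketch below is one way to do that, assuming a helper I’ve called number_lines (a hypothetical name, not part of the script above).

```shell
#!/bin/bash
# number_lines FILE
# Hypothetical helper: prints every line of FILE prefixed with its line number,
# reading one line at a time so the whole file is never held in memory.
number_lines() {
  N=0
  # IFS= and -r keep leading whitespace and backslashes intact;
  # the extra test picks up a final line that has no trailing newline
  while IFS= read -r LINE || [ -n "$LINE" ]
  do
    N=$((N + 1))
    printf '%s %s\n' "$N" "$LINE"
  done < "$1"
}

# Only run when a file name was passed on the command line
if [ "$#" -gt 0 ]; then
  number_lines "$1"
fi
```

Unlike the array version, the numbering here starts at 1, so the output lines up with what cat -n reports.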
Loading data from the user
Obviously this is not ideal for a process that runs as a cron job. However, if a script is being run by a user, they often need to tweak something about the way it runs that can’t be detected automatically. In this case, you’ll want the user to key the data directly into your script.
    #!/bin/bash
    # Prompt the user and store the reply; -r keeps backslashes in the input as typed
    read -r -p "Enter Something: " VARIABLE
    echo "$VARIABLE"
This example uses read with the optional prompt (-p) flag, which displays the quoted text in the user’s terminal window before waiting for input.
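Often you’ll want the user to be able to just press Enter and accept a sensible default. Here’s a small sketch of that idea. The function name ask_with_default is my own invention, and I’ve printed the prompt with printf instead of -p so the same code also runs in shells whose read lacks that flag.

```shell
#!/bin/bash
# ask_with_default PROMPT DEFAULT
# Hypothetical helper: shows PROMPT with the default in brackets, reads one
# line from standard input, and prints the reply (or DEFAULT if it was empty).
ask_with_default() {
  # The prompt goes to standard error so it is not captured by $( ... )
  printf '%s [%s]: ' "$1" "$2" >&2
  ANSWER=""
  IFS= read -r ANSWER || true
  if [ -z "$ANSWER" ]; then
    ANSWER=$2
  fi
  printf '%s\n' "$ANSWER"
}
```

A script could then call REGION=$(ask_with_default "Region" "us-east-1") and carry on with whichever value the user settled on. The region name here is only an example value.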
Loading data from the command line
A step further is to let the user pass in data on the command line at run time. This of course can also be automated if needed. The following example leverages getopts to parse the parameters that were passed in.
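    #!/bin/bash
    # Defaults, used when an option is not supplied
    OPT_A=0
    OPT_B='Undefined'
    # -a is a plain flag; -b takes a value (note the colon after b)
    while getopts ":ab:" OPTION
    do
      case $OPTION in
        a ) OPT_A=1 ;;
        b ) OPT_B=$OPTARG ;;
      esac
    done
    # Drop the parsed options so $1 is the first remaining argument
    shift $((OPTIND - 1))
    echo "$OPT_A $OPT_B"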
The example script takes two different options: a flag “-a”, which takes no value, and an option “-b”, which expects data. Default values are provided for both, which has the effect of making every option optional. Using the flag -a would likely toggle a specific behavior within your script, perhaps loading a specific configuration file instead of the default one. If you want to collect data for an option, simply add a colon “:” after its letter, ‘a’ in this example, in the option string passed to getopts. You would then update the case statement to reflect your expectation of data being present in $OPTARG. See the modified script below for clarification.
    #!/bin/bash
    OPT_A=0
    OPT_B='Undefined'
    while getopts ":a:b:" OPTION
    do
      case $OPTION in
        a ) OPT_A=$OPTARG ;;
        b ) OPT_B=$OPTARG ;;
      esac
    done
    shift $((OPTIND - 1))
    echo "$OPT_A $OPT_B"
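Because the option string starts with a colon, getopts runs in silent mode: it prints no error messages of its own and instead hands the problem to your case statement. An unknown option arrives as “?” and a missing value arrives as “:”, with the offending letter in $OPTARG either way. The sketch below adds those two cases to the first example. Wrapping the loop in a function called parse_opts and collecting the leftovers in REMAINING are my own choices for illustration.

```shell
#!/bin/bash
# parse_opts ARGS...
# Hypothetical wrapper around the getopts loop: sets OPT_A, OPT_B and
# REMAINING, and returns 1 on an unknown option or a missing value.
parse_opts() {
  OPT_A=0
  OPT_B='Undefined'
  OPTIND=1   # reset so the function can be called more than once
  while getopts ":ab:" OPTION
  do
    case $OPTION in
      a ) OPT_A=1 ;;
      b ) OPT_B=$OPTARG ;;
      \? ) echo "Unknown option: -$OPTARG" >&2; return 1 ;;
      : ) echo "Option -$OPTARG requires a value" >&2; return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  REMAINING="$*"   # whatever is left after the options
}

parse_opts "$@" || exit 1
echo "$OPT_A $OPT_B"
```

Without the two extra cases, a typo such as -z or a bare -b is silently ignored and the script carries on with its defaults.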
But wait… there’s more!
There’s also a simpler way to pass data in: bash stores the command-line arguments in the positional parameters $1, $2, $3, $4 and so on.
    #!/bin/bash
    echo $2 $1
The script above, when run as “./test_script hello world”, will output “world hello”. This method is handy for wrapping quick tasks that you always run with the same set of flags. For example, the script below adds the flags “-la” to “ls”.
    #!/bin/bash
    # Quote $1 so a path containing spaces is passed to ls as one argument
    ls -la "$1"
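Positional parameters can also be given a fallback for when the caller leaves them out. Here’s a minimal sketch of the ls wrapper with a default directory. The function name list_dir is just a label I’ve picked for the example.

```shell
#!/bin/bash
# list_dir [DIRECTORY]
# Hypothetical wrapper: long listing of DIRECTORY, or of the current
# directory when no argument is given.
list_dir() {
  # ${1:-.} expands to the first argument, or "." if it is missing or empty;
  # the quotes keep a path containing spaces together as one argument
  ls -la -- "${1:-.}"
}

list_dir "$@"
```

The “--” tells ls that everything after it is a file name, so a directory whose name begins with a dash isn’t mistaken for an option.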
Script Configuration
So now that we can get different bits of data from all these different sources, what if all my scripts leverage the same data? Can’t I just have it as a single configuration file that I edit once? YES! This next example does just that. While it doesn’t technically load data into a variable, it does allow you to encapsulate your code, including a file full of variable assignments, into logical chunks. In my case, I was looking to avoid editing multiple scripts when configuration changes were needed.
First I created my configuration file, my_script.cfg, in the same directory from which I run the example script below.
    # Comments are allowed
    OPT_1='Ubuntu'
    OPT_2='Linux'
    OPT_3='64bit'
Now the script that uses the configuration file above.
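    #!/bin/bash
    # Defaults, used when the configuration file is missing or leaves a value out
    OPT_1='RedHat'
    OPT_2='Linux'
    OPT_3='i386'
    # Source the configuration file from the current directory if it exists
    if [ -f my_script.cfg ]; then
      . ./my_script.cfg
    fi
    echo "$OPT_1 $OPT_2 $OPT_3"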
Dissecting the script, you’ll see that I first set some default values. Next, the code checks for the existence of the configuration file and, if it is found, includes it with the “.” (source) command. It’s important to note that the file is included rather than parsed, because that means any code within the configuration file actually runs in the current shell. An EC2 instance might, for example, place all of the calls to instance-data.ec2.internal for metadata into a configuration file that’s simply included in the scripts that use that information.
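If several scripts share the same defaults-then-override pattern, it can be wrapped in a function. This is only a sketch: the name load_config and the idea of passing the file path as an argument are my own assumptions, and the default values are the ones from the example above.

```shell
#!/bin/bash
# load_config FILE
# Hypothetical helper: sets the defaults, then sources FILE if it is
# readable so that any assignments in it override those defaults.
load_config() {
  OPT_1='RedHat'
  OPT_2='Linux'
  OPT_3='i386'
  if [ -r "$1" ]; then
    # The file runs in the current shell, so treat it as code, not just data
    . "$1"
  fi
}

# Use the path given on the command line, or ./my_script.cfg by default
load_config "${1:-./my_script.cfg}"
echo "$OPT_1 $OPT_2 $OPT_3"
```

Passing a path that contains a slash, such as ./my_script.cfg, matters: given a bare file name, the “.” command searches the directories in $PATH, so it could pick up a different file than the one you meant.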
That’s it! Hope you find this resource helpful!
And for anyone looking to put those around you on alert, buy the t-shirt from Think Geek.