The Importance of Scriptability

A bit of background

I work in an environment where we have a huge bunch of projects, and we all have different tasks in these projects. It's sometimes a bit chaotic, but somehow things seem to get resolved. One of our biggest problems is knowledge sharing: we run into trouble whenever person A needs to redo what person B did. This causes worries, head scratching, curses, WTF-moments and so forth. It's not a nice environment to be in, because mostly you either have to trust the originator to do it, or you have to beg that person repeatedly to show you how it's done. Most often it's just easiest for the originator to do the task again. Sadly, this means that one person sits on the knowledge and nobody else knows how that black-magic voodoo was performed.

How it became this way

Our organisation is very relaxed, maybe a bit too relaxed. We're trying to use a bunch of standard tools, but we're all people, and as such we do things a bit differently at best. At worst, we walk a completely different path that nobody else knows, and we're not good at sharing either. Most of us are somewhere in between: we use the toolset, but we keep our own little personal toolbox next to us and dive into it when we're not comfortable with the "official tools" (if there even is such a tool for the job).

One of the scenarios

The company deals with legacy data models and restructures them into a more standards-compliant data model. Often we also convert our client's legacy data into our structure. This process is highly organic, mainly because the data received from the client can be extremely diverse: Excel files, Access databases, CSV files, SQL scripts, and so forth. Often it's a combination of them all.
This means that the "lucky" guy who does this often resorts to whatever means are available. The CSV file is in the wrong character set, so it needs converting. Opening the file in the OpenOffice.org suite and resaving it usually cures this. The MS Access data is extracted using the command-line MDB-Tools package under Linux. Well, you get the picture.
What's not so obvious is that these methods are incredibly difficult to document, and especially to reproduce. The number of words you have to type in a wiki or README file to describe how you open up an MDB file in MS Access and export all tables to CSV data is incredible! Not to mention if you have to take screenshots to make it all understandable. And that only lasts until MS decides to change the wording and/or the whole user interface.
And this is only one step in the whole process. Imagine having 3 different MDB files, 5 Excel sheets, and some "random" CSV files. All of this needs to be massaged and formed until it fits a relational database structure such as MySQL or, heaven forbid, a beautifully normalised, constrained and optimised PostgreSQL database.
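
As a rough sketch of how two of the manual steps above could be written down instead of clicked through, the following uses iconv for the character set conversion and the mdb-tables/mdb-export commands from MDB-Tools for the Access export. The function names, the file names and the ISO-8859-1 source encoding are assumptions made for illustration; the real encoding depends on what the client sends.

```shell
#!/bin/sh
# Sketch only: function names and file names are made up for this example.

# Convert one CSV file from a given encoding to UTF-8.
# usage: csv_to_utf8 <from-encoding> <infile> <outfile>
csv_to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}

# Export every table of an MDB file to <outdir>/<table>.csv (needs MDB-Tools).
# usage: mdb_to_csv <file.mdb> <outdir>
mdb_to_csv() {
    mkdir -p "$2" || return 1
    # mdb-tables -1 prints one table name per line
    mdb-tables -1 "$1" | while IFS= read -r table; do
        [ -n "$table" ] || continue
        mdb-export "$1" "$table" > "$2/$table.csv"
    done
}
```

With these in a file, a call such as csv_to_utf8 ISO-8859-1 client.csv client_utf8.csv (hypothetical file names) replaces the "open in OpenOffice.org and resave" step, and the file itself documents what was done.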

Just add to the complexity

The scenario above describes just one little piece of it. Imagine that this has to be catered for in an organisation that consists of hundreds of developers. A large part of them will either contribute to the process or need to be able to redo or replicate it. This means that every person has to both have all of the software and be able to follow the procedure from some sort of instructions. Now throw a spanner (wrench) into the works by remembering that I said that every person likes to do things a bit "in their own way". This means that when person 1 introduces tool Z, persons 2 through n need to install it, or at least have it available for install (and have the permissions to install it). And then people have to learn to use it... you can see this snowballing into the trash bin quickly. It just isn't sustainable.

But wait, there's more!

If we move our focus away from this particular exercise for a moment, we can look at what actually propels the software world. At one extreme there's shrinkwrap software; at the opposite extreme there's consultingware. Joel on Software makes a nice comment about this in the article Set Your Priorities:
If you ever find yourself implementing a feature simply because it has been promised to one customer, RED DANGER LIGHTS should be going off in your head. If you're doing things for one customer, you've either got a loose cannon sales person, or you're slipping dangerously down the slope towards consultingware. And there's nothing wrong with consultingware; it's a very comfortable slope to slip down, but it's just not as profitable as shrinkwrap software.

The path from consultingware to shrinkwrap is long and uphill. Still, if we ignore marketing and sales for a moment, the concept of going from consultingware to shrinkwrap is quite simple: automate everything, simplify everything and cater for a greater audience. To cater for a greater audience you have to build in functions that are requested by the majority of the users, making sure that they can still adapt their business to our software.
But the real key is to automate everything and simplify things as much as you can. And these two things have one thing in common: scripts. Installation script, build script, upgrade script, maintenance script; the list can be almost endless.
If you let these small scripts evolve and live, you will eventually end up with a piece of software that's easy to deal with. In fact, the day the sales and marketing group wants you to dress the scripts up in Graphical User Interfaces (GUIs), it'll be a fairly trivial job. After all, you've got the logic sitting there already. This is evolution in a quite literal sense: you're letting your software evolve. It grows.

The possibilities are endless!

Not only do these small scripts allow you to grow your software, they also provide a huge bunch of other benefits. Let's look at a few...

  • Traceability - you can easily (everything is relative!) follow the script and see what's going on. You can abort the script if you need to, to examine the software at a certain stage.
  • Repeatability - This one is very important. Once you have your script you can run it millions of times (in a loop if you want) and, given the same input, it will provide the same results every time. Put a human to do the same monotonous task and you'll be faced with errors. Hence: Script = Good quality, Human repeating a task = Bad quality.
  • Documentability - This might not be obvious, but it's related to the first point, seen from a more human point of view. Ever heard a programmer joke about how badly the application is documented: "Oh, our application is fully documented in {insert your programming language here}, Hahah"? There's actually a valid point here: programming languages do document themselves, to a certain degree. A script will do this too, and if you stick in a few comments you'll be speaking straight to the next developer without any translation. And another good thing: this documentation isn't going to be lost. No more "The dog ate my wiki page!"-excuses.
  • Free-up-brain-capacitybility - your brain cycles are valuable; don't waste them on mundane tasks that a computer can do. Once the script is written, executing it when needed is a doddle, and it won't get bored or sloppy the way a human does. This allows you to focus on the more exciting tasks.
  • Remoteability - maybe strictly not a point of its own, but still important. You've probably done it: on the phone to a colleague/friend/in-law, you've tried to guide the person around the GUI of Windows to open an application and do some task that you know how to do so easily, but describing it over the phone is a nightmare. You can't see the dialogue boxes, you don't speak the same language and so forth. Now wouldn't it be nice to say, "Open the shell, type this, and then run this script." (Maybe this won't work on your in-laws, but it should work on your colleagues.)

How about that..?! Just benefits! Wonder why we even have GUI-based systems...!!? No, writing scripts does have negative sides too. For example, it can take a long time to get to a stage where you can actually write the scripts, and finding out all the options can be a bit of a chore. Luckily this is more of a one-time investment.
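
To make the traceability point a bit more concrete, here is a minimal sketch using the shell's own tracing options. The two steps and the STOP_AFTER variable are made up for the example; a real script would have its own steps.

```shell
#!/bin/sh
# Two made-up steps standing in for a real job.
step_one() { echo "step one done"; }
step_two() { echo "step two done"; }

# The body runs in a subshell, so the "set" options don't leak into the caller.
run_steps() (
    # -e: stop at the first failing command
    # -u: treat unset variables as errors
    # -x: print each command (to stderr) before running it
    set -eux
    step_one
    # STOP_AFTER is a hypothetical switch for stopping early to inspect things.
    if [ "${STOP_AFTER:-}" = "one" ]; then
        exit 0
    fi
    step_two
)
```

Calling run_steps runs both steps and shows every command as it goes; calling it as STOP_AFTER=one run_steps stops after the first step so you can examine the state at that point.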

How to get started

There are two things you really need to do. The first is to start writing a script for whatever task you find error-prone and/or mundane. The second is to exercise some self-discipline and not reach for that GUI-based application. Here are a few examples:

  • Don't open up MySQL Query Browser or phpMyAdmin (yuck!), use the mysql command prompt.
  • Don't open Photoshop/Paint Shop Pro/The Gimp to resize your images; Use ImageMagick instead.
  • Don't save a web page using your favourite browser; Use WGet instead.

And so forth. Once you stop running for your first GUI application, you'll start to discover the vast world of command line utilities.

A few sample scripts to get you started

I've recently been working with MySQL so I've got a fresh batch of tiny MySQL scripts that I'm using to free up my brain cycles, but I'll add a few more for good measure.
Connect to MySQL
The purpose is to stop having to remember all your usernames and passwords and settings etc.

mysql -umyuser -pmypassword mydatabase

Put that in a file called, for example, myconnect_mydatabase.sh. Make it executable and private by running chmod 700 myconnect_mydatabase.sh. Then run it by typing ./myconnect_mydatabase.sh.
Create a DB and its structure in one script
The purpose of this one is so you can easily recreate the database you're working on. Create a file, for example, create_mydatabase.sh

#!/bin/sh
## This script will create a DB and its schema and insert the default data into it.
echo "Creating the DB..."
## mysqladmin needs the "create" command; stop here if the DB can't be created.
mysqladmin -umyuser -pmypassword --default-character-set=utf8 create mydatabase || exit 1
echo "Created the DB."
## The SQLDONE delimiter is unquoted, so backticks inside the SQL must be escaped.
mysql -umyuser -pmypassword mydatabase <<SQLDONE
CREATE TABLE \`authors\`....
CREATE TABLE....
-- Done creating tables, now inserting default data.
INSERT INTO....
-- Done the SQL
SQLDONE
echo "Database schema created and populated!"
## End

Note that this script uses embedded SQL, so you have to escape the backticks, and use different comment styles depending on where you are: shell comments (#) outside the SQL block and SQL comments (--) inside it.
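
Both MySQL scripts above carry the password on the command line, where other users on the machine may be able to see it in the process list. One possible variation, sketched here under assumptions, is to keep the credentials in a private option file and point the client at it with --defaults-extra-file. The function name and the file location are made up for the example; myuser and mypassword are the same placeholders as before.

```shell
#!/bin/sh
# Sketch only: write_mysql_optfile and the file path are hypothetical names.

# Write a [client] option file that only the owner can read.
# usage: write_mysql_optfile <path> <user> <password>
write_mysql_optfile() {
    (
        umask 077    # a newly created file gets no group/other permissions
        printf '[client]\nuser=%s\npassword=%s\n' "$2" "$3" > "$1"
    ) && chmod 600 "$1"    # also tighten a file that already existed
}

# One-off setup, then connect without credentials on the command line:
#   write_mysql_optfile "$HOME/.my_mydatabase.cnf" myuser mypassword
#   mysql --defaults-extra-file="$HOME/.my_mydatabase.cnf" mydatabase
```

Note that --defaults-extra-file has to be the first option on the mysql command line. The same option works for mysqladmin, so the create script could use it too.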
Create GIF thumbnails of all JPEGs in a directory
ImageMagick is a very handy tool. It's a bit time-consuming to experiment and get it right, but once it's right, you're done.
This example is taken from ImageMagick v6 Examples -- Creating Thumbnails
This code creates a directory called thumbs and resizes all JPEG images in the current directory into GIF images of at most 100x100px. Put this in a file called, for example, make_thumbs.sh

mkdir -p thumbs
mogrify -format gif -path thumbs -thumbnail 100x100 *.jpg

Easy peasy - no need to open up those hefty GUI programs and no need to drag all those images in there and save them one by one (if you haven't located the batch-processing tools yet).
Download something to a specific directory
So you have friends that send you all sorts of "cool links", but you might not have time to look at them right now. The easiest thing is to create a little script that downloads the URL and stores it in a special location. So whenever you need to save a URL, you log into your machine, type the script name, paste the URL and off you go...
First you create your directory, something like mkdir /home/myusername/webdownloads/, then you put the next script into a file, say webdownload.sh:

#!/bin/sh
## Webdownload script
## Usage: ./webdownload.sh <url>
dlpath=/home/myusername/webdownloads/
echo "Downloading $1 to $dlpath"
## Quote the URL so characters like & and ? survive intact.
wget -q --directory-prefix="$dlpath" "$1"

Then simply run this like, ./webdownload.sh http://example.com/funnypic.jpg
And when you get home and/or find time to look at the funnies, just have a look in your special directory, and there it is. Easy peasy.

There you have it

Starting to write a few small scripts might change the way you go about your daily work. You'll notice after a while that you'll have a very lean but mean toolbox of small scripts that help you out. The same applies to creating a product from your "consultingware". Just keep writing those small scripts and they'll evolve to something bigger and better.
Note that the scripts presented on this page are just examples, and they're intended for Linux (or a Unix flavour of some sort). Some of them I haven't even tested (sorry). If you can't fix any errors that might lie in the scripts, I suggest you read up on shell script introductions. For MS Windows you have to alter them and use batch files. I don't like those, so I'm not bothering with translating the scripts. RTFM/STFN.
There are tons and tons of examples out there, and there's a good chance you can simply copy-paste a script and start modifying it.