5 minute read

For the last few days I’ve been working in tech support for a large company: I’ve been hired to assist in backing up, mastering and restoring a couple of computers. As I pointed out the need to automate some trivial tasks, our manager gave us an Excel sheet containing data about the machines we need to work on.

The Excel sheet

I was given an Excel file containing four columns, and organized like this:

NAME SURNAME OLD HOSTNAME NEW HOSTNAME
John Carmack DOOM001 QUAKE001
John Romero DOOM002 QUAKE002
Fritz Box FRITZ!Box3490 FRITZ!Box7590

And I needed some tedious Batch code to match the old hostname with the new one and link them with the names, to efficiently name backup folders. I could have manually copied the data, except that the table was more than 80 rows long!

The solution

By looking at Microsoft Excel, however pretty it may be, I wasn’t going anywhere: I needed to get the data in an usable format. What about exporting it in CSV?

A CSV file (Comma Separated Values) is a format commonly used by datasheet programs to export raw data. The id.csv file generated by Excel from the table above looks like this:

NAME;SURNAME;OLD HOSTNAME;NEW HOSTNAME
John;Carmack;DOOM001;QUAKE001
John;Romero;DOOM002;QUAKE002
Fritz;Box;FRITZ!Box3490;FRITZ!Box7590

Now this is data I can manipulate!

Now a couple of solutions spring by: I could have created a program that crunched the data and created the necessary structures on the backup medium, or statically generate the Batch code. I went with the latter option, for comfort and ease of use.

I wanted the result to look like this:

IF %COMPUTERNAME%==DOOM001 SET %BACKUPNAME%=CARMACK.JOHN_QUAKE001

The implementation

To implement this tool I chose Ruby, which is a powerful, no fuss language. Let’s start by loading the CSV library and see how it works.

I whipped up this sample code:

require 'csv'

# Open the CSV file
input = CSV.read("id.csv") 

# Iterate on the CSV
input.each do |row|
    puts "Name: #{row[0]}, Surname: #{row[1]}"
end

And ran it. This was the result:

$ ruby csv2bat.rb
Name: NAME;SURNAME;OLD HOSTNAME;NEW HOSTNAME, Surname:
Name: John;Carmack;DOOM001;QUAKE001, Surname:
Name: John;Romero;DOOM002;QUAKE002, Surname:
Name: Fritz;Box;FRITZ!Box3490;FRITZ!Box7590, Surname:

Ok, so we have got a couple of things wrong:

  • The first row is garbage and should not be displayed
  • Columns are not separated correctly

The separator

The separator used by Microsoft Excel in this case was not a comma but a semicolon. To fix this issue you can either do Ctrl+H to replace all occurrences, which is inelegant, or think about a solution. Let’s look at the RDocs:

read(path, *options)

Use to slurp a CSV file into an Array of Arrays. Pass the path to the file and any options::new understands. This method also understands an additional :encoding parameter that you can use to specify the Encoding of the data in the file to be read. You must provide this unless your data is in Encoding::default_external(). CSV will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: "UTF-32BE:UTF-8" would read UTF-32BE data from the file but transcode it to UTF-8 before CSV parses it.

You may set any reading and/or writing preferences in the options Hash. Available options are:

  • :col_sep

    The String placed between each field. This String will be transcoded into the data’s Encoding before parsing.

Bang, looks like we can do it quite easily with no hit-and-miss substitutions. Our code now looks like this:

#!/usr/bin/ruby

require 'csv'

# Open the CSV file
input = CSV.read("id.csv", {:col_sep => ';'}) 

# Iterate on the CSV
input.each do |row|
    puts "Name: #{row[0]}, Surname: #{row[1]}"
end

Notice the {:col_sep => ','}: this is an hash, the quick way to do associative arrays in Ruby. :col_sep is a symbol, a way to name things that uses less memory than a string, but can easily be looked up and created on the fly. Our output is much prettier too:

$ ./csv2bat.rb
Name: NAME, Surname: SURNAME
Name: John, Surname: Carmack
Name: John, Surname: Romero
Name: Fritz, Surname: Box

The first row

I could manually strip the header from the file (in fact, originally I did), but this wouldn’t be elegant at all: let’s look at a quick way to do this automagically.

We could tackle the issue in different ways:

  • Check whether the row contains “NAME”, which shouldn’t occur elsewhere
  • Skip the first line

I decided to elaborate a bit on the first option and create a solution that could be useful in many cases:

#!/usr/bin/ruby

require 'csv'
key_index = 0 # Index of the key
key_blacklist = ["NAME"] # Items that should be skipped

# Open the CSV file
input = CSV.read("id.csv", {:col_sep => ';'}) 

# Iterate on the CSV
input.each do |row|
	key = row[key_index]
	puts "Name: #{row[0]}, Surname: #{row[1]}" unless key_blacklist.include?(key)
end

Now the output looks like this:

$ ./csv2bat.rb
Name: John, Surname: Carmack
Name: John, Surname: Romero
Name: Fritz, Surname: Box

In case I wanted to exclude my friend Fritz Box who doesn’t exist and has never been an Id employee, I could just add "Fritz"to my blacklist and the output would look like this:

$ ./csv2bat.rb
Name: John, Surname: Carmack
Name: John, Surname: Romero

The script could now be modified to accept those as command line parameters but that’s out of the scope of my discussion.

Getting a Batch file

The output is still not what I need: my goal was to generate Batch code straight out of my script. Let’s add the final touches and call it a day:

#!/usr/bin/ruby

require 'csv'

key_index = 0 # Index of the key
key_blacklist = ["NAME", "Fritz"] # Items that should be skipped

# Open the CSV file
input = CSV.read('id.csv', {:col_sep => ';'}) 

# Open the Batch file
File.open('csvlut.bat', 'w') do |out|
	# Iterate on the CSV
	input.each do |row|
		key = row[key_index]
        unless key_blacklist.include?(key)
			out.puts "IF %COMPUTERNAME%==#{row[2]} SET BACKUPNAME=#{row[1].upcase}.#{row[0].upcase}_#{row[3]}" 
		end
	end	
end

In case code’s not clear enough, I told the script to open a file in write mode, put a couple of lines of batch code in and terminate, relying on the fact that in Ruby everything is an object, so I can iterate on File.open() even if I’m opening only one file. The postfix unless, which is syntactic sugar for if !, was swapped with standard notation to reduce the length of line 17. The output looks like this:

$ ./csv2bat.rb
$ cat csvlut.bat
IF %COMPUTERNAME%==DOOM001 SET BACKUPNAME=CARMACK.JOHN_QUAKE001
IF %COMPUTERNAME%==DOOM002 SET BACKUPNAME=ROMERO.JOHN_QUAKE002

Wrapping up

Ruby feels like the optimal language to quickly perform text transformation and line operations on text files: it’s powerful enough to be up to up to the most tedious tasks, yet simple enough to whip up a quick-and-dirty solution that gets the job done in mere minutes.