Dec 30, 2008

Refactoring your code!

My current programming project is an Objective-C application for Mac OS X that generates climate model results in the form of a web site and a PDF. It’s actually a complete rewrite of an existing application.

Why rewrite? The original code was built over many months, is stringy, and is very hard to maintain. To get an idea of how hard, I’ve tried to rewrite this application five (yes, 5) times. The original application was designed to generate NCAR Command Language (NCL) scripts that generate hundreds of images based on climate model results. From there, the application generated makefiles that would continue the processing by running NCL for each script, converting the resulting PostScript images into JPEGs, and finally using FOP and DocBook to generate HTML and PDF documents with all the images and some related text.

There are several major weak points in this app. First, the conversion from PostScript to JPEG images was originally handled by ImageMagick. Unfortunately, I had a lot of trouble keeping this functionality working, so I switched to generating JavaScript code that uses Adobe Illustrator to do the conversions. In general, I was relatively happy with this solution (so much so that I gave up trying to get ImageMagick working). The unfortunate problem with using Illustrator for this work is that the simple makefile approach to the processing was no longer completely viable. I had to rewrite the makefiles to stop processing when the PostScript files were ready, and to allow processing to resume once the JPEGs were ready.

Another problem with the code is that the makefiles and scripts were location dependent. In other words, if I moved my files to another location, I would have to regenerate everything whenever I needed to rebuild the image set.
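One common way to avoid this kind of location dependence is to store every path relative to a project root and resolve it only when it is used. The sketch below illustrates the idea in Python rather than in makefiles or NCL. The function names and directory layout are hypothetical, and this is not necessarily how the application itself handles paths.

```python
from pathlib import Path


def relative_entry(project_root: Path, absolute: Path) -> str:
    """Record a file as a root-relative string (hypothetical helper)."""
    return absolute.relative_to(project_root).as_posix()


def resolve_entry(project_root: Path, entry: str) -> Path:
    """Turn a stored relative entry back into a full path at run time."""
    return project_root / entry


# The stored entry never mentions where the project lives on disk.
entry = relative_entry(Path("/data/run1"), Path("/data/run1/ps/temp.ps"))

# After moving the project, the same entry resolves under the new root.
moved = resolve_entry(Path("/archive/run1"), entry)
```

Because only the root changes when the files move, nothing that was generated earlier has to be regenerated.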

The most significant problem was the Objective-C code itself. I’m a self-taught programmer, so I have a lot of bad programming habits. As a result, the code was stringy, repeated itself often, and broke easily. Making simple changes, such as changing the contents of the makefiles, was very painful. Hence, when I wanted to do something new in the project, I tended to opt to write a new version of the code… but it’s a LOT of code to rewrite!

What’s different this time? I realized I needed some new images in the reports. Furthermore, I needed a way to start adding more images as I came up with new ideas. Plus, I was just plain sick of the old code not being what I really needed.

The new app is not yet complete, but I’ve changed how I’m writing the code. The main thing is that I’m taking the time to decide whether I can rewrite my code as I go by asking some simple questions: Am I repeating my code? Can I simplify the code by breaking out new methods? Can I join classes into a simpler class hierarchy? These questions are actually quite painful in many ways, often because the answer is “yes”. When this happens, I break down, stop moving forward, and see how I can improve the code. This often breaks existing code, and it can take significant time to propagate the changes through the rest of the code. In some ways, it is very much like the classic quote, “I would have written a shorter letter if I had more time”.

However, the benefits have been huge. Instead of many classes repeating code, I now have a good class hierarchy where I need it, with all common methods in the superclass. Instead of stringy code, I have much clearer and more readable code. It is still not perfect, but it is greatly improved. Debugging has become easier with better code isolation in methods. So far, I’ve simplified my original 14 NCL template scripts down to 4.
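To illustrate what moving common methods into a superclass looks like, here is a minimal sketch in Python rather than Objective-C. The class names, the naming rule, and the template strings are all hypothetical. They are not the application’s real classes and the templates are not real NCL.

```python
class PlotScript:
    """Hypothetical base class holding the code every plot type shares."""

    template = ""  # each subclass supplies only its own template text

    def __init__(self, variable):
        self.variable = variable

    def output_name(self):
        # Shared naming rule, written once instead of in every subclass.
        return f"{self.variable}_{type(self).__name__.lower()}.ps"

    def render(self):
        # Shared rendering step; only the template differs per subclass.
        return self.template.format(var=self.variable, out=self.output_name())


class ContourMap(PlotScript):
    template = "contour {var} -> {out}"


class TimeSeries(PlotScript):
    template = "timeseries {var} -> {out}"
```

With this layout, a change to the naming rule or the rendering step is made in one place, and adding a new kind of image means writing a small subclass instead of copying a whole class.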

I’ve also learned a few new things about debugging. Exceptions are something that I’ve never fully understood in terms of when they should be used and when I should be watching for them in code. I was hit by an NSMutableDictionary exception caused by trying to insert a nil object. The problem is that I use a lot of dictionary calls in my code, and neither the exception nor gdb, AFAIK, tells you where this occurs in the code. Implementing @try/@catch in most methods allows me to at least pinpoint the method. While not perfect, it certainly promotes the use of small and clear methods.
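The same idea of wrapping each method so a failure names the method it came from can be sketched in Python. Everything here is a hypothetical analogue: StrictDict only imitates a dictionary that refuses empty values, and report_failures stands in for a per-method @try/@catch. Python’s own tracebacks already report the location, so the wrapper exists only to mirror the technique.

```python
import functools


class NilInsertError(ValueError):
    """Raised when None is stored, imitating the nil-insert exception."""


class StrictDict(dict):
    """Hypothetical dictionary that rejects None values on assignment."""

    def __setitem__(self, key, value):
        if value is None:
            raise NilInsertError(f"attempt to insert None for key {key!r}")
        super().__setitem__(key, value)


def report_failures(func):
    """Re-raise NilInsertError with the failing function's name attached."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except NilInsertError as exc:
            raise NilInsertError(f"in {func.__qualname__}: {exc}") from exc

    return wrapper


@report_failures
def build_settings(title):
    settings = StrictDict()
    settings["title"] = title  # fails here when title is None
    return settings
```

The smaller the wrapped method, the more precisely the message narrows down where the bad insert happened, which is the same pressure toward small, clear methods described above.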

So, the lesson of this project is: don’t be afraid of refactoring. Do it as soon as the need arises. It will likely take more time up front, but the payoff may include shorter overall development time, greater stability, better readability, and an application that is easier to extend.