
Getting Ready to Refactor

If you're going to do it, do it right

by William Ryan

About a month ago Alex Yakhnin introduced me to Refactoring. He was telling me about a new job he had recently taken and some of the things they were doing there. From the first time I saw some of his work I've had a deep respect for his programming skills, so I decided to look into Extreme Programming, Refactoring and Unit Testing a little more. I was already sold on the concept of test-driven development, and after reading John Robbins's Debugging Applications for Microsoft .NET and Microsoft Windows, the best book of its kind, I already had a decent background in unit testing and automating the testing process. What I hadn't had much exposure to was Refactoring. I kind of thought it was one of those trendy things everyone gets into and then forgets about, like XML Comments ;-). Nonetheless he made it sound so cool I wanted to pursue it. Since then, I feel like a complete slacker for waiting this long to get up to speed on it.

After getting Whidbey installed and seeing a few demos, I started my pursuit with Whidbey's, I mean Visual Studio 2005's, Refactoring Tool. Let's just say it was anticlimactic. To be honest, the fewer bad habits you have, the less need you'll have for refactoring, but we all inherit code that wasn't optimized, have code that just keeps growing, code stuff in a hurry, make compromises, etc. So I guess in some dream world you'd never need refactoring, but in any real-world scenario you'll probably find that there's a lot of upside to it. Anyway, two of the tool's main features are Encapsulate Field and Extract Method. Encapsulate Field turns a public variable into a Property. As I've mentioned before, I think public variables are unforgivable, so fortunately I have little need for it. Extract Method lets you cut a block of code out of a method, move it into a new method of its own, and call the new method from where the code used to be. As a rule I try to keep my code blocks under 30 lines, so this wasn't much use to me either. I immediately thought "Refactoring can't be that big of a deal if this is all there is to it. How many developers use public variables over properties? How many developers don't try to limit the size of their code blocks?" Well, the second one is easy to slip up on here and there, but it's usually something you go back and clean up quickly, so I just didn't get it. Fortunately, some of the wonderful ladies at Addison-Wesley gave me a copy of Martin Fowler's SUPERB book Refactoring: Improving the Design of Existing Code. Since then, learning how to do it and implement it correctly has occupied a lot (too much, according to my girlfriend) of my time.
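To make those two refactorings concrete, here is a minimal sketch. It is written in Python rather than with the Visual Studio tool, and every name in it (Account, total_charges, the statement functions) is made up for illustration. It only shows the shape of the change: the public field goes behind a property, and the summing loop moves into its own method.

```python
class AccountBefore:
    """Before Encapsulate Field: callers read and write the field directly."""

    def __init__(self, balance):
        self.balance = balance  # public variable


class Account:
    """After Encapsulate Field: the value is reached only through a property."""

    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        # Behavior is unchanged for now; the property simply gives one
        # place to add validation or logging later.
        self._balance = value


def statement_before(account, charges):
    """Before Extract Method: the summing loop is inlined in the caller."""
    total = 0
    for charge in charges:
        total += charge
    return [
        "Statement",
        f"Charges: {total}",
        f"Balance after charges: {account.balance - total}",
    ]


def total_charges(charges):
    """The extracted method: the loop that used to live in the caller."""
    total = 0
    for charge in charges:
        total += charge
    return total


def statement_after(account, charges):
    """After Extract Method: the caller asks the new method for the total."""
    total = total_charges(charges)
    return [
        "Statement",
        f"Charges: {total}",
        f"Balance after charges: {account.balance - total}",
    ]
```

Callers keep writing account.balance in both versions, and statement_before and statement_after return the same lines for the same input, which is the whole point: the structure changes, the observable behavior does not.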

Before I continue, let me mention a few things. Refactoring is not some simple thing you do to your code. If you haven't put some effort into learning the subject, you don't know it. It's not something you want to do halfheartedly. In many instances you got into a mess because you didn't want to acknowledge your bad habits, you were willing to make compromises on your code, you weren't willing to spend enough time thinking about your code up front, or any of a bunch of other problems. If you are going to take the time to "Improve the design of your existing code" then you really ought to think it out and do the best job you can at it.

The fact is that Refactoring, by definition (which I'll get to in a minute), is improving the design of existing code. As such, you're going to need to understand your design, and improving it will probably mean changing it in more than a few places. To do anything of substance, you are going to have to learn something about refactoring and you are going to have to learn a lot about the environment you are coding in. Often bad design isn't the result of carelessness as much as ignorance of the programming language. In addition, Refactoring isn't a magic bullet. It can help many things and vastly improve your code, but it's only as good as the thought you put into it. Bad design is bad design, even if it's refactored bad design. Finally, Refactoring involves risk. You are taking a stable or somewhat stable code base and redesigning it. This inherently creates risk. Furthermore, in order to do anything of substance, you are going to have to make some serious changes to your code, and this carries with it the possibility that you are going to break it. As such, Refactoring without Testing (and automated testing if you want to be serious about it) is suicide. In summary: learn Refactoring, learn your language in depth, think seriously about your existing design and your future design, and most importantly TEST!
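As a rough illustration of what "TEST!" means in practice, the sketch below pins down the current behavior of a small function before anyone touches its internals. It uses Python's built-in unittest module as a stand-in for a tool like NUnit. The format_name function and its test cases are hypothetical, invented only to show the pattern.

```python
import unittest


def format_name(first, last):
    """Hypothetical code we intend to refactor later."""
    return last.strip().upper() + ", " + first.strip().title()


class FormatNameTests(unittest.TestCase):
    """Record what the code does today, so a refactoring that changes
    the observable behavior fails loudly."""

    def test_trims_and_fixes_case(self):
        self.assertEqual(format_name(" ada ", "lovelace"), "LOVELACE, Ada")

    def test_empty_first_name_is_kept_as_is(self):
        # Possibly not the behavior we want, but it is the behavior we
        # have, and a refactoring must not change it by accident.
        self.assertEqual(format_name("", "byron"), "BYRON, ")
```

Saved as a module, these tests can be run with python -m unittest before and after every change. If they pass both times, the refactoring has not altered the behavior they cover.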

So what is Refactoring? According to Fowler, "Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior." Now you'll notice that the focus of Refactoring is existing code. However, by doing it correctly, you'll make future code a lot easier to write and you'll reduce or eliminate the age-old problem of continuing down the wrong path because to do otherwise would break existing code. After you've refactored, you should think about the lessons learned and apply essentially the same principles to writing new code correctly the first time. The more you learn from your mistakes, the better you'll be able to write in the future.

Why Refactor? According to Fowler, there are four primary reasons, each of which is compelling in its own right. I highly encourage you to read the original, as I'm merely paraphrasing; the full sections are on pages 55-58 of Refactoring:
1) Refactoring Improves the Design of Software
  No matter how good your design is originally, bugs will probably be found. In addition, customers will want functionality added or changed. Many times there is tremendous pressure to get it done immediately, and I know of more than a few places where the prevailing mindset is that the salespeople already promised it, so get it out ASAP; the bugs can be fixed later. In this environment your design is almost guaranteed to deteriorate, and unless you have a lot of time to plan and implement changes, some deterioration is almost inevitable.
  Another fact is that the longer you work with something, the better you know it. So what appeared to be a good design to you a year ago is often below par to the you of today. One reason I hate SourceSafe is that it contains proof of how many bad ideas I had two years ago ;-). As we progress we learn, and it's not uncommon to be able to do something in a fraction of the time and length it took you a year ago. This again puts you in a situation where doing it right might conflict with your original design, and letting yourself get bogged down by a poor original design often leads to deteriorating code as well. On top of that, you probably won't make the decision to kill the bad design until you've already built on it. If you happened to time this perfectly, it was probably the result of luck more than skill.
  There are many other factors that can lead to code getting worse over time, but I think you get the idea here.
2) Refactoring Makes Code Easier to Understand
  This is huge! Back in college, one of my professors often badgered me about comments and variable names. I tried really hard to play by the rules, but he always had some snide comment to make about one, the other, or both. Few things were more grating than getting an "A" on a project but having it marked up with condescending comments about my comments or variable naming conventions. As I moved into the professional world I realized that, just like building GUIs, everyone is an expert. Everyone's naming conventions are the 'best' way. Some people take a minimalist approach, which leads to cryptic code. Some people go to the other extreme and have names so long that they fry your mind trying to follow them. Most people fall somewhere in between. Then there are comments. Some people comment everything. Some people comment most of the time. Some people never comment. Some people forget to update their comments (the worst of all situations, because stale comments are misleading). So in the end, I think the mature approach is coding so things are 'clear'. Not clear to you, but clear and intuitive to everyone. Ego and pride have no place here, and the world doesn't revolve around you. The bottom line is that no matter how great your conventions are, it's doubtful everyone else feels the same way, so another approach is needed. Following Microsoft's guidelines is a good idea, simply because they have a lot of experience at it. Using FxCop to check your variables and naming conventions is another good idea. But the best is refactoring so your code is clear, the methods and variables are self-describing, comments are up to date, and unit tests are included so they can tell the usage story. Face it, there are very few objects in the .NET Framework with acronyms, one or two letter names, or 12+ letter names. The DataAdapter is a glaring exception, but if you have method names that span 20 letters, something is probably wrong with the naming convention or the object design.
3) Refactoring Helps You Find Bugs
  This doesn't need much explanation. The more you know about your code and the more you improve the design, the more likely you are to spot bugs and eliminate them.
4) Refactoring Helps You Program Faster
  Of course it does. It takes a while to do, and takes some time to test, but it saves hours every time you modify the code afterward. I had an application with multiple methods that took in two Date parameters and returned a DataTable. I had at least 25 instances of this in my code, and each one differed by only one line, a stored procedure name. At the time I knew this was redundant, but I convinced myself that it was wise because I could ensure that each method could only call the stored procedure I wanted it to, which absolutely eliminated the risk of data getting mixed up if I made a mistake. To some degree the logic wasn't terribly flawed, because isolation did get me to that objective. However, it took time to write all 25 methods. When I realized I had not trapped the exceptions in the order I wanted (darn VB.NET for lack of reachability enforcement), I had to make changes to all 25 methods. I made two typos which broke the code, but the breakage wasn't readily visible because those paths were seldom exercised. In the end it caused a lot of maintenance headaches because anything I did had to be repeated 25 times, no matter how small. So I replaced all of them with one method that added a string parameter for the procedure name, went everywhere I called the old methods, and passed in the procedure name (encrypted of course, which was still quite easy to implement), and off I went. It took some debugging and testing, and a careful review of my calls, but I still had my objective met and I could certainly make modifications a LOT faster. This was one example of many, but it's a good one. If I hadn't changed this, every time I wanted to modify that routine I'd have had to waste time repeating the change in 25 places. That's more error prone (and any mistake takes time to detect, debug and fix) and takes a lot more time.
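The consolidation described above can be sketched roughly as follows. This is Python rather than VB.NET, and it is not the original code: run_procedure is a hypothetical stand-in for the data layer (it just echoes its arguments instead of returning a DataTable), the procedure names are invented, and the ALLOWED_PROCEDURES check is my own assumption about one way to keep the "only call the procedure I meant" guarantee. It is not the encrypted-name scheme mentioned in the text.

```python
from datetime import date

# Hypothetical procedure names; the set is an assumed safeguard, not
# something taken from the original application.
ALLOWED_PROCEDURES = {"usp_GetSales", "usp_GetReturns"}


def run_procedure(proc_name, start, end):
    """Hypothetical data layer: a real one would execute the stored
    procedure and return a table. This one echoes its inputs as one row."""
    return [{"proc": proc_name, "start": start, "end": end}]


# Before: one near-identical method per stored procedure.
def get_sales_before(start, end):
    if start > end:
        raise ValueError("start date must not be after end date")
    return run_procedure("usp_GetSales", start, end)


def get_returns_before(start, end):
    if start > end:
        raise ValueError("start date must not be after end date")
    return run_procedure("usp_GetReturns", start, end)


# After: a single method with the procedure name as a parameter, so the
# validation and error handling live in exactly one place.
def get_table(proc_name, start, end):
    if proc_name not in ALLOWED_PROCEDURES:
        raise ValueError(f"unknown procedure: {proc_name}")
    if start > end:
        raise ValueError("start date must not be after end date")
    return run_procedure(proc_name, start, end)
```

A call such as get_table("usp_GetSales", date(2005, 1, 1), date(2005, 1, 31)) returns the same rows get_sales_before did, and a change to the date check or the error handling now happens once instead of once per procedure.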

Well, this is pretty much the groundwork for what refactoring is and why you should do it. I'm going to start posting articles (hopefully daily) with specific examples of how to refactor. I'm going to use language-agnostic examples that apply to every language as well as examples specific to .NET. I'm going to try to cover everything in Fowler's book (which I highly recommend you buy) as well as many refactorings of my own. I'm also going to discuss using NUnit for testing, and I'll build the unit tests in with the examples so you can use the code out of the box. To keep it interesting, I'm going to focus on real problems I've come across and what I did to fix them. If along the way you have any questions or comments, please don't hesitate to let me know.
