Learning from ‘Working Effectively with Legacy Code’ by Michael C. Feathers
Working Effectively with Legacy Code by Michael C. Feathers is a classic software development text. Seven years after publication it is still completely relevant to working with code, and the approaches it advocates at the source code level are becoming essential at the component and system integration levels too, as our software systems become more complicated and polyglot and as we learn to fully automate our deployments and treat our infrastructure as code.
For an overview of the book itself, read my review of ‘Legacy Code’ on Amazon.co.uk. I describe the different sections and how each is useful in a particular context. The rest of this article will focus on other aspects of Legacy Code, particularly the ‘why’ of changing code, and the implications for Continuous Delivery, infrastructure-as-code, and deployment automation.
Thinking About Legacy Code
With a decade or so of hindsight, test-driven development (TDD) is probably misnamed. It should really be called ‘safe software development’ or ‘sustainable software development’ (or even just ‘software development’), because what TDD provides is the ability to evolve a system rapidly without fear of breaking code accidentally. Without the safety net of tests, a code base is effectively a minefield, with lurking code bombs waiting to explode around the developer, or (worse) around the end user when the system is live in Production.
Michael Feathers is very clear about the nature of legacy code: legacy code has little or no test coverage, and
code without tests is bad code.
Let’s consider the implication of this view. Even if your code was written yesterday, if there are no unit tests to characterise or define the behaviour of the code, then the code is legacy, and will be difficult to change without introducing bugs. On this basis, it becomes dangerous, disrespectful to others and perhaps even unprofessional to write code without tests.
We are also urged to expand our view of why we write tests, especially when working with legacy code: instead of specifying correctness, tests can be written to characterise the behaviour of code (p. 188). This is particularly helpful when we are breaking dependencies, as using tests to characterise behaviour helps to build an understanding of how untested code actually operates. Some of these tests should probably be discarded once the characterisation is complete, to avoid weighing down the test suite with tests which have served their purpose.
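As a minimal sketch of what a characterisation test looks like in practice (the PriceCalculator class and its discount rule are hypothetical, and the test uses JUnit 4): we assert a guessed value, run the test, and then copy the actual result from the failure message into the assertion, pinning down what the code really does today.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// A hypothetical legacy class with no tests: we don't yet know exactly what it does.
class PriceCalculator {
    double totalFor(int quantity, double unitPrice) {
        double total = quantity * unitPrice;
        return quantity >= 10 ? total * 0.95 : total;   // undocumented discount rule
    }
}

public class PriceCalculatorCharacterisationTest {

    // A characterisation test does not assert what the code *should* do;
    // it records what the code *actually* does, so any later change that
    // alters the behaviour is flagged immediately.
    @Test
    public void recordsCurrentDiscountBehaviourForLargeOrders() {
        // We first asserted a guess (100.0), ran the test, and then copied
        // the real result (95.0) from the failure output into the assertion.
        assertEquals(95.0, new PriceCalculator().totalFor(10, 10.0), 0.001);
    }
}
```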
Looking to Other Engineering Disciplines
In the largely abstract world of software, it is often helpful to use metaphors from physical engineering disciplines in order to clarify our approaches to writing (and re-writing) software. A test harness is a sub-component used in electrical engineering to replace the normal wiring loom (or ‘harness’) so that the main component can be tested in isolation (see photo).
An electrical test harness (http://www.flickr.com/photos/julianbleecker/292476285/)
The test harness usually has special additional features (such as extra pin-outs and diagnostic ports) to let us better inspect the system under test. In software terms, a test harness allows us to isolate a component from other components on which it will depend, so we use dummy or lightweight implementations of these dependencies for the test harness, and (often) additional or alternative diagnostics and logging information.
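In code, that isolation usually means programming against an interface and supplying a lightweight fake in the harness. The sketch below assumes hypothetical Mailer, RecordingMailer and OrderNotifier types and JUnit 4; the fake acts as the extra 'diagnostic port', recording every message so the test can inspect what the component under test actually did.

```java
import static org.junit.Assert.assertEquals;
import java.util.ArrayList;
import java.util.List;
import org.junit.Test;

// The dependency we want to replace inside the test harness.
interface Mailer {
    void send(String recipient, String body);
}

// Fake implementation: no real SMTP, but it records every message sent
// so the test can inspect what happened -- our diagnostic port.
class RecordingMailer implements Mailer {
    final List<String> sent = new ArrayList<String>();
    public void send(String recipient, String body) {
        sent.add(recipient + ": " + body);
    }
}

// The component under test, isolated from the real mail system.
class OrderNotifier {
    private final Mailer mailer;
    OrderNotifier(Mailer mailer) { this.mailer = mailer; }
    void orderShipped(String customerEmail, String orderId) {
        mailer.send(customerEmail, "Order " + orderId + " has shipped");
    }
}

public class OrderNotifierTest {
    @Test
    public void notifiesCustomerWhenOrderShips() {
        RecordingMailer mailer = new RecordingMailer();
        new OrderNotifier(mailer).orderShipped("alice@example.com", "42");
        assertEquals(1, mailer.sent.size());
        assertEquals("alice@example.com: Order 42 has shipped", mailer.sent.get(0));
    }
}
```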
Likewise, Michael Feathers uses the real-world image of a workbench clamp (or vise/vice) to show how tests can be used to hold tightly, or 'clamp', the existing behaviour of the code under test, just as a component in a workshop is clamped while being worked on to prevent unexpected movement.
A vice/vise/clamp (http://www.flickr.com/photos/60in3/2263199821/)
When working with legacy code, we want to ‘clamp’ the part of the code under test to enable us to work safely with it without unintended side effects.
Rapid Unit Tests
The author is absolutely clear about the need for rapidly-executing tests: a unit test which takes 100 milliseconds is ‘a slow test’. Unit tests should not call out to the file-system or databases; such tests are component tests, and should be run in a separate test group. In fact, it’s so important to have rapid feedback from unit tests that they should be run as often as possible: Michael Feathers was able (back in 2004/2005) to predict (p. 310) ‘hyper-aware’ IDE features such as ‘run tests on save’, provided by tools like Mighty-Moose. If you have not established a clean separation between different kinds of tests (unit, component, integration, etc.) then it’s likely that you are not running your unit tests often enough and therefore that bugs will creep into your code unnoticed. Keep your first-level tests super-quick, and run them as often as possible!
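One way to keep that separation explicit is to tag the slower tests and exclude them from the suite you run on every save. The sketch below uses JUnit 4's categories (available from JUnit 4.8); the class and category names are illustrative, and each type would normally live in its own file.

```java
import org.junit.Test;
import org.junit.experimental.categories.Categories;
import org.junit.experimental.categories.Categories.ExcludeCategory;
import org.junit.experimental.categories.Category;
import org.junit.runner.RunWith;
import org.junit.runners.Suite.SuiteClasses;

// ComponentTests.java -- marker interface for tests that touch a database,
// the file system, or the network.
public interface ComponentTests {}

// DiscountRuleTest.java -- a genuine unit test: pure in-memory logic.
public class DiscountRuleTest {
    @Test
    public void appliesDiscountToLargeOrders() { /* in-memory assertions only */ }
}

// OrderRepositoryTest.java -- talks to a real database, so it is tagged as a
// component test and excluded from the fast suite.
public class OrderRepositoryTest {
    @Category(ComponentTests.class)
    @Test
    public void savesAndReloadsAnOrder() { /* exercises the database */ }
}

// FastTestSuite.java -- runs only the untagged (fast) tests; wire this suite
// to 'run on save' or to every compile.
@RunWith(Categories.class)
@ExcludeCategory(ComponentTests.class)
@SuiteClasses({ DiscountRuleTest.class, OrderRepositoryTest.class })
public class FastTestSuite {}
```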
Structure of the Book
The book has three main sections. In the first section, The Mechanics of Change, Michael asks us to consider why we want to make changes to software: is the change to fix a bug, to implement a new feature, to improve the use of resources (memory, disk, CPU), or to improve the design via refactoring? Crucially, a change set committed to version control should contain only one of these change types, not a mixture; the change set then has a single responsibility, which makes future branching and merging easier.
We are introduced to the ‘Legacy Code Change Algorithm’, which goes like this:
- Identify change points
- Find test points
- Break dependencies
- Write tests
- Make changes and refactor
The book clearly explains how and why we tackle each step in the process; the ‘why’ of code change I found particularly helpful.
The first section also introduces concepts such as 'seams' (places where you can alter behaviour without editing the code in that place) and 'enabling points' (the places where you choose which behaviour is used), which struck me as very effective mental constructs with which to approach legacy code.
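A concrete illustration of an object seam, using the book's Subclass and Override Method recipe (the InvoiceSender class and its dispatch() call are hypothetical): the protected dispatch() method is the seam, and the place where we decide to construct TestableInvoiceSender rather than InvoiceSender is the enabling point.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// A hypothetical legacy class with a hard-wired call we cannot easily test.
class InvoiceSender {
    void sendMonthlyInvoices() {
        // ... build the invoices ...
        dispatch("invoice-batch");          // the call we want to alter in tests
    }

    // This method is an object seam: a subclass can change what dispatch()
    // does without editing sendMonthlyInvoices() itself.
    protected void dispatch(String batchName) {
        // talks to the real SMTP server in production
    }
}

// The enabling point: choosing to instantiate the test subclass instead of
// the production class selects the alternative behaviour.
class TestableInvoiceSender extends InvoiceSender {
    String lastBatch;
    @Override
    protected void dispatch(String batchName) {
        lastBatch = batchName;              // sense the call instead of sending email
    }
}

public class InvoiceSenderTest {
    @Test
    public void sendsTheMonthlyBatch() {
        TestableInvoiceSender sender = new TestableInvoiceSender();
        sender.sendMonthlyInvoices();
        assertEquals("invoice-batch", sender.lastBatch);
    }
}
```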
The second part of the book, Changing Software, consists of a set of scenarios in which you might find yourself when confronted with a codebase without tests; there are chapters such as ‘My Application Has No Structure’ and ‘This Class is Too Big…’ Use this part of Legacy Code when you have already characterised the situation you face, and want to go directly to a solution.
The third and final section, Dependency-Breaking Techniques, covers much of the same ground as part two, but from a different perspective: it is organised as a ‘cookbook’ of ‘recipes’, and the entries here together form a kind of toolbox for working with legacy code, allowing you to ‘dip in’ to particular solutions without needing to have read the earlier sections.
For me, the only real omission in Legacy Code is a discussion of dependency injection (DI) techniques and frameworks by name, and this is likely due to the timing of its publication; in 2004/2005, DI frameworks such as Spring were still relatively new, and Martin Fowler's bliki article on dependency injection had only recently been published. Fortunately, recipes such as Extract Interface and Push Down Dependency are effectively forms of DI, as they move direct dependencies out of a method or class, thereby enabling refactoring.
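A rough sketch of how Extract Interface shades into dependency injection (the SalesData and ReportGenerator names are hypothetical): once the dependency is expressed as an interface and passed in through the constructor, a DI framework, or a test, can supply any implementation it likes.

```java
// Before: the class constructs its own dependency, so tests cannot substitute it.
//   class ReportGenerator {
//       private final SalesDatabase db = new SalesDatabase();
//   }

// After Extract Interface: the dependency is expressed as an interface...
interface SalesData {
    double totalFor(String region);
}

// ...and pushed in through the constructor, which is constructor injection in
// all but name; any implementation (real, fake, or framework-provided) will do.
class ReportGenerator {
    private final SalesData salesData;

    ReportGenerator(SalesData salesData) {
        this.salesData = salesData;
    }

    String summaryFor(String region) {
        return region + ": " + salesData.totalFor(region);
    }
}
```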
Implications for Deployment Automation and Continuous Delivery
In the context of Continuous Delivery and end-to-end automation, we could re-formulate Feathers’ maxim (code without tests is bad code) as:
Automation without tests is bad automation
So your nice new script which spins up some funky EC2 instances or uses PowerCLI to automate some on-premise ESX VMs is already legacy unless it has test coverage. We need mock or fake implementations of cloud APIs, Nginx instances, Chef/Puppet servers, and firewall appliances in order to avoid the trap of legacy code: code without tests.
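A minimal sketch of what this might look like, assuming a hand-rolled ComputeApi interface and fake (none of these types come from a real cloud SDK): the provisioning logic becomes an ordinary class that can be exercised against the fake, with the real EC2 or vSphere implementation substituted only in production.

```java
import static org.junit.Assert.assertEquals;
import java.util.ArrayList;
import java.util.List;
import org.junit.Test;

// A thin interface over the cloud provider; the production implementation
// would call the real API, the fake below simply records requests.
interface ComputeApi {
    void launchInstance(String instanceType);
}

class FakeComputeApi implements ComputeApi {
    final List<String> launched = new ArrayList<String>();
    public void launchInstance(String instanceType) {
        launched.add(instanceType);
    }
}

// The provisioning 'script' under test, now an ordinary testable class.
class WebTierProvisioner {
    private final ComputeApi api;
    WebTierProvisioner(ComputeApi api) { this.api = api; }
    void provisionWebTier(int servers) {
        for (int i = 0; i < servers; i++) {
            api.launchInstance("m1.small");
        }
    }
}

public class WebTierProvisionerTest {
    @Test
    public void launchesOneInstancePerRequestedServer() {
        FakeComputeApi api = new FakeComputeApi();
        new WebTierProvisioner(api).provisionWebTier(3);
        assertEquals(3, api.launched.size());
        assertEquals("m1.small", api.launched.get(0));
    }
}
```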
Books such as the excellent Test-Driven Infrastructure with Chef by Stephen Nelson-Smith take us in the right direction (along with Cucumber-Chef), and the OpenStack cloud API has promising support from a wide range of vendors, but as far as I can see many people are still writing automation code directly against the ‘real’ API endpoints, without the kind of test coverage and ability to refactor that we’ve come to expect thanks to unit tests at the class or module level.
In fact, taken further, the implication of the approach in Legacy Code is that the whole system is legacy unless it has test coverage developed using TDD; this includes the load-balancer, the enterprise service bus, the virtualisation platform, log aggregator, metrics server and the various data stores. Tests at these levels are clearly not unit tests in the same way as those at the class or module level in Java, C# or Ruby, but comprehensive test coverage using mock or fake versions of infrastructure components is going to be essential if we are to evolve automation code effectively. Cucumber-Nagios is an example of a good tool in this space: specifications for infrastructure behaviour in natural language, automatable via Nagios.
The techniques and concepts explained in Legacy Code for decoupling and refactoring untested code, such as Extract Interface, Push Down Dependency, Seams & Enabling Points, and Sensing and Separation, can all be re-purposed for tackling component- and system-level integration and automation.
Summary
To work effectively with legacy code, we need to bring it under test, and Working Effectively with Legacy Code is an excellent handbook for doing exactly this. However, the book also points the way towards fully-tested deployments and infrastructure automation, areas which are (in 2012) still relatively new to many people. For Continuous Delivery and similar lean delivery practices to be fully effective, we need to find ways to place all of our code, including deployment scripts, Chef/Puppet code, and ‘glue logic’ for LogStash/Graphite/Syslog/etc. under test; Legacy Code shows us the mindset we need to achieve this.