Build screens (or build monitors, or information radiators) are an important tool in helping to achieve Continuous Integration and in trapping errors early. When the number of build jobs becomes large, it can be tempting to hide ‘successful’ jobs to save space, but we found this to cause problems. I realised that people need to know the context for the red jobs if they are to take prompt action to fix failing builds, so it’s important to represent the full state of all builds by showing green jobs too.
Update: following a conversation on Twitter and a contrasting blog post on Build Cow by @EdMcBain, it’s clear that some context around the size of our software teams and build estate is needed to help clarify why developers ‘ignored red’. I have added a footnote with some details.
There are several open source tools for displaying build information on large ‘radiator’ screens, such as Cradiator, BuildScreen, XFD, Jenkins Radiator View, etc. For example, BuildScreen (from @martinbuberl) displays build status information like this:
A good amount of information is conveyed here: the name of the job (e.g. Playground), the name of the source code branch (e.g. Trunk), the build counter, and even the number of unit tests executed.
Cradiator for GO Build Information
At thetrainline.com we use ThoughtWorks GO for controlling build and deployment, and one of the best radiator tools for displaying build information from GO is Cradiator. Using the “Stack” skin with five build jobs, you can get a view something like this:
However, when the number of build & deploy jobs is larger than about 20, the display starts to become cluttered, with the text size very small. We had over 60 jobs on our main GO server, so the screen looked something like this:
In order to de-clutter the screen and allow the text to be readable again, the green (passing) jobs were hidden:
This looked much cleaner, and so we initially thought it must be an improvement. In fact, this was worse than before.
Removing Context for Information is Psychologically Bad
It turned out that hiding green jobs induced bad behaviour; more and more developers began to ignore the red jobs, shown clearly on the radiators only a few metres away from them. Why could this be? In their book Continuous Delivery, Jez Humble and David Farley talk about information radiators and note (p.137) that:
What you choose to measure will have an enormous influence on the behavior of your team (this is known as the Hawthorne effect).
In our case, we were showing only the failing jobs and hiding the true ‘state of the world’ by not showing the green jobs. It seems that this had a negative psychological effect on people, leading them to ignore what should have been strong warning signals. I suspect this was analogous to (or possibly even due to) sensory adaptation, a neurological effect in which the nervous system’s response to a strong stimulus diminishes when that stimulus is maintained over a long period of time, allowing the brain to ‘acclimatize’ to it.
Because we had hidden the green jobs, the radiator screens only ever showed red; there was no way for people to know whether four ‘red’ jobs represented a good state of affairs or a bad one. Out of 100 jobs, four red jobs doesn’t seem too bad (although we’d want to fix them immediately), but out of six jobs, four red seems pretty poor. In the absence of contextual information about the ‘red’ jobs, developers had no way of evaluating how ‘important’ the ‘red’ signal was; the screens were always red, so five red jobs or 25 red jobs appeared roughly the same, visually speaking.
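To make that arithmetic concrete, here is a minimal Python sketch. The function name `failing_share` is hypothetical and is not part of Cradiator or GO; it simply expresses the red count as a share of all jobs, which is exactly the number a red-only screen cannot show.

```python
def failing_share(red: int, total: int) -> float:
    """Return the percentage of jobs that are currently failing."""
    if total <= 0:
        raise ValueError("total must be a positive number of jobs")
    if not 0 <= red <= total:
        raise ValueError("red must be between 0 and total")
    return 100.0 * red / total


# The same four red jobs, seen against two different totals.
print(f"{failing_share(4, 100):.0f}% failing")  # 4 red out of 100 jobs -> 4% failing
print(f"{failing_share(4, 6):.0f}% failing")    # 4 red out of 6 jobs   -> 67% failing
```

Without the total, both cases look identical on the screen: four red tiles.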
I See Trees of Green
We needed a way to give some context to the red jobs shown on the screens, that is, some green! I toyed with two designs; the first one showed the name of each green job inside a small tile at the bottom of the screen:
For around 20 jobs, this design would probably have worked well, but we had at least 60 jobs to represent, so I thought that a ‘stack’ of coloured slices, each coding for a single job, would scale better:
In the end, the developers who coded the changes to Cradiator took it one step further and included the ratio and percentage directly in a bar chart at the foot of the screen, which I really like:
This provides the context missing from the earlier views; we can now see the details of the failing jobs, along with an idea of how well we’re succeeding overall. With the green jobs restored, the radiator screens carry more contextual information and look less ‘angry’, and the sensory adaptation effect was reduced.
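The idea behind a ratio-and-percentage bar can be sketched in a few lines of Python. This is not the Cradiator implementation; the function name `summary_bar`, the `"green"`/`"red"` status strings and the text-only rendering are all assumptions made for illustration.

```python
def summary_bar(statuses: list[str], width: int = 40) -> str:
    """Render a one-line text bar showing passing jobs as a ratio and a percentage.

    `statuses` holds one entry per job; any value other than "green"
    is counted as not passing.
    """
    total = len(statuses)
    if total == 0:
        return "no jobs"
    green = sum(1 for status in statuses if status == "green")
    filled = round(width * green / total)   # bar cells representing passing jobs
    bar = "#" * filled + "-" * (width - filled)
    percent = 100 * green / total
    return f"[{bar}] {green}/{total} passing ({percent:.0f}%)"


# Hypothetical estate: 60 jobs, four of them failing.
jobs = ["green"] * 56 + ["red"] * 4
print(summary_bar(jobs))
```

The red jobs would still be listed in full above a bar like this; the bar only adds the denominator, so four failures read as four out of sixty rather than as a screen that is entirely red.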
In Closing
I’m currently working on getting our changes to the Cradiator code pushed back to GitHub, and I plan to include more details in a post on the Engineering blog at thetrainline.com.
Have you had similar experiences with build & deployment radiator screens? I’d love to read of alternative approaches.
Update – some context on the number of build pipelines
Here are the slides for a talk I gave at WebPerfDays 2012, which include details of the size of the build estate (“more servers than in Production” – slide 10):
We have around 250 build agents covering perhaps 150 build pipelines, along with 7 dev teams in 2 time zones, which I think explains (but does not excuse) the behaviour we saw when the green pipelines were hidden; one of the things I am working on is how to make the build information more relevant to each team, so that teams no longer ‘ignore red’.
Hi, I like your percentage representation a lot, and will use it at work if you could please post the code?
Hi Rafalzki, thanks for your comments. I’d love to get the source code released, and I will post an update here when it’s on Github as a pull request (https://github.com/PandaWood/Cradiator).