Leaving the Platform – Branching and Releasing for Independent Subsystems

Matthew Skelton (@matthewskelton)'s avatar

For several years, much of the code for the systems at thetrainline.com has been versioned and deployed together as a single ‘platform’. Recently, we have begun to divide up the platform into smaller chunks, to enable us to deliver some parts more frequently and rapidly, leaving other parts to evolve more slowly (as needed). Moving from a single version number for all subsystems to multiple version numbers for independent subsystems has implications for how code is built and released; this blog post outlines some of the work we have done so far in this area.

My colleague Owain Perry and I recently presented on this topic at the London Continuous Delivery meetup group (http://londoncd.org.uk/) and the slides we showed relate to the details in this post:


Roundup: Patterns for Performance and Operability

I recently posted a review of Patterns for Performance and Operability by Ford et al. on the SoftwareOperability website. I think that this book is exceptionally useful in its treatment of both performance and operability, and anyone who cares about how well software works in Production should buy and read a copy (there are paper and eBook editions).

Two other reviews might be useful too: my colleague Anant East (Head of Architecture and Infrastructure, thetrainline.com) wrote up a detailed review of Patterns for Performance and Operability on the tech blog at thetrainline.com, and I posted a short review on Amazon.

Continuous Delivery Workshop with Neal Ford (@neal4d) – a Retrospective

I attended a workshop at DevWeek 2012 led by Neal Ford (@neal4d) on Continuous Delivery (CD). The day was excellent – Neal is a really engaging presenter – and I took copious notes, even though I’d already read most of the CD book. Fifteen months later, I thought it would be interesting to see how my notes from Neal’s workshop compared with my experience of Continuous Delivery, both within my job at thetrainline.com, and also in conversations with other people, particularly the good folks in the London Continuous Delivery meetup group.

The tl;dr version: go attend one of Neal’s excellent CD workshops, but be prepared for the challenges with Continuous Delivery to be much more social/organisational than technical.


Chef on Windows – detecting and fixing WMI problems which prevent chef-client runs

This post covers some issues we had recently with chef-client on Windows due to missing WMI classes, and how we diagnosed, fixed, and mitigated the problem.


At thetrainline.com we use Opscode Chef for managing our build infrastructure. Like many other tools running on Windows, the chef-client/ohai framework relies on WMI for extracting information about the server machine on which scripts are being run. We found that Windows WMI repository corruption can cause chef-client runs to fail due to missing WMI classes, which leaves the node out of policy. The WMI repository can be repaired using winmgmt /salvagerepository, and WMI errors can be monitored using the WMIDiag script so that repository corruption raises an alert before future chef-client runs fail. This post details how we detected and fixed the problem, and how to monitor for WMI repository corruption.
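As a rough illustration of the verify-then-salvage idea (not the scripts we actually used), here is a minimal Python sketch. It assumes winmgmt is on the PATH of a Windows machine and that the output of winmgmt /verifyrepository contains the word "consistent" when healthy and "inconsistent" when corrupt; the function names and return values are hypothetical.

```python
import subprocess
from typing import Callable


def run_winmgmt(flag: str) -> str:
    """Run winmgmt with a single flag and return its text output (Windows only)."""
    result = subprocess.run(
        ["winmgmt", flag], capture_output=True, text=True, check=False
    )
    return (result.stdout or "") + (result.stderr or "")


def repository_is_consistent(output: str) -> bool:
    # Assumption: the wording of the output is "... is consistent" when
    # healthy and "... is inconsistent" otherwise; adjust for your system.
    text = output.lower()
    return "consistent" in text and "inconsistent" not in text


def check_and_repair(run: Callable[[str], str] = run_winmgmt) -> str:
    """Verify the WMI repository, attempt a salvage if needed, then re-verify."""
    if repository_is_consistent(run("/verifyrepository")):
        return "ok"
    run("/salvagerepository")
    if repository_is_consistent(run("/verifyrepository")):
        return "repaired"
    return "still-corrupt"
```

Passing the command runner as a parameter keeps the decision logic testable on a machine without WMI; a scheduled job could call check_and_repair() before each chef-client run and raise an alert on anything other than "ok".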


Summary of DevOps Summit London 2013

Here is a video interview I gave summarizing some of the main themes and discussions which came out of the DevOps Summit on May 23rd in London: a shared goal for development and operations; anti-fragility as a way for DevOps and ITIL to complement each other; surface the quality of the software for all to see; tools can produce a step change in behavior for the better; and use the theory of constraints to focus improvements.
