I attended a workshop at DevWeek 2012 led by Neal Ford (@neal4d) on Continuous Delivery (CD). The day was excellent – Neal is a really engaging presenter – and I took copious notes, even though I’d already read most of the CD book. Fifteen months later, I thought it would be interesting to see how my notes from Neal’s workshop compared with my experience of Continuous Delivery, both within my job at thetrainline.com, and also in conversations with other people, particularly the good folks in the London Continuous Delivery meetup group.
The tl;dr version: go attend one of Neal’s excellent CD workshops, but be prepared for the challenges with Continuous Delivery to be much more social/organisational than technical.
I have arranged this post into sections corresponding with the notes I took, which follow the order of the workshop. In each section, I list Neal’s recommendations and advice first, then comment on specific aspects. The themes were:
- Infrastructure-as-Code
- The Deployment Pipeline
- Testing Strategy
- Source Code Management
- Test-Driven Development (TDD)
- Teams
- Components
- Releases
- Databases
- Environments and Infrastructure
- Common Objections to Continuous Delivery
Introduction to Continuous Delivery
- Do not use arbitrary release dates; instead, give the business the power to release when features are ready.
- See software systems as the whole stack, including databases and infrastructure, not just the software.
- Use Puppet or Chef for building machine images – this frees up Ops people for embedding in software teams, where they can deliver the most value.
- More rapid releases allow you to adjust your product strategy more rapidly; users can guide product development. Example of Flickr – started as an online game, but product team noticed that users were making use of the advanced photo upload feature for sharing photos. Because the Flickr release cycle was short, they could respond rapidly to user feedback (explicit and implicit (via monitoring)) and innovate accordingly.
- Users dislike big changes, so keep the changes small!
- How long does it take you to get a one-line code change into Production? – Mary Poppendieck
- The ‘Agile Manifesto’ actually contains the phrase Continuous Delivery: “…early and continuous delivery of valuable software.”
- Definitions: Lead Time – from idea to working in Production. Cycle Time – from starting work on an idea to working in Production.
- Ideals of CD: Software is always production-ready; deployments are reliable and repeatable; anyone/everyone can self-service deployments; releases happen according to business needs.
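The difference between Lead Time and Cycle Time is easiest to see with numbers. This is a minimal sketch only: the function names and the three timestamps are made up for illustration and are not from the workshop.

```python
from datetime import datetime

def lead_time(idea_raised, live_in_production):
    """Lead Time: from the idea being raised to working in Production."""
    return live_in_production - idea_raised

def cycle_time(work_started, live_in_production):
    """Cycle Time: from starting work on the idea to working in Production."""
    return live_in_production - work_started

# Hypothetical timestamps for a single feature.
idea_raised = datetime(2012, 3, 1)
work_started = datetime(2012, 3, 20)
live = datetime(2012, 4, 3)

print(lead_time(idea_raised, live).days)    # 33
print(cycle_time(work_started, live).days)  # 14
```

The gap between the two figures (19 days here) is time the idea spent waiting before anyone started work on it.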
Infrastructure-as-Code
- Any change in {Code, Infrastructure, Configuration} should trigger the creation of a Production-ready deployment package.
- Treat computer images as ‘compiled output’ (just as we do software).
- The definition of ‘Done’ == ‘Released to Production and working’
Deployment Pipeline
- Value stream mapping – visualise the release process.
- The deployment pipeline – communicating a huge amount of information
Testing Strategy
- Build servers from bare metal each time, using Chef or Puppet.
- Puppet for Windows is now available [March 2012]. No excuse now for not building from source, whether running on Linux/Mac or Windows.
- Know the purpose of your automated tests and keep them separate. cf. Brian Marick’s work
- Look for opportunities to pair up Devs with Ops and QA people.
- Aim to build vertical slices of software rather than horizontal layers
- Acceptance criteria need to be defined as a computer-parsable script, e.g. Cucumber and other DSLs for Behaviour-Driven Development (BDD) like RSpec, MSpec, NSpec?
- The QA person/role becomes a key arbitrator of quality and aspects of design right the way through the project.
- Books: Specification by Example; Growing Object-Oriented Software, Guided by Tests (‘GOOS’);
Source Code Management
- “Maintain a single source repository”
- Branches: feature branches BAD. Release branches OKAY (if they are short-lived). Experimental branches GOOD.
- Commit to trunk/master every day: how to avoid extra stuff getting out too early? RELEASE WHEN FEATURES ARE READY!!! Can also use feature toggles – remove the toggles when the feature is released. Also use branch by abstraction – programming to interfaces.
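Feature toggles and branch by abstraction work well together. The following is a rough sketch rather than anything shown in the workshop: the class names, the toggle name `new_discounts` and the discount rule are all hypothetical. The unfinished implementation sits on trunk behind an interface, and the toggle decides which implementation callers get.

```python
from abc import ABC, abstractmethod

class PriceCalculator(ABC):
    """The abstraction that callers program against."""
    @abstractmethod
    def total(self, amounts):
        ...

class LegacyCalculator(PriceCalculator):
    """Existing behaviour, already released."""
    def total(self, amounts):
        return sum(amounts)

class DiscountCalculator(PriceCalculator):
    """New behaviour, committed to trunk but not yet switched on."""
    def total(self, amounts):
        subtotal = sum(amounts)
        # Hypothetical rule: 10% off (integer pence) for orders over 100.
        return subtotal - subtotal // 10 if subtotal > 100 else subtotal

# Hypothetical toggle store; in practice this might come from configuration.
TOGGLES = {"new_discounts": False}

def make_calculator(toggles=TOGGLES):
    """Pick the implementation based on the feature toggle."""
    if toggles.get("new_discounts", False):
        return DiscountCalculator()
    return LegacyCalculator()

print(make_calculator().total([60, 60]))                         # 120
print(make_calculator({"new_discounts": True}).total([60, 60]))  # 108
```

Once the feature is released, the toggle and `LegacyCalculator` can be deleted, leaving only the abstraction and the new implementation.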
TDD
- The cyclomatic complexity of TDD code is usually much lower than that of non-TDD code. TDD code is better quality and more maintainable. TDD can be Test-Driven Design, i.e. it is more than just a way to write code; it is also a way to design parts of the system more effectively.
- Neal Ford recommends testing private methods as well as public methods.
- TDD should not be done in order to establish business requirements, but instead as a technical engineering practice. If you’re using TDD to help establish business requirements, then something else is missing.
- The value of TDD is backed up by studies ‘in the wild’ of teams at Microsoft by Laurie Williams at NCSU – writing more code allows you to go faster and maintain a higher velocity than with non-TDD code.
Teams
- Conway’s Law is very powerful.
- 2-pizza team – how many people can you feed with 2 pizzas? 8 or 10. Teams should be no bigger than this.
- Each team can be responsible for several components, but should be working on only one at a time. Limit WIP.
Components
- Directed Acyclic Graph – if your subsystem dependencies do not fall into this pattern, then you need to revisit the relationships.
- Use tools such as NDepend or CDA to analyse dependencies.
- CI for components: each component has its own build and pipeline.
- Static vs Fluid vs Dynamic dependencies – guarded dependencies for when newer versions do not work.
- “Manual steps in software systems is like ‘building a snowflake'”
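Checking that component dependencies form a Directed Acyclic Graph is easy to automate. Tools like NDepend do this against real code; the sketch below only illustrates the idea using Python's standard `graphlib` module (Python 3.9+), with made-up component names and a hypothetical `build_order` helper.

```python
from graphlib import TopologicalSorter, CycleError

def build_order(dependencies):
    """Return a valid build order for the components.

    `dependencies` maps each component to the set of components it
    depends on. Raises ValueError if the graph contains a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # The detected cycle is the second element of the exception args.
        raise ValueError(f"dependency cycle: {err.args[1]}") from err

# Hypothetical components: web depends on api, api depends on db.
components = {"web": {"api"}, "api": {"db"}, "db": set()}
print(build_order(components))  # ['db', 'api', 'web']
```

A check like this could run early in each component's pipeline so that a newly introduced cycle fails the build straight away.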
Releases
- “Just another deployment”
- Blue-Green: testing in the production environment
- Canary releases: direct a small number of “canary” users to a different version. If the ‘canary’ keeps ‘singing’ then you can roll more users onto the new feature.
- Dark Launch features ahead of when they will be visible. Very suitable for infra and DB changes.
- Emergency Fixes are no different from normal releases.
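One common way to pick the “canary” users is to hash each user ID into a stable bucket and send the lowest buckets to the new version. This is a sketch under that assumption, not the mechanism Neal described; the function names and the percentage parameter are hypothetical.

```python
import hashlib

def bucket(user_id):
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100

def version_for(user_id, canary_percent):
    """Route roughly `canary_percent`% of users to the canary version."""
    return "canary" if bucket(user_id) < canary_percent else "stable"

# Hypothetical rollout: start at 5%, widen if the canary keeps 'singing'.
for percent in (5, 20, 100):
    routed = sum(version_for(f"user-{n}", percent) == "canary" for n in range(1000))
    print(percent, routed)
```

Because the bucket depends only on the user ID, a user who is on the canary at 5% stays on it as the percentage is raised, and setting the percentage back to 0 returns everyone to the stable version.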
Databases
- Databases are often fossilised – too ‘precious’ to be interfered with.
- The ‘expand/contract’ pattern decouples db migration from app migration (upgrade)
- dbdeploy – really just a name for a good practice – db updates as code (small deltas). Crucially, metadata about the db version (e.g. which scripts have been run) is in the db itself. Tools such as Liquibase allow vendor-independence.
- To achieve CI for db changes, use a baseline snapshot, and apply deltas serially. If a delta fails, stop the build (FAIL FAST). db delta scripts are stored in source code control alongside the related other code, so they are available to the CI system.
- The key thing here is that db changes (as scripts) are verified as part of the CI build.
- A good delta is 1 or 2 lines of SQL
- The expand/contract pattern allows decoupling of db changes from application changes. Version 34 of the database might support versions 5, 6 and 7 of the application. Extra columns/tables/functions are accumulated (expansion) and then after a certain application release, support for 5 & 6 is removed (contraction).
- Often, changes to the database can be ‘dark launched’ before any application code uses the features in production.
- See Refactoring Databases by Scott W Ambler
- Make use of DEPRECATION_COMMENT data type (or equivalent) when planning db changes so that warnings show up in db logs.
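The dbdeploy practice (small numbered deltas, applied serially, with the applied versions recorded in the database itself) can be sketched in a few lines. This is not dbdeploy's actual implementation or schema: the `changelog` table, the `migrate` function and the example deltas are assumptions for illustration, using an in-memory SQLite database.

```python
import sqlite3

# Hypothetical deltas; in practice each would be a script in source control.
DELTAS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),  # an 'expand' step
]

def applied_versions(conn):
    """Read which deltas have already run from the db's own metadata table."""
    conn.execute("CREATE TABLE IF NOT EXISTS changelog (version INTEGER PRIMARY KEY)")
    return {row[0] for row in conn.execute("SELECT version FROM changelog")}

def migrate(conn, deltas):
    """Apply outstanding deltas in version order; any error stops the run."""
    done = applied_versions(conn)
    applied = []
    for version, sql in sorted(deltas):
        if version in done:
            continue
        conn.execute(sql)  # raises on failure, so the build fails fast
        conn.execute("INSERT INTO changelog (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied

conn = sqlite3.connect(":memory:")  # stands in for the baseline snapshot
print(migrate(conn, DELTAS))  # [1, 2]
print(migrate(conn, DELTAS))  # []
```

Running `migrate` as a CI step means a broken delta raises an exception and fails the build, and running it twice is harmless because already-applied versions are skipped.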
Environments and Infrastructure
- Ops folk are tired of the pain of fire-fighting crappy apps ‘thrown over the wall’
- Adopt DevOps features: devs to carry pagers; ops present in Inceptions, Showcases and Retrospectives.
- See Release It! by Michael Nygard from Pragmatic Programmer series
- “Destroy works of art” – throw away things which cannot be built again exactly (from source).
- Monitoring scripts are part of the ‘system’ so should be part of the deployment pipeline.
- ANY change to the system should go via the pipeline – technology which does not fit this requirement should be dropped. This is a big change!
- Visibility of non-co-located team members: use cheap web-cams to produce a sense of co-location! Acts as a virtual window into the other office; can also be used within the same building, if this is large.
- Cross-functional teams + ‘guerilla teams’ which can move from project to project, helping with specific technologies (pipelines, Puppet, …) i.e. ‘enabling’ work.
- Key things for CD process: VISIBILITY + viral (JFDI)
- What is the definition of ‘done’? (Should be “In Production and being used”).
- Are you doing proper iterative development? Iterations should be 4 weeks max, ideally 2 weeks.
- Are you doing proper Scrum? Project Managers should ONLY: Deflect distractions (and provide nice snacks)
Objections to Continuous Delivery
- ITIL/SOX etc. do not naturally sit well with CD. However, CD can actually make things MORE compliant and serviceable:
- Visibility and control over locking down
- Automation over documentation
- Auditing
- Make it easy to back out changes: by automating processes.
What’s Next?
Neal’s workshop was hugely useful in cramming in a great deal of information, experience and advice into a short space of time. What neither Neal nor the Humble & Farley CD book really mention(ed) is the extent to which the technical challenges listed above, although non-trivial, are actually much less of an obstacle to Continuous Delivery than departmental politics, legacy procedures and thinking, and Capex-Opex battles, all of which are really social-organisational problems.
Steve Smith (@AgileSteveSmith) has given an excellent talk on the social-organisational challenges of introducing Continuous Delivery (first at London Continuous Delivery meetup group, and then at QCon New York) – I really recommend this talk to anyone who wants a flavour of the social challenges to CD!
Other sessions we’ve had at London Continuous Delivery meetup group which touch on the team/organisational aspects of CD include Team Transformation for CD at 7digital with Chris O’Dell (@ChrisAnnODell) and James Betteley (@jamesbetteley) on Measuring Progress with CD (kudos to SkillsMatter for the meeting space).
The team at Nokia Entertainment has had some notable success with Continuous Delivery and DevOps. Their agile team lead John Clapham (@johnC_bristol) has helped to transform their software delivery from painful, infrequent releases to pain-free 30-minute deployments. Go and hear John speak about the team transformation at Nokia if you can.
Finally, buy the Continuous Delivery book, attend a ThoughtWorks CD workshop (Paul Stack @stack72 also runs an excellent CI and CD workshop), and (if you’re close to London) come to the #londoncd meetup group 🙂
Devs have lives too that need to be respected and taken into consideration when determining who is “on-call” when responding to P1 Incidents. Being expected to work long nights and weekends on a fairly regular basis is not sustainable and will result in rapid burnout.
I agree. The goal is to give Devs enough first-hand experience of incidents to provide the context they need for building more operable software. Poorly-implemented on-call rotas are a disaster for morale and defeat the purpose of having Devs on call, which is to improve the code.
Hi Matthew,
Nice write-up with lots of resources to dig in to. I’m already looking forward to your blog post on capturing runtime dependencies!
Niek.
Thanks Niek. I was talking to some more of the #londoncd folks last night about using ‘ghost packages’ for capturing runtime dependencies – I’ll try to get my thoughts in blog form some time in September! M