Tune logging levels in Production without recompiling code

This article first appeared in Software Development Practice, Issue 1, published by IAP (ISSN 2050-1455).

Abstract

When raising log events in code, it can be difficult to choose a severity level (such as Error or Warning) that will be appropriate for Production; moreover, the severity of an event type may need to change after deployment, based on experience of running the application. Different environments, such as Development (Dev), User Acceptance Testing (UAT), Non-Functional Testing (NFT), and Production, may also require different severity levels for testing purposes. We do not want to recompile an application just to change log severity levels; therefore, the severity level of every event should be configurable for each application or component and decoupled from the event-raising code, so that we can tune the severity without recompiling.

A simple way to achieve this power and flexibility is to define a set of known event IDs by using a sparse enumeration (enum in C#, Java, and C++), combined with event-ID-to-severity mappings contained in application configuration, allowing the event to be logged with the appropriate configured severity, and for the severity to be changed easily after deployment.
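The idea can be sketched in Java as follows. This is only a minimal illustration, not the implementation from the full article: the class name ConfigurableSeverity, the event names and numeric IDs, and the "severity.<id>" configuration key format are all assumptions made for the example. The event-raising code refers only to an event ID; the severity is looked up from configuration at run time.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: all names, IDs, and key formats here are illustrative.
class ConfigurableSeverity {

    /** Severity levels (illustrative set). */
    enum Severity { DEBUG, INFO, WARNING, ERROR }

    /** Sparse enumeration: gaps between IDs leave room for future events. */
    enum EventId {
        DATABASE_CONNECTION_FAILED(1000),
        DATABASE_QUERY_SLOW(1010),
        CACHE_MISS(2000),
        PAYMENT_DECLINED(3000);

        final int id;

        EventId(int id) { this.id = id; }
    }

    private final Map<EventId, Severity> severities = new EnumMap<>(EventId.class);
    private final Severity defaultSeverity;

    /**
     * Reads event-ID-to-severity mappings from configuration.
     * Assumed key format: "severity.<numeric id>", e.g. severity.1000=ERROR.
     */
    ConfigurableSeverity(Properties config, Severity defaultSeverity) {
        this.defaultSeverity = defaultSeverity;
        for (EventId event : EventId.values()) {
            String configured = config.getProperty("severity." + event.id);
            if (configured != null) {
                String name = configured.trim().toUpperCase(Locale.ROOT);
                severities.put(event, Severity.valueOf(name));
            }
        }
    }

    /** Returns the configured severity, or the default if none is configured. */
    Severity severityOf(EventId event) {
        return severities.getOrDefault(event, defaultSeverity);
    }

    /** Formats a log line; the calling code never names a severity itself. */
    String format(EventId event, String message) {
        return severityOf(event) + " [" + event.id + " " + event + "] " + message;
    }
}
```

With this shape, changing `severity.2000` from DEBUG to WARNING in the configuration for one environment changes how CACHE_MISS is logged there, with no change to the code that raises the event. In a real application the Properties object would be loaded from the deployed configuration file rather than built in code.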


Fault tolerance, anomaly detection, and anticipation patterns by John Allspaw at QConLondon 2012

John Allspaw (@allspaw) from Etsy talked about the role that Anomaly Detection, Fault Tolerance and Anticipation play in producing highly scalable software systems (Fault tolerance, anomaly detection, and anticipation patterns, slides [PDF, 5MB]).

As head of technical operations at Etsy, whose web traffic is pretty substantial, John focused on resilience in software systems: what it is, and how to achieve it.



Site Reliability at Scale – Discussion Roundup

There have been several useful discussion threads on the LinkedIn Site Reliability at Scale group (http://www.linkedin.com/groups?home=&gid=4200099) recently:


UK Scale Camp 2010 – Braindump

I’ve just returned from UK Scale Camp 2010 (@scalecampuk), organised by The Guardian (and the indefatigable Michael Brunton-Spall). Here are some notes:

Overview

I liked the “unconference” format (no formal programme; attendees vote for their favourite sessions in advance), and ended up in four of the many sessions:

  • DevOps on Windows
  • Log Analysis for Search Results
  • DB Changes without Downtime
  • Handling Errors at Scale

You are invited to ScaleCamp 2010

Very pleased to receive this email today:

From: Michael Brunton-Spall
Sent: 19 November 2010 16:07
To: Matthew Skelton
Subject: You are invited to ScaleCamp 2010 – 10th December at the Guardian offices, London

Hey,

We are so pleased to be able to invite you to Scale Camp 2010 on the 10th December at the Guardian Offices here in London.

I’m looking forward to some great conversations and debate, particularly around DevOps and how that can contribute to scaling a software platform.