Using LogParser and Log Replay to Bootstrap Load Test Scripts

In a previous post I mentioned how Microsoft LogParser can be used to discover HTTP endpoints in legacy applications. Another great use for LogParser is to generate files to drive load test scripts.

[Image: Loaded truck, by jcurtis4082 on Flickr]

Load-Testing Legacy HTTP Applications

When working with legacy HTTP-based applications, we need all the help we can get from tools, as documentation is often scarce. If you’re faced with maintaining or upgrading a legacy application, one key measure of success is typically whether the upgraded application responds more quickly and sustains higher traffic volumes than the legacy version; to demonstrate this, we need to run some performance tests.

To ensure that the tests we run are representative of the application’s real usage pattern in Production, we can switch on detailed logging and then use Microsoft LogParser to discover the active endpoints. Armed with this log data, we can use LogParser to generate files ready for use as input to tools such as RadView WebLOAD or HP LoadRunner; in effect we are ‘replaying’ the event logs back to the application, hence ‘event log replay’.
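As a quick recap of the endpoint-discovery idea from the earlier post, a LogParser query along the following lines lists the most heavily used URIs in a log (the log file name is the same sample file used below):

C:\Program Files\Log Parser 2.2>Logparser -i:IISW3C 
 "SELECT cs-uri-stem, COUNT(*) AS Hits 
 FROM ex091219.log 
 GROUP BY cs-uri-stem 
 ORDER BY Hits DESC"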

Extracting Load Input with LogParser

Rather than using LogParser’s flexible templating options to create load scripts directly (possible, but more complicated), we’ll simply create plain-text data files which can drive the load scripts, using the TSV (tab-separated values) output option:

C:\Program Files\Log Parser 2.2>Logparser -q:ON -i:IISW3C -o:TSV 
 "SELECT LogRow, s-computername, cs-uri-stem, cs-uri-query 
 FROM ex091219.log 
 WHERE sc-status=200" > ex091219-quiet-query.txt

Let’s break down that query further:

  1. The -q option prevents LogParser from outputting summary information after the data, which keeps the data neatly in columns.
  2. The -i and -o options specify the input and output formats respectively; here, we’re processing standard IIS log files into tab-separated value format, suitable both for spreadsheets and for load test scripts.
  3. We then use a SQL-like SELECT statement to extract just four columns:
    1. LogRow: this helps us maintain the order of the original requests, which is often important for load test results to be valid
    2. s-computername: the hostname of the machine which logged the original request
    3. cs-uri-stem: the base URI of the request
    4. cs-uri-query: the query-string parameters present in the original request
  4. Finally, we filter by HTTP status code 200 (successful requests only), and redirect the results to a text file.

We can open up the resulting file directly as a spreadsheet for manual inspection:

[Screenshot: the extracted tab-separated data opened in Excel]

Using Parsed Log Data in Load Test Scripts

With the log data extracted, we can drive load test scripts from the tab-separated data file. Each load test framework will achieve this slightly differently; for instance, to read a single line from the text file within a WebLOAD script and then make an HTTP call, you’d write something like this:

// Read the next line of the tab-separated data file; the elements of
// arrayData correspond to the columns of the file (zero-indexed).
var arrayData = GetLine("C:\\data\\ex091219-quiet-query.txt")
var serverName = arrayData[1]   // s-computername
var uri = arrayData[2]          // cs-uri-stem
var query = arrayData[3]        // cs-uri-query
var fullUrl = "http://" + serverName + uri + "?" + query
wlHttp.Get(fullUrl)

This script snippet reads a single line from the data file and advances the read cursor in the file ready for the next call to GetLine(). The elements of the arrayData array correspond conveniently to the columns in the tab-delimited data file (zero-indexed), and so we can build an HTTP GET request very easily. With appropriate flow control and loop structures, the entire data file can be read in and used to drive the load test.
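For instance, a minimal loop might look like the sketch below. This assumes GetLine() returns an empty value once the file is exhausted; the exact end-of-file behaviour (and options such as wrapping back to the start of the file) should be checked against the WebLOAD documentation for your version:

// Sketch only: drive one GET request per line of the data file.
// Assumption: GetLine() returns null/empty at end of file - verify
// this against the WebLOAD documentation.
var arrayData
while ((arrayData = GetLine("C:\\data\\ex091219-quiet-query.txt")) != null) {
    var fullUrl = "http://" + arrayData[1] + arrayData[2] + "?" + arrayData[3]
    wlHttp.Get(fullUrl)
}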

The processing of web server log data into data files suitable for load test input can of course be automated via simple scripts.
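For example, a minimal Windows batch file along these lines could run the extraction on a schedule; the paths and log file name here are illustrative and would need adjusting for your environment:

@echo off
REM Illustrative sketch: extract load test input data from an IIS log.
REM Paths and file names are placeholders - adjust for your environment.
"C:\Program Files\Log Parser 2.2\Logparser.exe" -q:ON -i:IISW3C -o:TSV ^
  "SELECT LogRow, s-computername, cs-uri-stem, cs-uri-query FROM C:\logs\ex091219.log WHERE sc-status=200" ^
  > C:\data\ex091219-quiet-query.txt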

Limitations

This technique works well for HTTP GET (and similar verbs such as DELETE) which do not carry a data payload. Making it work for POST and PUT is trickier (web servers do not usually log payload values), but these can be simulated with similar techniques plus some hand-crafting in a spreadsheet, as sketched below.
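As a rough sketch only: if the payload values have been hand-assembled into extra columns of a data file, a POST might be replayed in WebLOAD along these lines. The data file name, column positions, and form field names below are invented for illustration, and wlHttp.FormData / wlHttp.Post should be checked against your WebLOAD version’s documentation:

// Rough sketch: replay a POST whose body values were hand-crafted into
// extra columns of the data file. Column positions and field names
// below are purely illustrative assumptions.
var arrayData = GetLine("C:\\data\\post-requests.txt")
var fullUrl = "http://" + arrayData[1] + arrayData[2]
wlHttp.FormData["username"] = arrayData[4]  // hypothetical form field
wlHttp.FormData["comment"] = arrayData[5]   // hypothetical form field
wlHttp.Post(fullUrl)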

Summary

By collecting HTTP logs during real-world use of an application, and then using Microsoft LogParser to process those logs into tab-delimited text data files, we can easily create load test scenarios which mimic real use: the same request data drives the application under load. This technique, called ‘event log replay’, provides some confidence that the application is being exercised in a lifelike way, and can be automated for more repeatable QA processes.
