Endpoint Discovery using Microsoft LogParser

A few years ago, I was working on a project for a UK client who needed to replace or rewrite a legacy inventory management web application written in classic ASP. The problem: no documentation, and complicated spaghetti source code with many apparently duplicate or redundant ASP files and ASMX web service endpoints.

Finger pointing by bhollar (Flickr)

Which ASP pages were actually in use? We had to find a way to limit our application migration efforts to only those pages which were used by the application. A colleague at the time introduced me to what must be one of the best-kept secrets in the Windows developer world: Microsoft Log Parser.

Using LogParser with IIS Logs

Log Parser is described by Microsoft as "… a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files". Think of it as AWStats for almost any data source, without the Perl headache, albeit without the nice HTML reports out of the box.

By tailoring the log file queries, LogParser can be used not only to determine which endpoints are in use within a particular application, but also as the basis for functional or load test scripts (I will cover this scenario in a separate post).

Details

In order to determine the endpoints in use by the application (typically ASP files, with some ASMX web service endpoints[1]), we configured the IIS web server in our test environment to log extended information about the incoming requests:

[Screenshot: IIS Extended Logging Properties]

We then ran a series of user journeys through the application, covering all the steps which seemed possible, based on what we understood of the application. This generated several hundred megabytes of IIS logs.

We then used LogParser to extract the details (here, in tab-separated format) for those requests which resulted in an HTTP 200 response:

$ LogParser -i:IISW3C -o:TSV "SELECT * FROM ex090715.log WHERE sc-status=200" > ex090715.txt

We could then load the results into Excel for manual inspection:

[Screenshot: LogParser results loaded into Excel]
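Manual inspection could also be complemented by a short script that summarises the export. The following is a minimal Python sketch, not something we used on the project: it assumes the tab-separated export starts with a header row containing a cs-uri-stem column, and the function name count_endpoint_hits and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def count_endpoint_hits(lines, column="cs-uri-stem"):
    """Count requests per URI stem in tab-separated rows with a header line."""
    hits = Counter()
    for row in csv.DictReader(lines, delimiter="\t"):
        # IIS treats paths case-insensitively, so fold case before counting.
        stem = (row.get(column) or "").strip().lower()
        if stem:
            hits[stem] += 1
    return hits


# Invented sample rows standing in for the real export (not actual log data).
sample = io.StringIO(
    "cs-method\tcs-uri-stem\tsc-status\n"
    "GET\t/stock/List.asp\t200\n"
    "POST\t/services/Stock.asmx\t200\n"
    "GET\t/stock/list.asp\t200\n"
)

for stem, count in count_endpoint_hits(sample).most_common():
    print(f"{count}\t{stem}")
```

For a real export, an open file handle can be passed in place of the sample, for example one opened on ex090715.txt with newline="" and an encoding that matches how the file was written. The most frequently requested endpoints are listed first, which makes rarely used pages easy to spot at the bottom of the list.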

Satisfied that we had valid results from the LogParser query, we could then correlate the cs-uri-stem column data with the files on disk. Files with no corresponding entry in the cs-uri-stem column were candidates for removal, whilst files that did appear in the log data were clearly in active use by the application.
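This correlation step lends itself to scripting. The sketch below is one possible approach in Python rather than the method we actually used. It assumes the site root maps directly onto a single folder on disk (no virtual directories or URL rewriting) and that paths can be compared case-insensitively. The function name find_unrequested_files and its parameters are hypothetical.

```python
import os


def find_unrequested_files(web_root, requested_stems, extensions=(".asp", ".asmx")):
    """Return site-relative paths under web_root that never appear in the logs."""
    # Assumption: IIS paths are case-insensitive, so compare in lower case.
    requested = {stem.strip().lower() for stem in requested_stems}
    unrequested = []
    for folder, _subdirs, files in os.walk(web_root):
        for name in files:
            if not name.lower().endswith(extensions):
                continue  # only ASP pages and ASMX services are of interest
            relative = os.path.relpath(os.path.join(folder, name), web_root)
            # Express the disk path the way IIS logs it: leading slash, forward slashes.
            url_path = "/" + relative.replace(os.sep, "/").lower()
            if url_path not in requested:
                unrequested.append(url_path)
    return sorted(unrequested)
```

Here requested_stems can be any iterable of cs-uri-stem values, such as that column read from the tab-separated export. Anything the function returns is only a candidate for removal: a file may still be reached by a user journey that the test run did not cover, or be pulled in by a server-side include that never shows up in the request log.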

This was particularly useful for determining which ASMX web services were in use: the application made extensive use of hand-rolled AJAX (JavaScript + XML) to call service endpoints, so looking for standard Sys calls and the like was not possible.

The Result – HTTP Endpoints Identified

By using the Microsoft LogParser tool and detailed IIS logging, we were able to identify the vast majority of ASP and web service endpoints used by the application.

(We also devised a way to use LogParser to help generate load test scripts – I will cover this in a future post.)

This technique could be extended by processing IIS logs from all web and application servers, thereby identifying every active HTTP endpoint within the server estate, as a cheap and cheerful alternative to Splunk or Logstash.



[1] There were also dynamic SQL queries generated in client-side JavaScript, amongst other interesting techniques!
