Tuesday, May 29, 2012

Automated tests, Type II errors, and the Null Hypothesis

In Part 1 I described how I use automated tests to fire up Selenium, hit a webpage, and log errors to ELMAH via jQuery. But with an elaborate test harness scenario like this, I argue that the onus should be squarely on the JS side to prove itself. I'll show you what I mean.
"Extraordinary claims require extraordinary evidence" - Carl Sagan
Carl Sagan (Cosmos), Flickr photo by trackrecord. Carl Sagan was the man.

One of the advantages (as I see it) of co-opting ELMAH error logging into my automated tests is that errors end up where they should: in ELMAH. As in, in the "ELMAH_Error" table in your database, or whatever back-end storage option you choose. Because you are using ELMAH, right? (At this stage it probably looks as if I'm keyword spamming for "ELMAH". Oops, said ELMAH again.) Some might see this as noisy, an adulteration of the logging stream for your application's unhandled exceptions, but if I'm going to use my database to coordinate my automated test sessions, then the ELMAH table is the most natural place to put it. And if you don't want automated test cruft there, you can easily identify it and filter it out later. Or delete it at the end of the test. Or hire a cleaner if you're too busy.

Automated test and JS errors go to ELMAH to die. Might wanna fix up those Google+ errors!

Pass the parcel

So how does the error get in there in the first place? Well, as I describe in part 1, the session id gets generated during the test and slapped on the end of the URL of the Frankenpage:


The Services controller Test() method creates a ViewBag.SessionId from the modelbound querystring sessionId value. The Razor engine generates a meta tag
<meta name="sessionId" content="@ViewBag.SessionId">
which the jQuery can get:
var sessionId = $("meta[name=sessionId]").attr("content");
Of course, there are a few different ways you could do all this. Maybe a cookie. But I think this is the easiest.
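To make the hand-off concrete, here's a minimal sketch in plain JS of the two ends of the parcel: building the test URL with the session id, and reading it back out of the rendered meta tag. The function names are mine, not from the actual harness, and in the browser the read side is just the jQuery one-liner above.

```javascript
// Hypothetical helpers illustrating the session id hand-off.
// buildTestUrl: what the C# test does when it constructs the URL.
function buildTestUrl(baseUrl, sessionId) {
    return baseUrl + "?sessionId=" + encodeURIComponent(sessionId);
}

// readSessionId: what the page's JS does. In the browser this is
// $("meta[name=sessionId]").attr("content"); here it's simulated with
// the meta tag's raw content value.
function readSessionId(metaContent) {
    return metaContent || null;
}

var sessionId = "0f8fad5b-d9cb-469f-a165-70867728950e";
var url = buildTestUrl("http://localhost/Services/Test/Dev", sessionId);
// url === "http://localhost/Services/Test/Dev?sessionId=0f8fad5b-d9cb-469f-a165-70867728950e"
```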

The cons

A couple of times on outings with Tina, my wife, I've said something like "You go to that shop, I'll go to this one, and I'll text you if there's a change of plan, otherwise I'll give you a missed call, and we'll meet at such-and-such a place, and if that's closed, I'll text you where to meet." Too many moving parts. And we've come unstuck one or two times because of it. Phones and batteries don't always work. Same for (my) short-term memory.

This testing technique has, as it stands, too many moving parts. It relies on a few things working: that the JS file is correct, that the controller and Razor between them are serving up the test page with the meta tag containing the session id, and so on. To that end, you can at least assert that the page serves in the first place by adding an HttpClient check that the HTTP response code is 200, because if it doesn't respond with 200, I don't have a page.
void GivenIHaveAFrankenpageWithAllMyJavascriptServices()
{
    _sessionId = Guid.NewGuid().ToString();
    _servicesTestUrl = string.Format("http://localhost/Services/Test/{0}?sessionId={1}", environment, _sessionId);

    // serves OK in the first place?
    var client = new Microsoft.Http.HttpClient();
    var response = client.Get(_servicesTestUrl);
    Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}
Fine, the page serves OK, but there's still plenty that could have gone wrong. Maybe the JS file didn't load. Or it did, but some bug meant the testing never ran. So you get a false negative, a Type II error: something went wrong, but you didn't spot it, and everything looked like it worked. Even if the thing that went wrong had nothing to do with the actual case you're trying to test, in an automated test you need to know when the test itself is broken, or else it has no value whatsoever.
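To see how that false negative sneaks through, here's a toy simulation (all names hypothetical) of the original arrangement, where "no error in the table" is read as success:

```javascript
// Toy model of the original test: the JS logs an error to ELMAH only
// when it both runs and finds a bug; the test passes when the error
// table is empty.
function errorsLogged(jsFileLoaded, jsFoundBug) {
    var errors = [];
    if (jsFileLoaded && jsFoundBug) {
        errors.push("logged by JS");
    }
    return errors;
}

function oldTestPasses(jsFileLoaded, jsFoundBug) {
    return errorsLogged(jsFileLoaded, jsFoundBug).length === 0;
}

oldTestPasses(true, false);  // true  - genuinely fine
oldTestPasses(false, true);  // true  - JS never ran: Type II error, silent pass
```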

Switching the null hypothesis

Like I say in the first part of this two-part post, with JavaScript you want to test the crap out of it if it's critical to your app. So I need to shift the burden of proof to the JavaScript side. I need to reflect the fact that it is statistically more likely to fail than not, so it has to shoulder that burden. By engineering it so the initial arrangement of the test is now to create an ELMAH error, I am advancing a theory that the test will fail unless proven otherwise: the null hypothesis. This is completely different to the way I set up the test in the first blog post.

So, the null hypothesis (denoted H0 by statisticians, so now you know) is that the test will fail. Or rather, will stay failed, since I've changed the test to log an error as part of its setup, and the rest of the test must clear that error. This is because it is the simplest option: there are many ways for this flaky arrangement of test class/rigged-up HTML and Razor/JS/Web API and ELMAH to fail, and only one way for it to succeed. So I engineer the initial conditions to create an error, then challenge the rest of the test harness - the JS, HTML, ELMAH etc - to prove me wrong.
namespace ConnemaraComTests.Services
{
    public class BlogJavascript : Base
    {
        private string _servicesTestUrl, _sessionId;

        public void MyJavascriptServicesShouldBeWorkingOK()
        {
            this.Given(_ => GivenIHaveAFrankenpageWithAllMyJavascriptServices())
                    .And(_ => AndThereIsAnErrorTokenForThisTest())
                .When(_ => WhenIBrowseToThatPage())
                .Then(_ => ThenTheJavascriptShouldClearTheErrorToken())
                .Bddify();
        }

        void GivenIHaveAFrankenpageWithAllMyJavascriptServices()
        {
            _sessionId = Guid.NewGuid().ToString();
            _servicesTestUrl = string.Format("http://localhost/Services/Test/{0}?sessionId={1}", environment, _sessionId);
        }

        void AndThereIsAnErrorTokenForThisTest()
        {
            using (var context = new ConnemaraComContext())
            {
                var errorGuid = Guid.Parse(_sessionId);
                context.Errors.Add(new Error
                                       {
                                           Application = "/LM/W3SVC/1/ROOT",
                                           Host = "RAFTUS-PC",
                                           Message = _sessionId,
                                           Type = "Automated JS test",
                                           TimeUtc = DateTime.UtcNow,
                                           ErrorId = errorGuid
                                       });
                context.SaveChanges();

                Assert.IsTrue(context.Errors.Count(x => x.ErrorId == errorGuid) == 1);
            }
        }

        void WhenIBrowseToThatPage()
        {
            var driver = new FirefoxDriver();
            driver.Navigate().GoToUrl(_servicesTestUrl);

            // wait 5 seconds for the JS tests to run, then close the browser
            Thread.Sleep(5000);
            driver.Quit();
        }

        void ThenTheJavascriptShouldClearTheErrorToken()
        {
            var errorId = Guid.Parse(_sessionId);
            using (var context = new ConnemaraComContext())
            {
                Assert.IsFalse(context.Errors.Any(x => x.ErrorId == errorId));
            }
        }
    }
}

Now, if it doesn't get to the JS, it can't clear the error, and the automated test fails. Or, it makes it to the JS, but something happens there. Or in the Web API controller. There is no chance of a false positive, a Type I error. There's simply no way an accidental success can fall through the cracks.
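The switched arrangement, in the same toy form (again, the names are illustrative): the token is seeded up front, and only a complete, successful run deletes it.

```javascript
// Toy model of the switched null hypothesis: setup seeds an error
// token; the test passes only if the JS chain runs to completion and
// clears it.
function newTestPasses(jsFileLoaded, allServicesOk) {
    var errors = ["error-token"];        // Given: setup logs the token
    if (jsFileLoaded && allServicesOk) { // When: the page's JS runs clean
        errors = [];                     // ...it DELETEs the token
    }
    return errors.length === 0;          // Then: pass only if table is clear
}

newTestPasses(true, true);   // true  - the whole chain worked
newTestPasses(false, true);  // false - JS never ran: the failure is caught
```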

The JS file just runs the tests: it checks that the divs have loaded content from the public APIs they're supposed to, and then, if all is well, deletes the original error token.
window.onload = function () {
    ServicesLibrary.RunTests();
};

var ServicesLibrary = {

    ErrorMessage: "",

    RunTests: function () {
        var self = this;
        // wait for 5 seconds before trying anything
        setTimeout(function () {
            self.CheckContentHasLoaded({ key: "Twitter", elements: $("#Twitter span"), dataAttribute: "tweet-id" });
            // ... and other services besides Twitter

            // if at the end of all this there's still no error message
            if (self.ErrorMessage == "") {
                self.DeleteError();
            }
        }, 5000);
    },

    DeleteError: function () {
        var sessionId = $("meta[name=sessionId]").attr("content");
        $.ajax({
            type: 'DELETE',
            url: "/api/Errors/",
            data: { id: sessionId }
        });
    }
};
So easy a sea sponge could grok it. OK, so it's a little elaborate. But if there's any grokking to be done, let it be this: there's an automated test which sets up an initial failing condition which the JavaScript part of the test must clear. Because the null hypothesis must not stand.

Note the nice bddify report at the bottom of the test session window
