Friday, September 2, 2011

Testing Mobile Web Apps With WebDriver

Intro

We’ve been building mobile web apps using jQuery Mobile, with WebKit-based mobile browsers (Android and iPhone, basically) as the main target. We’re big fans of functional testing, so we spent some time figuring out how to get a good testing infrastructure set up for mobile app development. That’s what I’ll be sharing here.


The approach


Our team members have a lot of experience using Cucumber to write automated acceptance tests, and we’re big fans. In previous projects we were building native iOS apps and testing them using Cucumber and Frank. We wanted to bring over some of the testing infrastructure we already had set up, so we decided on a similar approach. We still use Cucumber to drive tests in the iOS simulator, but instead of talking to a native app using Frank we are talking to our mobile web app via Selenium 2’s remote WebDriver protocol. In order to do this we run a special iOS app provided as part of Selenium 2 called iWebDriver which hosts an embedded UIWebView and listens for WebDriver commands, in much the same way as a Frankified native app listens for Frankly commands. We launch the iWebDriver app using the same SimLauncher gem which we’ve previously used to launch our native app for pre-commit and CI testing. To make our lives easier we use Capybara, which provides a nice ruby DSL for driving web apps and some useful cucumber step definitions out of the box. We configure Capybara to use the selenium-webdriver gem to drive the iWebDriver-hosted web app via the WebDriver remote protocol.


Selenium 2 == WebDriver

Note that when I talk about Selenium here, I am not referring to the Selenium 1.0 system (Selenium RC, Selenium IDE and friends). We are using Selenium 2.0, which is essentially the WebDriver system plus some Selenium backwards compatibility stuff which isn’t used in this setup. We don’t run any kind of Selenium server, we simply run iWebDriver in the iOS simulator. iWebDriver exposes the WebDriver remote API, and our cucumber tests drive our mobile app directly within iWebDriver using that API.


WebDriver on mobile is still bleeding edge

On the whole we have been happy using WebDriver and the iOS simulator to drive our functional tests. However, the WebDriver infrastructure for mobile platforms does feel a little rough around the edges. We have rare cases where our CI test run will fail, apparently due to iWebDriver crashing. When we were first getting our app under test we saw what appeared to be an iWebDriver bug where it was overwriting the global _ variable that underscore.js uses. We worked around that by using underscore.js in no-conflict mode. We also had to add a small but hacky monkey-patch to the selenium-webdriver gem to work around a bug in the way CocoaHTTPServer (the embedded HTTP server that iWebDriver uses) handled a POST with an empty body. That bug has been fixed in more recent releases of CocoaHTTPServer, but frustratingly we got no response when we reported the issue and suggested an upgrade of the dependency to resolve the issue. UPDATE: Jari Bakken notes in the comments that a similar workaround to our monkey-patch has now been added to the selenium-webdriver gem. We also found that iWebDriver locks up when showing a javascript alert. We worked around this by injecting a stub version of window.alert() in our web page at the start of each test. This worked, but is obviously less than ideal.


We briefly experimented with using the Android WebDriver client, but it lacked support for CSS-based selectors at the time. That may have changed since. If you are just using Capybara this is not an issue since Capybara uses XPath under the hood. However we quickly found that our use of jQuery Mobile meant we needed to write a fair number of custom selectors, and our automation engineers had a preference for CSS. Ideally we would have run our test suite against both the iOS simulator and the Android emulator, but this lack of CSS support led us to test on iOS only. Given that both platforms are WebKit-based this was an acceptable tradeoff.


The last drawback worth mentioning around using the iOS simulator is that it can only be run under OS X, and it doesn’t seem possible to run multiple instances of the simulator at once. This means that parallelizing your tests will involve running multiple OS X host machines, either physical boxes (we have used Mac minis successfully) or virtual machines.


Recommendations

All in all I would still recommend a testing approach similar to what I’ve outlined. The extra safety-net that our suite of functional tests provided was just as valuable as in a non-mobile project. While they are not totally polished, iWebDriver and Cucumber are the best way I know of currently to build that safety net.


One thing we didn’t do which I would recommend is to also have your test suite execute against a desktop webkit-based browser, using ChromeDriver for example. Tests will run a lot quicker on that platform, giving you faster feedback. You will likely need to make minor modifications to the way your tests are written, but if you’re running tests in ChromeDriver from day one then you should be able to tackle any small issues as and when they arise. That said, I would absolutely make sure you are also running your full test suite against at least one mobile platform as frequently as you can - there will be issues in mobile webkit browsers that aren’t apparent on a desktop browser.

Saturday, July 2, 2011

Javascript Promises

One of the interesting aspects of Javascript development is its asynchronous, event-driven nature. Any operation which will take a significant amount of time (e.g. a network request) is non-blocking: the function you call returns before the operation is complete, rather than blocking until it finishes. To obtain the result of the operation you provide callbacks which are subsequently invoked once the operation completes, allowing you to move on to the next stage in your computation. This is a very powerful, efficient model that avoids the need for things like explicit threading and all the associated synchronization problems. However, it does make it a bit harder to implement a set of operations that need to happen one after another in sequence. Instead of writing:

 
var response = makeNetworkRequest(),
processedData = processResponse(response);
writeToDB( processedData );

you end up writing:

 
makeNetworkRequest( function(response){
  var processedData = processResponse(response);
  writeToDB( processedData, function(){
    // OK, I'm done now, next action goes here.
  });
});

This is the start of what is sometimes referred to as a callback pyramid - callbacks invoking subsequent operations passing in callbacks which are invoking subsequent operations, and on and on. This is particularly common in node.js code, because most server-side applications involve a fair number of IO-bound operations (service calls, DB calls, file system calls) which are all implemented asynchronously in node. Because this is a common issue there has been a rash of libraries to help mitigate it. See for example “How to Survive Asynchronous Programming in JavaScript” on InfoQ.

One approach to help cope with these issues in asynchronous systems is the Promise pattern. This has been floating around the comp-sci realm since the 70s, but has recently found popularity in the javascript community as it starts to build larger async systems. The basic idea is quite simple. When you call a function which performs some long-running operation, that function does not block and eventually return the result. Instead it returns immediately, and rather than passing back the result of the operation (which isn’t available yet) it passes back a promise. This promise is a sort of proxy, representing the future result of the operation. You then register a callback on the promise, which will be executed by the promise once the operation does complete and the result is available. Here’s the same example as I used before, implemented in a promise style:

 
var networkRequestPromise = makeNetworkRequest();
networkRequestPromise.then( function(response){
  var processedData = processResponse(response),
      dbPromise = writeToDB(processedData);
  dbPromise.then( function(){
    // OK, I'm done now, next action goes here.
  });
});

I created local promise variables here to show explicitly what’s happening. Usually you’d inline those, giving you:

 
makeNetworkRequest().then( function(response){
  var processedData = processResponse(response); 
  writeToDB(processedData).then( function(){
    // OK, I'm done now, next action goes here.
  });
});

For a simple example like this there’s really not much advantage over passing the callbacks as arguments as in the previous example. The advantages come once you need to compose asynchronous operations in complex ways. As an example, you can imagine a server-side app wanting to do something like: “make this network request and read this value from the DB in parallel, then perform some computation, then write to the DB and write a log to disk in parallel, then write a network response”. The beauty of a promise is that it is an encapsulated representation of an async operation in progress which can be returned from functions, passed to functions, stored in a queue, etc. That’s what allows the composition of async operations in a more abstract way, leading to a much more manageable codebase.

Sunday, May 8, 2011

retroactive quality metrics with git

The Background

For a recent project retrospective we wanted to chart some metrics over the course of the entire project. Things like number of unit tests, test coverage, how long builds took to run, number of failing tests, etc. Taken in isolation these metrics aren't incredibly exciting, but when you plot them over time and compare them against other metrics like team morale, story point velocity, open defects, etc., some interesting correlations can often emerge.

The Challenge

So, we wanted metrics for internal quality, but we actually hadn't done the best job at collecting those metrics, particularly at the start of the project. So under pressure to get ready for our project retro we decided that we'd have to pass on graphing the internal quality metrics.

After the retrospective - which was a very valuable exercise regardless - I decided to figure out a way to capture this kind of data retroactively. I reasoned that if we had a script which generated a metric for the codebase at an instant in time then we could easily leverage The Awesome Power Of Git™ to collect that same metric over an entire set of commits.

Find someone else who's solved 80% of your problem

I came across a script in Gary Bernhardt's dotfiles repo on github called run-command-on-git-revisions. Sounds promising, doesn't it? This script will take a given range of git commits and run whatever command you provide on each commit in the range one by one. Just what I needed.

Solve the remaining 20%

I just needed a few additional tweaks. First I created a little 5 line ruby script which ran some test coverage metrics on our codebase, outputting a single line of comma-separated data. Then I created a modified version of Gary's script. The modified script only outputted raw CSV data, and tacked on the timestamp of each git commit as it processed it. Once that was done all I had to do was kick off the script with the appropriate range of commits, asking it to run my metric reporter script on each commit. I took the output and pulled it into graphing software and now have a pretty graph of how our test code has trended over time.
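To give a feel for the kind of metric reporter described above, here is a minimal sketch. It is not the original script: instead of real coverage numbers it counts spec files and examples as a stand-in metric, and the method name metrics_line, the spec/ directory layout, and the "it ..." example convention are all assumptions made for illustration.

```ruby
# Hypothetical metric reporter: builds one CSV line of
# "spec file count,example count" for the codebase rooted at +root+.
# The spec/ layout and the "it ..." convention are assumptions.
def metrics_line(root)
  spec_files = Dir.glob(File.join(root, "spec", "**", "*_spec.rb"))
  example_count = spec_files.sum do |path|
    # Count lines that open an example, e.g. "  it 'adds numbers' do"
    File.readlines(path).count { |line| line =~ /^\s*it\b/ }
  end
  [spec_files.size, example_count].join(",")
end

# Report on whatever commit is currently checked out.
puts metrics_line(Dir.pwd)
```

Because the script only looks at the working directory and emits a single line, a wrapper that checks out each commit in turn can simply append the commit timestamp and collect the lines into one CSV file.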

Sunday, May 1, 2011

Inspect the state of your running iOS app's UI with Symbiote

What's Symbiote?

Frank comes with a useful little tool called Symbiote. It's a little web app which is embedded inside your native iOS application. Its purpose is to let you inspect the current state of your app's UI, and to test the UIQuery selectors which Frank uses to help it automate your app. Essentially Symbiote is Firebug for your native iOS app.

Recently I've added some improvements to Symbiote, adding some new features and making it easier to use. This screencast demonstrates most of the new-and-improved Symbiote.

Main features

View hierarchy

Symbiote provides you with a tree describing the current state of the view hierarchy of your running app. This gives you an overview of the general structure of your app's UI, and also helps you write UIQuery selectors which drill down from high level UI features to individual UI elements. For example, you might want to find a specific UITableView, and from there drill down to a specific row within that table view.

View properties

After selecting a specific view in the view hierarchy you can see details of that view in a properties tab. Inside this tab is a table describing all the properties exposed by the specific view.

UI Snapshot, with view locator

The View Locator tab shows a snapshot of the current app UI. As you mouse over views in the view hierarchy that view will be highlighted in the view locator, making it easy to visualize which parts of the view tree map to which parts of the app's UI.

Selector testing

Frank uses UIQuery selectors during testing to select a specific view or set of views. This is very similar to how Selenium uses XPath or CSS selectors. In order to test these, Symbiote allows you to provide free-form UIQuery selectors and then have Frank flash any matching views in your app's live UI. This makes it a lot easier to test selectors when writing your automated tests.

Accessible elements

Frank makes use of accessibility labels to make it easier for it to find elements in the UI. To that end, Symbiote provides a list of all the views in your current UI which have accessibility labels. You can click on these elements and see the corresponding UI element flash in your app's running UI, using the same mechanism as the free-form selector input field.

Cool, where do I get it?

That's easy. Just get your app Frankified, and you're good to go!

Gracias!

We wouldn't have what we have today without the original UI cleanup of Symbiote by Cory Smith, and the UX and CSS help of Mike Long.

Tuesday, April 19, 2011

Tutorial screencast on Frankifying your app

I maintain Frank, a tool which lets you write automated acceptance tests for your iOS app using Cucumber. I've been trying to reduce the hurdles in getting started with Frank. My latest attempt is to record a tutorial screencast showing how to take your existing app and 'Frankify' it.

'Frankifying' an app is the process of adding a separate Frank target to your app which has the Frank server embedded into it, allowing it to respond to automation commands. It can seem like a bit of an intimidating process at first glance, but it's actually very simple. Hopefully running through the whole process in a 7 minute screencast demonstrates that.

Derek Longmuir has also contributed a nice in-depth tutorial writeup (which I was following along with when I recorded the screencast).

Tuesday, January 4, 2011

Working with Indirect Input and Output in Unit Tests

Testing how a unit of code interacts with its environment

When testing a unit of code you need to both observe how the unit of code is interacting with the outside world and also control how the world interacts with it.

In a particularly simple example, you might want to check that when an adder function is given a 2 and a 4 then it returns a 6. In this case you’re controlling what the System Under Test (SUT) pulls in from its environment (the 2 and the 4) and also observing what it pushes out (the 6). In a more sophisticated example you might be testing a method which talks to a remote service, verifying that if it receives an error code when trying to read from the network then it logs the appropriate error message to a logger. Note that we’re still doing essentially the same thing here - controlling what the SUT pulls in from the environment (an error code in this case) and observing what it pushes out (a call to a logger in this case).

These two examples both illustrate the same fundamental practice of controlling input and observing output, but they are dealing with different kinds of input and output. In the first example of the adder function we are controlling the Direct Input provided to the SUT, and observing the Direct Output. In the second example we are controlling Indirect Input and observing Indirect Output (these terms are courtesy of the quite great XUnit Test Patterns book).

Four categories of inputs and outputs

As you can see, direct input and output is so called because it is provided directly to the SUT by the unit test. On the other hand, Indirect Input and Output can only be controlled and observed indirectly, via the SUT’s dependencies (aka Depended Upon Components). In one of the previous examples we were testing some code which needed to talk to a remote service. In that case the SUT would have had a dependency on some sort of lower level network service. We used this dependency to inject Indirect Input in the form of an error code being returned when the network service was called. Our SUT also had a dependency on some sort of logging service. We used that dependency to observe Indirect Output by checking that a logging method was called with the logging information we expect in the circumstances we simulated using the Indirect Input of an error coming back from the network service.

How do we manage indirect input and output?

We control indirect input and measure indirect output within our Unit Tests by using Test Doubles. This term encompasses the test-specific doohickeys commonly referred to as Mocks and Stubs, as well as the more esoteric variants such as Spies, Dummies, Fakes, etc.

In my personal opinion the vocabulary for these things is pretty confusing, and the definitions do not appear to be universally agreed upon and consistent. Also, I don’t often find myself too concerned with the implementation-specific differences which for some definitions serve to distinguish between e.g. a Stub vs a Mock. To me a much more important distinction is in what role a specific Test Double is playing in the test at hand. Is it helping to inject Indirect Input, or is it helping to observe Indirect Output? Or is it simply there to replace a required dependency for which we don’t want to use a real implementation? In an ill-advised mixing of terminology, I categorize these roles as Mocking, Stubbing, or Faking. I know this aligns quite poorly with other definitions for these terms, but they’re the best I have for now.

Test Double Roles

Stubbing

This refers to using a Test Double to control your SUT by providing specific Indirect Input. For example, you might supply your SUT with a testing-specific implementation of a Car repository which you have pre-configured to return a specific Car instance. This Car instance would be the indirect input which you are providing to your SUT. Another classic example would be injecting a fake Clock into your SUT so that you can test how it behaves at 1 second before midnight, or on February 29, or on December 21 2012.
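As a minimal sketch of the fake Clock idea, here is a hand-written Test Double playing a Stubbing role. The Greeter and FixedClock classes are hypothetical names invented for this illustration; the only contract assumed is that a clock responds to now.

```ruby
# Hypothetical SUT: picks a greeting based on the time its clock reports.
class Greeter
  def initialize(clock)
    @clock = clock
  end

  def greeting
    @clock.now.hour < 12 ? "Good morning" : "Good afternoon"
  end
end

# Hand-written Test Double in a Stubbing role: it always reports the
# instant it was configured with, which is the Indirect Input to the SUT.
class FixedClock
  def initialize(time)
    @time = time
  end

  def now
    @time
  end
end

# One second before midnight on February 29.
one_second_to_midnight = Time.local(2012, 2, 29, 23, 59, 59)
puts Greeter.new(FixedClock.new(one_second_to_midnight)).greeting
```

The test never touches the system clock, so the same check gives the same answer whenever it runs.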

Mocking

This refers to using a Test Double to observe some piece of Indirect Output produced by your SUT. Perhaps you’re creating a system that lets people tweet messages to the Jumbotron at a baseball game, and you need to make sure that you filter the tweets for naughty words. You could supply your SUT with a mock implementation of a RudeWordFilter class, and check that its filtering methods are being called correctly.
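Here is a minimal sketch of that scenario using a hand-written double rather than a mocking framework. JumbotronPoster, RecordingFilter, and the clean method are hypothetical names chosen for this illustration; the point is only that the double records how the SUT called it so the test can check afterwards.

```ruby
# Hypothetical SUT: passes every tweet through a filter before display.
class JumbotronPoster
  def initialize(filter)
    @filter = filter
  end

  def post(tweet)
    "NOW SHOWING: #{@filter.clean(tweet)}"
  end
end

# Hand-written Test Double in a Mocking role: it records each call so the
# test can observe the SUT's Indirect Output.
class RecordingFilter
  attr_reader :received

  def initialize
    @received = []
  end

  def clean(text)
    @received << text
    text
  end
end

filter = RecordingFilter.new
JumbotronPoster.new(filter).post("go team")

# The assertion is about the interaction, not about the return value.
raise "filter was not consulted" unless filter.received == ["go team"]
```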

Faking

A Faking Test Double is one which is just being used to satisfy the SUT’s dependencies, and which is not being used to provide Indirect Input or to observe Indirect Output. Maybe the method you’re testing writes entries to a logger as it goes about its business. Your current test doesn’t care about this behavior, so you provide a Null Object implementation of the Logger to the SUT. This Null logger will simply ignore any logging methods which are called on it. Note that I emphasized that the current test wasn’t interested in how the SUT uses the logger. Other tests likely would be, and in those tests the Test Double which provides the Logger dependency would likely play a Mocking role rather than a Faking role.
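A Null Object logger can be sketched in a few lines. The OrderTotaller class and the info and error method names are assumptions made for this illustration, not part of any particular logging library.

```ruby
# Hypothetical SUT: logs as it works, but this test only cares about the sum.
class OrderTotaller
  def initialize(logger)
    @logger = logger
  end

  def total(prices)
    @logger.info("summing #{prices.size} prices")
    prices.sum
  end
end

# Test Double in a Faking role: it satisfies the dependency and silently
# ignores every call made on it.
class NullLogger
  def info(_message); end

  def error(_message); end
end

puts OrderTotaller.new(NullLogger.new).total([2, 4])
```

A different test that does care about logging could pass in a recording double instead, without any change to OrderTotaller.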

Test Double Roles vs Test Double Implementation

It’s important to note here that the way in which these Test Doubles are implemented is orthogonal to the role they are playing. You can use a ‘mocking framework’ (e.g. Mockito, JMock, Typemock, EasyMock) both to supply Indirect Input and to observe Indirect Output. On the other hand you could just as well use hand-written classes (sometimes referred to as stubs) to achieve both these goals. This orthogonality between the technique you’re using to create your Test Doubles and the role a Test Double is playing is an important but subtle point, and is part of why I’m really not happy with the confusing terminology I have been using.

Acknowledgements

Lots of the ideas in this post are inspired by the patterns in the wonderful, encyclopedic XUnit Test Patterns book. It’s not exactly a 5 minute read, but nevertheless I highly recommend it. Most of the patterns in the book are also covered in very high-level overview form on the accompanying website, xunitpatterns.com.

I think my categorization of the Mocking, Stubbing and Faking roles lines up pretty closely with what Roy Osherove describes in his book, The Art of Unit Testing, but I haven’t read the book so I can’t say that with any confidence.

Saturday, November 20, 2010

Creating and publishing your first ruby gem

Introduction

In this post I’m going to cover the basics of creating and publishing a gem using the bundle gem command provided by Bundler. We’re going to use bundler to create a gem template for us. We’ll then take that skeleton gem, add some functionality to it, and publish it for all the world to use.

For the purposes of this tutorial I need a very simple example of something which you could conceivably want to release as a gem. How about a simple Sinatra web app which tells you the time? Sure, that’ll work. We’ll call it Didactic Clock. In order to make this server implementation need more than a couple of lines of code we’ll add the requirement that the clock tells you the time in a verbose form like “34 minutes past 4 o’clock, AM”.

Preparing to create a gem

A great way to create and test gems in a clean environment is to use the awesome rvm and in particular rvm’s awesome gemset feature. I assume you’re already set up with rvm. If not go get set up now!

First off we’ll create a separate gemset so that we can create and install our gem in a clean environment and be sure that someone installing our gem will have all the dependencies they need provided to them. We’re going to be creating a gem called didactic_clock, so we’ll name our gemset similarly. We’ll create the gemset and start using it by executing:

 rvm gemset create didactic_clock
 rvm gemset use didactic_clock

From now on I’ll assume we’re always using this clean-room gemset.

Creating the skeleton

First let’s install bundler into our gemset:

gem install bundler

Now we’ll ask bundler to create the skeleton of a gem. In this tutorial we’re going to be creating a gem called didactic_clock. We’ll ask bundler to create a skeleton for a gem with that name by calling:

bundle gem didactic_clock

You should see some output like:

  create  didactic_clock/Gemfile
  create  didactic_clock/Rakefile
  create  didactic_clock/.gitignore
  create  didactic_clock/didactic_clock.gemspec
  create  didactic_clock/lib/didactic_clock.rb
  create  didactic_clock/lib/didactic_clock/version.rb
Initializating git repo in /Users/pete/git/didactic_clock

Modifying our gemspec

Bundler creates a basic .gemspec file which contains metadata about the gem you are creating. There are a few parts of that file which we need to modify. Let’s open it up and see what it looks like:

 
   # -*- encoding: utf-8 -*-
   $:.push File.expand_path("../lib", __FILE__)
   require "didactic_clock/version"
 
   Gem::Specification.new do |s|
    s.name        = "didactic_clock"
    s.version     = DidacticClock::VERSION
    s.platform    = Gem::Platform::RUBY
    s.authors     = ["TODO: Write your name"]
    s.email       = ["TODO: Write your email address"]
    s.homepage    = "http://rubygems.org/gems/didactic_clock"
    s.summary     = %q{TODO: Write a gem summary}
    s.description = %q{TODO: Write a gem description}
 
    s.rubyforge_project = "didactic_clock"
 
    s.files         = `git ls-files`.split("\n")
    s.test_files    = `git ls-files -- {test,spec,features}/*`.split("\n")
    s.executables   = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
    s.require_paths = ["lib"]
   end

You can see that Bundler has set up some sensible defaults for pretty much everything. Note how your gem version information is pulled out of a constant which Bundler was nice enough to define for you within a file called version.rb. You should be sure to update that version whenever you publish any changes to your gem. Follow the principles of Semantic Versioning.
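For reference, the generated lib/didactic_clock/version.rb is typically just a module holding the constant. The sketch below reproduces that shape and uses Gem::Version to show how the three kinds of Semantic Versioning bump compare; the specific version numbers are assumptions chosen for illustration.

```ruby
require "rubygems"

# Shape of the version file Bundler generates (the number is an assumed starting point).
module DidacticClock
  VERSION = "0.0.1"
end

current = Gem::Version.new(DidacticClock::VERSION)

# Under Semantic Versioning:
#   patch bump for backwards-compatible bug fixes,
#   minor bump for backwards-compatible new functionality,
#   major bump for incompatible API changes.
patch = Gem::Version.new("0.0.2")
minor = Gem::Version.new("0.1.0")
major = Gem::Version.new("1.0.0")

# Rubygems orders versions numerically, segment by segment.
puts [current, patch, minor, major].sort.map(&:to_s).join(" < ")
```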

Also note that there are some TODOs in the authors, email, summary, and description fields. You should update those as appropriate. Everything else can be left as is for the time being.

Adding a class to our lib

We’ll start by creating a TimeKeeper class which will report the current time in the verbose format we want the Didactic Clock server to use. To avoid polluting the client code’s namespace it is important to put all the classes within your gem in an enclosing namespace module. In our case the namespace module would be DidacticClock, so we’re creating a class called DidacticClock::TimeKeeper. Another convention which is important to follow when creating gems is to keep all your library classes inside a folder named after your gem. This avoids polluting your client’s load path when your gem’s lib path is added to it by rubygems. So taking both of these conventions together we’ll be creating a DidacticClock::TimeKeeper class in a file located at lib/didactic_clock/time_keeper.rb. Here’s what that file looks like:

 
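   module DidacticClock
    class TimeKeeper
     # Reports the current time in a verbose form,
     # e.g. "34 minutes past 4 o'clock, AM".
     def verbose_time
      time = Time.now
      minute = time.min
      # Convert to a 12-hour clock; noon and midnight read as 12 rather than 0.
      hour = time.hour % 12
      hour = 12 if hour == 0
      meridian_indicator = time.hour < 12 ? 'AM' : 'PM'
 
      "#{minute} minutes past #{hour} o'clock, #{meridian_indicator}"
     end
    end
   end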
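Because verbose_time reads Time.now directly, it is awkward to check its output for specific instants. Here is a minimal sketch of a variant that takes the time as an argument so the formatting can be exercised at fixed times. The VerboseTime module and its describe method are hypothetical names and not part of the published gem; the sketch also maps hour 0 to 12 so that noon and midnight read as 12 o'clock.

```ruby
# Hypothetical helper, not part of the didactic_clock gem: formats any
# Time in the verbose style used by the Didactic Clock.
module VerboseTime
  def self.describe(time)
    hour = time.hour % 12
    hour = 12 if hour.zero? # noon and midnight read as 12, not 0
    meridian_indicator = time.hour < 12 ? "AM" : "PM"
    "#{time.min} minutes past #{hour} o'clock, #{meridian_indicator}"
  end
end

puts VerboseTime.describe(Time.local(2010, 11, 20, 4, 34))
puts VerboseTime.describe(Time.local(2010, 11, 20, 12, 5))
```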

Adding a script to our bin

We want users of our gem to be able to launch our web app in sinatra’s default http server by just typing didactic_clock_server at the command line. In order to achieve that we’ll add a script to our gem’s bin directory. When the user installs our gem the rubygems system will do whatever magic is required such that the user can execute the script from the command line. This is the same magic that adds the spec command when you install the rspec gem, for example.

So we’ll save the following to bin/didactic_clock_server

 
   #!/usr/bin/env ruby
 
   require 'sinatra'
   require 'didactic_clock/time_keeper'
 
   # otherwise sinatra won't always automagically launch its embedded 
   # http server when this script is executed
   set :run, true
 
   get '/' do
    time_keeper = DidacticClock::TimeKeeper.new
    return time_keeper.verbose_time
   end

Note that we require in other gems as normal, we don’t require rubygems, and that we don’t do any tricks with relative paths or File.dirname(__FILE__) or anything like that when requiring in our TimeKeeper class. Rubygems handles all that for us by setting up the load path correctly.

Adding a dependency

Our little web app uses Sinatra to serve up the time, so obviously we need the Sinatra gem installed in order for our own gem to work. We can easily express that dependency by adding the following line to our .gemspec:

s.add_dependency "sinatra"

Now Rubygems will ensure that sinatra is installed whenever anyone installs our didactic_clock gem.
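If you want to limit which versions of a dependency are acceptable, add_dependency also accepts one or more version requirements after the gem name. The following is a minimal, stand-alone sketch rather than our real gemspec: the "~> 1.0" constraint, the version number, and the placeholder author and summary are assumptions made for illustration.

```ruby
require "rubygems"

# Stand-alone sketch of a specification with a constrained dependency.
spec = Gem::Specification.new do |s|
  s.name    = "didactic_clock"
  s.version = "0.0.1"                    # assumed version
  s.summary = "Placeholder summary"      # placeholder value
  s.authors = ["Placeholder Author"]     # placeholder value
  s.files   = []

  # "~> 1.0" accepts any 1.x release of sinatra but not 2.0 (assumed constraint).
  s.add_dependency "sinatra", "~> 1.0"
end

spec.dependencies.each do |dependency|
  puts "#{dependency.name} (#{dependency.requirement})"
end
```

Leaving the constraint off, as we do in this tutorial, means any version of sinatra will satisfy the dependency.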

Building the gem and testing it locally

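At this point we’re done writing code. Bundler created a git repo as part of the bundle gem command, and the generated gemspec uses git ls-files to decide which files go into the gem, so our new files need to be added to git before we build. Let’s check in our changes: git add . followed by git commit should do the trick (git commit -a on its own would skip the newly created files), but obviously feel free to use whatever git-fu you prefer.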

Now we’re ready to build the gem and try it out. Make sure you’re still in the clean-room gemset we created earlier, and then run:

rake install

to build our didactic_clock gem and install it into our system (which in our case means installing it into our didactic_clock gemset). If we run gem list at this point we should see didactic_clock in our list of gems, along with sinatra (which will have been installed as a dependency).

Now we’re ready to run our app by calling didactic_clock_server from the command line. We should see sinatra start up, and if we visit http://localhost:4567/ we should see our app reporting the time in our verbose format. Victory!

Publishing our gem

The last step is to share our creation with the world. Before we do that you’ll need to set up rubygems in your system to publish gems. The instructions at rubygems.org are easy to follow.

Bundler provides a rake release task which automates the steps you would typically take when publishing a version of your gem, but it’s fairly opinionated in how it does so. The task will tag your current git commit, push from your local git repo to some upstream repo (most likely in github), and then finally build your gem and publish your .gem to rubygems.org. If you don’t have an upstream repo configured then you’ll probably get an error like:

   rake aborted!
   Couldn't git push. `git push  2>&1' failed with the following output:
 
   fatal: No destination configured to push to.

So, now would be the time to set up an upstream repo. Doing that with github is really straightforward. Once you have your local git repo configured with an upstream repo you can finally publish your gem with rake release.

Now anyone who wants to install your gem can do so with a simple gem install command. Congratulations! Fame and fortune await you!

Conclusion

Hopefully I’ve shown that creating and publishing a well-behaved gem is pretty simple. The didactic_clock sample I created is up on github, and of course the gem is published on rubygems.org and can be installed with gem install didactic_clock.