Mobile Test Automation
Introduction
In this chapter, we will make a case for Mobile Test Automation and explain why it might make sense to invest in automation strategies from the beginning of mobile app development. We will also compare and contrast the benefits of automation with manual testing practices. But first, let's understand a few evolutionary concepts, some of which are borrowed from the web application test automation world.
WebDriver Protocol
WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behavior of web browsers. Read more in the W3C WebDriver specification.
Similarly, the WebDriver Wire Protocol, also referred to as the JSON Wire Protocol, was pioneered by Selenium, the popular web test automation technology. Read more in the Selenium project's documentation of the JSON Wire Protocol.
Why are we talking about these?
Much as distributed systems prefer micro-service APIs over monolithic applications in order to take advantage of horizontal scaling, the web test automation space has seen similar innovation over the past decade.
Selenium, with its relatively lightweight architecture, is fast replacing many licensed and enterprise tools.
Some websites have covered the web test automation use cases excellently. All of the information and code can be found on this website.
How does it work
The client library (used by the test automation scripts) communicates with an intermediary server, which in turn translates each request into commands understood by the browser. In this way it emulates an end user's actions on the browser (strictly speaking, it operates on the DOM).
So, as we can see in the architecture above, as long as we write automation scripts that the intermediary server (WebDriver) can understand, the server takes care of translating them into the actual commands that emulate user actions on the browser.
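To make the exchange between client and intermediary server concrete, the sketch below lists the HTTP requests a client library might send for a short "open a page and click a button" script. It only builds and prints the requests; nothing is sent. The endpoints follow the W3C WebDriver specification, while the server address, the page URL, the `#login` selector, and the helper names `build_commands` and `describe` are assumptions made for illustration. In a real run the session and element identifiers come back in the server's responses rather than being passed in.

```python
import json

# Assumed address of the intermediary server. chromedriver commonly
# listens on port 9515, but treat the value as a placeholder.
SERVER = "http://localhost:9515"


def build_commands(session_id, element_id):
    """Return the HTTP requests a client library would send, in order.

    session_id and element_id are placeholders here; a real server
    returns them in its responses to steps 1 and 3.
    """
    return [
        # 1. Ask the server to start a browser session.
        ("POST", "/session",
         {"capabilities": {"alwaysMatch": {"browserName": "chrome"}}}),
        # 2. Navigate the browser to a page.
        ("POST", f"/session/{session_id}/url",
         {"url": "https://example.com"}),
        # 3. Locate an element in the DOM with a CSS selector.
        ("POST", f"/session/{session_id}/element",
         {"using": "css selector", "value": "#login"}),
        # 4. Click the element, emulating the end user's action.
        ("POST", f"/session/{session_id}/element/{element_id}/click", {}),
        # 5. End the session, which closes the browser.
        ("DELETE", f"/session/{session_id}", None),
    ]


def describe(commands):
    """Render each request as one readable line."""
    lines = []
    for method, path, body in commands:
        payload = "" if body is None else " " + json.dumps(body, sort_keys=True)
        lines.append(f"{method} {SERVER}{path}{payload}")
    return lines


if __name__ == "__main__":
    for line in describe(build_commands("SESSION_ID", "ELEMENT_ID")):
        print(line)
```

Every user action in the script becomes one small HTTP request, which is why the same scripts can drive any browser whose driver speaks the protocol.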
How does it scale
In the above picture, we have seen that a single intermediary server (chromedriver.exe, IEDriverServer.exe) can maintain a session with an active browser and execute the commands we direct in the automation scripts.
What if we want to execute tests in parallel? How about multiple instances of browsers on the same machine, on remote machines, and so on?
Selenium GRID
To solve the problem of parallel execution and leverage horizontal scaling, Selenium Grid came into existence. The architecture is as below.
A couple of noteworthy points:
- The HUB is an HTTP server that listens on a port and redirects requests to the appropriate GRID node (which maps to a WebDriver)
- Since the communication happens over HTTP on TCP/IP, we can scale this model up to the limit of the ports available on each machine in the control chain
- The HUB is the entry point, and communication between the HUB and the nodes is purely over REST APIs (GET, POST, etc.)
- The HUB matches a client's request to a node based on the WebDriver protocol's matching algorithm (the DesiredCapabilities object)
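The matching step in the last point can be sketched as follows. This is a deliberately simplified model, not Selenium Grid's actual implementation: the node registry, the node URLs, and the function names `matches` and `route` are hypothetical, and a real hub also considers things such as free slots and browser versions.

```python
# Hypothetical node registry: each node advertises what it can run.
NODES = [
    {"url": "http://node-a:5555", "browserName": "chrome", "platformName": "linux"},
    {"url": "http://node-b:5555", "browserName": "firefox", "platformName": "linux"},
    {"url": "http://node-c:5555", "browserName": "chrome", "platformName": "windows"},
]


def matches(node, desired):
    """A node matches when it agrees with every capability the client asked for."""
    return all(node.get(key) == value for key, value in desired.items())


def route(desired, nodes=NODES):
    """Return the URL of the first matching node, or None if nothing matches."""
    for node in nodes:
        if matches(node, desired):
            return node["url"]
    return None


if __name__ == "__main__":
    print(route({"browserName": "chrome"}))
    print(route({"browserName": "chrome", "platformName": "windows"}))
    print(route({"browserName": "safari"}))
```

The client only states what it needs; which machine ends up serving the session is the hub's decision, so nodes can be added without touching the test scripts.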
How does it relate to mobile automation
So now, we can remotely execute commands on a browser as long as we have the intermediary server translating them into the right commands that emulate user behavior.
The same concept is applied to mobile test automation. The Appium server plays the role of chromedriver, IEDriverServer or SafariDriver: it communicates with the UI automation library of the Android or iOS ecosystem.
Internally, Appium knows how to talk to the platform's UI automation library (for example, UiAutomator on Android). That library takes commands from the Appium server and executes them on the app, which eventually translates to set, get, click and all other user actions.
As authors of automation scripts, as long as we know how to identify elements (through the various locator/selector strategies, a space that is quite mature now) and perform operations on those elements, the intended behavior is automated.
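As a rough illustration of what a script author supplies, the sketch below builds the two pieces of data discussed above: the capabilities that describe which device and app to drive, and the body of an element lookup for a chosen locator strategy. `platformName`, `deviceName` and `app` are capability names Appium understands, and the listed strategies are ones Appium accepts; the helper names, the device name, the app path, and the `Login` label are placeholders assumed for the example. Nothing is sent to a server.

```python
import json

# Locator strategies assumed for this sketch (a subset of what Appium accepts).
KNOWN_STRATEGIES = {"id", "xpath", "class name", "accessibility id"}


def capabilities(platform, device_name, app_path):
    """Describe the device and app under test (all values are placeholders)."""
    return {
        "platformName": platform,
        "deviceName": device_name,
        "app": app_path,
    }


def locate(strategy, value):
    """Build the body of an element-lookup request for a locator strategy."""
    if strategy not in KNOWN_STRATEGIES:
        raise ValueError(f"unknown locator strategy: {strategy}")
    return {"using": strategy, "value": value}


if __name__ == "__main__":
    caps = capabilities("Android", "emulator-5554", "/path/to/app.apk")
    print(json.dumps(caps, indent=2))
    print(json.dumps(locate("accessibility id", "Login")))
```

The lookup body has the same shape as the one used for browsers; only the strategy changes, which is what lets the same scripting style carry over from web to mobile.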
Test Automation Libraries
Over the past few years, the test automation tools that gained some traction for Android apps included (but were not limited to) MonkeyTalk, Robotium, UiAutomator, calabash-android, selendroid and Appium.
Similarly, in the iOS space, the tools included (but were not limited to) calabash-ios, Frank, UIAutomation, ios-driver, KIF and Appium.
Android Espresso, a unit testing library that promises to do UI testing, is also relatively new as of writing this book. Let's wait and see how Espresso does; however, the fact that it is inside-out (unit testing) means that it might not cover the scope of integration testing (i.e. testing exactly the path that an end user experiences).
iOS XCTest: XCTest follows the same principles as KIF, Calabash, Appium et al. It seems that Apple is going to retire UI Automation soon in favor of its XCUITest UI testing library, as per this link; however, it is going to take a couple of years to transition over completely.
This book uses Appium as its test automation tool/framework because we found it aligned with our needs of testing native, web and hybrid apps, and because it follows the WebDriver protocol. Some more reasons are given below. Please also read about Appium's philosophy and its comparison with other tools and frameworks on the Appium website.
Android Espresso and iOS XCTest help because the tests are written in the same language as the source code, which makes it easier to debug and troubleshoot and fosters collaboration between technical staff (dev and test engineers).
But these tools need access to the source code. Executing the tests this way may work fine, but vetting the build at multiple stages of the pipeline (CI), and hence testing from the outside, gives relatively higher confidence before release.
So a better trade-off, in my view, is to use an ATDD/BDD style, because it lets less technical staff such as product owners and business stakeholders define “executable” acceptance criteria, which helps the 3 amigos (dev, test, product) stay on the same page and fosters collaboration.
The availability of developers' time, test coverage, etc. are also trade-off parameters.