Google Open Source Blog: webdriver

Wicked Good XPath: a faster JavaScript XPath library

Tuesday, September 4, 2012

We are proud to release Wicked Good XPath, a Google-authored pure JavaScript implementation of the DOM Level 3 XPath specification. We believe it to be the fastest XPath implementation available in JavaScript.

To use Wicked Good XPath, simply download the wgxpath.install.js file and include it on your webpage with a script tag. For example:
<script src="wgxpath.install.js"></script>

Then call wgxpath.install() from your JavaScript code, which will ensure document.evaluate, the XPath evaluation function, is defined on the window object. To install the library on a different window, pass that window as an argument to the install function.

Despite the growing popularity of CSS selectors, XPath remains a useful mechanism to locate elements in an HTML document. It has particularly heavy usage in the context of frontend web testing tools like Selenium and Web Puppeteer. Sometimes there is simply no way to reference an element on the page other than with an XPath expression.

For those who have never used XPath, here is a taste:

On a Google search results page, the XPath expression //li[@class="g"][3] identifies the third search result. Here is a snapshot from the Chrome extension XPath Viewer showing the third result selected when that expression is used.
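To try the expression without a browser, here is a minimal sketch that evaluates it with the XPath 1.0 engine built into the JDK (javax.xml.xpath), not with Wicked Good XPath itself. The markup is a hypothetical, heavily simplified stand-in for a results page, and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathTaste {
    // Hypothetical stand-in for a results page; real pages are far messier.
    static final String SAMPLE =
        "<html><body><ol>"
        + "<li class=\"g\">First result</li>"
        + "<li class=\"ad\">Sponsored</li>"
        + "<li class=\"g\">Second result</li>"
        + "<li class=\"g\">Third result</li>"
        + "</ol></body></html>";

    /** Returns the text of the third li (per parent) whose class attribute is exactly "g". */
    static String thirdResult(String markup) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // evaluate(String, InputSource) parses the markup and returns the
            // string value of the first matching node in document order.
            return xpath.evaluate("//li[@class=\"g\"][3]",
                                  new InputSource(new StringReader(markup)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(thirdResult(SAMPLE)); // prints "Third result"
    }
}
```

Note that the item with class "ad" is skipped: the [@class="g"] predicate filters first, and [3] then picks the third of the remaining items.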

A key challenge when using XPath, however, is that it is not natively supported on every browser. Most notably, Internet Explorer does not provide native XPath support for HTML documents. As a result, users turned towards pure JavaScript implementations of XPath. In 2005, Google engineers released AJAXSLT, which included a correct, but slow, XPath evaluator. Running web tests on IE using AJAXSLT was time-consuming.

Fast-forward to 2007, when Cybozu Labs released JavaScript-XPath, a new JavaScript XPath implementation that was 10 times faster than AJAXSLT. The web testing tools began using it instead, and life was good... for a while. JavaScript-XPath quickly fell out of maintenance, and bugs became tough to get fixed. Also, since it wasn't written in Google Closure, it was tricky for us Googlers to integrate into our JavaScript applications. A rewrite was necessary.

However, we went beyond merely porting the library to Google Closure and fixing a couple of bugs. We identified some significant additional performance improvements, such that our version runs about 30% faster than the original. On top of that, the Closure compiler was able to minify our code down to a mere 25K, 40% smaller than JavaScript-XPath's 42K. Finally, the code is structured and documented in a way that we believe will make future maintenance quicker and easier.

We would like to thank our two Google interns, Michael Zhou and Evan Thomas, for doing the bulk of the development on this project.

By Greg Dennis and Joon Lee, Google Engineers

Test Your Mobile Web Apps with WebDriver - A Tutorial

Friday, October 28, 2011

Mobile testing has come a long way since the days when testing mobile web applications was mostly manual and took days to complete. Selenium WebDriver is a browser automation tool that provides an elegant way of testing web applications. WebDriver makes it easy to write automated tests that ensure your site works correctly when viewed from an Android or iOS browser.

For those of you new to WebDriver, here are a few basics about how it helps you test your web application. WebDriver tests are end-to-end tests that exercise a web application just like a real user would. There is a comprehensive user guide on the Selenium site that covers the core APIs.

Now let’s talk about mobile! WebDriver provides a touch API that allows the test to interact with the web page through finger taps, flicks, finger scrolls, and long presses. It can rotate the display and provides a friendly API to interact with HTML5 features such as local storage, session storage and application cache. Mobile WebDrivers use the remote WebDriver server, following a client/server architecture. The client piece consists of the test code, while the server piece is the application that is installed on the device.

Get Started

WebDriver for Android and iPhone can be installed following these instructions. Once you’ve done that, you will be ready to write tests. Let’s start with a basic example using www.google.com to give you a taste of what’s possible.

The test below opens www.google.com on Android and issues a query for “weather in san francisco”. The test will verify that Google returns search results and that the first result returned is giving the weather in San Francisco.

public void testGoogleCanGiveWeatherResults() {
  // Create a WebDriver instance with the activity in which we want the test to run.
  WebDriver driver = new AndroidDriver(getActivity());
  // Let’s open a web page
  driver.get("http://www.google.com");

  // Look up the search box by its name
  WebElement searchBox = driver.findElement(By.name("q"));
  // Enter a search query and submit
  searchBox.sendKeys("weather in san francisco");
  searchBox.submit();

  // Make sure that Google shows 11 results
  WebElement resultSection = driver.findElement(By.id("ires"));
  List<WebElement> searchResults = resultSection.findElements(By.tagName("li"));
  assertEquals(11, searchResults.size());
  // Let’s ensure that the first result shown is the weather widget
  WebElement weatherWidget = searchResults.get(0);
  assertTrue(weatherWidget.getText().contains("Weather for San Francisco, CA"));
}

Now let's see our test in action! When you launch your test through your favorite IDE or using the command line, WebDriver will bring up a WebView in the foreground allowing you to see your web application as the test code is executing. You will see www.google.com loading, and the search query being typed in the search box.

We mentioned above that WebDriver supports creating advanced gestures to interact with the device. Let's use WebDriver to throw an image across the screen by flicking horizontally, and ensure that the next image in the gallery is displayed.

WebElement toFlick = driver.findElement(By.id("image"));
// 400 pixels left at normal speed
Action flick = getBuilder(driver).flick(toFlick, 0, -400, FlickAction.SPEED_NORMAL)
    .build();
flick.perform();
// Locate the next image (assumed here to be identified by its id)
WebElement secondImage = driver.findElement(By.id("secondImage"));
assertTrue(secondImage.isDisplayed());

Next, let's rotate the screen and ensure that the image displayed on screen is resized.

assertEquals(landscapeSize, secondImage.getSize());
((Rotatable) driver).rotate(ScreenOrientation.PORTRAIT);
assertEquals(portraitSize, secondImage.getSize());

Let's take a look at the local storage on the device, and ensure that the web application has set some key/value pairs.

// Get a handle on the local storage object
LocalStorage local = ((WebStorage) driver).getLocalStorage();
// Ensure that the key "name" is mapped
assertEquals("testUser", local.getItem("name"));

What if your test reveals a bug? You can easily take a screenshot for help in future debugging:

 File tempFile = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);

High Level Architecture

WebDriver has two main components: the server and the tests themselves. The server is an application that runs on the phone, tablet, emulator, or simulator and listens for incoming requests. It runs the tests against a WebView (the rendering component on Android and iOS) configured like the browsers. Your tests run on the client side and can be written in any language supported by WebDriver, including Java and Python. The WebDriver tests communicate with the server by sending RESTful JSON requests over HTTP. The tests and server pieces don't have to be on the same physical machine, although they can be. For Android, you can also run the tests using the Android test framework instead of the remote WebDriver server.
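To make the "RESTful JSON requests over HTTP" part concrete, here is a rough sketch of what a client might put on the wire for a "navigate to this URL" command. It only builds the request and does not send it. The server address, the session id, and the class and method names are assumptions for illustration, and the /session/{id}/url path reflects the Selenium JSON wire protocol as we understand it rather than anything stated in this post. It needs Java 11 or later for java.net.http.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class WireRequestSketch {
    // Assumed address; the real host and port depend on how the device server is set up.
    static final String SERVER = "http://localhost:8080/wd/hub";

    /** Wraps a value in JSON quotes, escaping backslashes and double quotes only. */
    static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** JSON body for the navigate command. */
    static String navigateBody(String url) {
        return "{\"url\": " + jsonString(url) + "}";
    }

    /** Builds, but does not send, the HTTP request a test client would issue. */
    static HttpRequest navigateRequest(String sessionId, String url) {
        return HttpRequest.newBuilder(URI.create(SERVER + "/session/" + sessionId + "/url"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(navigateBody(url)))
            .build();
    }

    public static void main(String[] args) {
        // "1234" is a made-up session id; a real one comes back from the server.
        HttpRequest request = navigateRequest("1234", "http://www.google.com");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(navigateBody("http://www.google.com"));
    }
}
```

In practice the WebDriver client libraries do all of this for you; the sketch is only meant to show why the tests and the server can live on different machines.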

Infrastructure Setup

At Google, we have wired WebDriver tests to our cloud infrastructure allowing those tests to run at scale and making it possible to have them run in our continuous integration system. External developers can run their mobile tests either on emulators/simulators or real devices for Android and iOS phones and tablets.

Android emulators can run on most OSes because they are virtualized, so we run them on our generic cloud setup. Android emulators have many advantages, but because they emulate a complete virtual device (including the virtual CPU, MMU, and hardware devices), they make the test environment slower. You can speed up the tests by disabling animations, audio, and skins, or even by running the emulator in headless mode. To do so, start the emulator with the options -no-boot-anim, -no-audio, -noskin, and -no-window. If you would like your tests to run even faster, start the emulator from a previously created snapshot image. That reduces the emulator startup time from 2 minutes to less than 2 seconds!
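If you launch emulators from a test harness, the options above can be assembled programmatically. The following is a small sketch under a few assumptions: the SDK's emulator binary is on the PATH, the AVD name is a hypothetical one you created yourself, and the flags are written in the single-dash form used by the emulator's documentation. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class EmulatorCommandSketch {
    /** Builds the argument list for a faster, headless emulator launch. */
    static List<String> fastHeadlessCommand(String avdName) {
        List<String> command = new ArrayList<>();
        command.add("emulator");      // assumes the SDK emulator binary is on PATH
        command.add("-avd");
        command.add(avdName);         // hypothetical AVD name supplied by the caller
        command.add("-no-boot-anim"); // skip the boot animation
        command.add("-no-audio");     // disable audio
        command.add("-noskin");       // do not draw a device skin
        command.add("-no-window");    // headless: no emulator window
        return command;
    }

    public static void main(String[] args) {
        // Print the command instead of running it, so the sketch works without an SDK.
        System.out.println(String.join(" ", fastHeadlessCommand("test-avd")));
        // To actually launch it:
        // new ProcessBuilder(fastHeadlessCommand("test-avd")).inheritIO().start();
    }
}
```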

iOS simulators can't be virtualized and hence need to run on Mac machines. However, because iOS simulators don't try to emulate the virtual device or CPU at all, they can run as applications at "full speed," which allows the tests to run much faster.

Stay tuned for more mobile features in Selenium WebDriver, and for updates on the Selenium blog. For questions and feedback about not only the Android WebDriver but also its desktop brethren, please join the community.

By Dounia Berrada, Engineer on the EngTools team

Introducing Native Driver

Wednesday, June 22, 2011

NativeDriver is an implementation of the WebDriver API which drives the UI of a native application rather than a web application. I am happy to announce that the Android version is available for download and we are welcoming all users and contributors. We are hosting on Google Code (http://nativedriver.googlecode.com/). An iPhone (iOS) version is under development and will be available soon.

WebDriver exposes browser functionality as a clean, object-oriented API, and Google uses WebDriver to test web applications on many platforms. (For an introduction to WebDriver, see this blog post.)

You are probably wondering why anyone would use the WebDriver API to test native applications. Our reasoning is:

user interactions with a native application and a web application are essentially the same: click, type, switch window, read text
test writers will have to write the same test for each platform they want to support
no one wants to learn yet another API
WebDriver already has a user base and a tool ecosystem that can migrate easily to a new API if it is WebDriver-like
therefore: let’s re-use our favorite UI testing API

NativeDriver is our attempt to apply WebDriver’s simplicity and success to native applications. It extends the WebDriver API in a few key places, and re-interprets the existing API for native applications.

Here is some code from a NativeDriver test against the Google Maps Android app (the test can be seen in action in this video):

AndroidNativeDriver driver = new AndroidNativeDriverBuilder()
    .withDefaultServer()
    .build();
driver.startActivity("com.google.android.maps.MapsActivity");

// Open the Places activity by clicking the places button
// (to the right of the search box)
AndroidNativeElement btn
    = driver.findElement(By.id("btn_header_places"));
btn.click();

// Dismiss the Places window
// Equivalent to pressing the Android Back button
driver.navigate().back();

// Rotate the device to show the UI in landscape mode
driver.rotate(ScreenOrientation.LANDSCAPE);

Except for the startActivity method, and the use of a builder object to create the driver, all the API calls made are standard WebDriver API calls. Creating the driver with a builder is necessary because the driver can also be set up with an Android Debug Bridge (ADB) connection.

Android NativeDriver uses Instrumentation to monitor and manipulate the application under test. Instrumentation is a standard feature of Android, but it has some limitations. For instance, it cannot drive UI which is part of another process. (If it could, a malicious application could hijack the device.) This is where ADB saves the day. ADB is a connection from outside the device and is not tied to a particular application, so with it we can inject events across applications. ADB also made it possible to add screenshot support, and we plan to utilize it in new ways as NativeDriver matures. This is one way Android NativeDriver tests are more powerful than standard Instrumentation tests.

You can try out Android NativeDriver right away with the tutorials for running a sample test or instrumenting your own application. You may also want to join the users or developers mailing list.

Good luck and happy native testing!

By Matt DeVore, Engineering Productivity Team

Introducing WebDriver

Friday, May 8, 2009

WebDriver is a clean, fast framework for automated testing of webapps. Why is it needed? And what problems does it solve that existing frameworks don't address?

For example, Selenium, a popular and well-established testing framework, is a wonderful tool that provides a handy unified interface that works with a large number of browsers, and allows you to write your tests in almost every language you can imagine (from Java or C# through PHP to Erlang!). It was one of the first Open Source projects to bring browser-based testing to the masses, and because it's written in JavaScript it's possible to quickly add support for new browsers that might be released.

Like every large project, it's not perfect. Selenium is written in JavaScript, which causes a significant weakness: browsers impose a pretty strict security model on any JavaScript that they execute in order to protect a user from malicious scripts. Examples of where this security model makes testing harder are when trying to upload a file (IE prevents JavaScript from changing the value of an INPUT file element) and when trying to navigate between domains (because of the same-origin policy).
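The cross-domain restriction mentioned above is usually called the same-origin policy: a script loaded from one origin may not freely touch pages from another, where an origin is the combination of scheme, host, and port. As a rough illustration of which navigations trip that rule, here is a simplified check. It assumes absolute http or https URLs and is not how any browser actually implements the policy; the class and method names are made up.

```java
import java.net.URI;

class SameOriginSketch {
    /** Returns the explicit port, or the scheme's default when none is given. */
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    /** Two URLs share an origin when scheme, host, and port all match. */
    static boolean sameOrigin(String first, String second) {
        URI a = URI.create(first);
        URI b = URI.create(second);
        return a.getScheme().equalsIgnoreCase(b.getScheme())
            && a.getHost().equalsIgnoreCase(b.getHost())
            && effectivePort(a) == effectivePort(b);
    }

    public static void main(String[] args) {
        // Same scheme, host, and port: a script can follow this navigation.
        System.out.println(sameOrigin("http://www.google.com/search", "http://www.google.com/maps")); // true
        // Different host: a JavaScript-based test runner loses control here.
        System.out.println(sameOrigin("http://www.google.com/", "http://mail.google.com/")); // false
    }
}
```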

Additionally, being a mature product, the API for Selenium RC has grown over time, and as it has done so it has become harder to understand how best to use it. For example, it's not immediately obvious whether you should be using "type" instead of "typeKeys" to enter text into a form control. Although it's a question of aesthetics, some find the large API intimidating and difficult to navigate.

WebDriver takes a different approach to solve the same problem as Selenium. Rather than being a JavaScript application running within the browser, it uses whichever mechanism is most appropriate to control the browser. For Firefox, this means that WebDriver is implemented as an extension. For IE, WebDriver makes use of IE's Automation controls. By changing the mechanism used to control the browser, we can circumvent the restrictions placed on the browser by the JavaScript security model. In those cases where automation through the browser isn't enough, WebDriver can make use of facilities offered by the Operating System. For example, on Windows we simulate typing at the OS level, which means we are more closely modeling how the user interacts with the browser, and that we can type into "file" input elements.

With the benefit of hindsight, we have developed a cleaner, Object-based API for WebDriver, rather than follow Selenium's dictionary-based approach. A typical example using WebDriver in Java looks like this:

// Create an instance of WebDriver backed by Firefox
WebDriver driver = new FirefoxDriver();

// Now go to the Google home page
driver.get("http://www.google.com");

// Find the search box, and (ummm...) search for something
WebElement searchBox = driver.findElement(By.name("q"));
searchBox.sendKeys("selenium");
searchBox.submit();

// And now display the title of the page
System.out.println("Title: " + driver.getTitle());

Looking at the two frameworks side-by-side, we found that the weaknesses of one are addressed by the strengths of the other. For example, whilst WebDriver's approach to supporting browsers requires a lot of work from the framework developers, Selenium can easily be extended. Conversely, Selenium always requires a real browser, yet WebDriver can make use of an implementation based on HtmlUnit which provides lightweight, super-fast browser emulation. Selenium has good support for many of the common situations you might want to test, but WebDriver's ability to step outside the JavaScript sandbox opens up some interesting possibilities.

These complementary capabilities explain why the two projects are merging: Selenium 2.0 will offer WebDriver's API alongside the traditional Selenium API, and we shall be merging the two implementations to offer a capable, flexible testing framework. One of the benefits of this approach is that there will be an implementation of WebDriver's cleaner APIs backed by the existing Selenium implementation. Although this won't solve the underlying limitations of Selenium's current JavaScript-based approach, it does mean that it becomes easier to test against a broader range of browsers. The reverse is also true: we'll be emulating the existing Selenium APIs with WebDriver. This means that teams can make the move to WebDriver's API (and Selenium 2) in a managed and considered way.
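The idea of emulating one API on top of the other is essentially the adapter pattern. The sketch below shows the shape of it with toy interfaces: a command-style API (every call names its target by locator) implemented on top of an object-based one. All of the interfaces, names, and locator strings here are hypothetical stand-ins invented for illustration; they are not the actual Selenium or WebDriver classes.

```java
import java.util.ArrayList;
import java.util.List;

class ApiBridgeSketch {
    /** Toy object-based API: you find an element, then call methods on it. */
    interface Element {
        void sendKeys(String text);
        void click();
    }

    interface ObjectDriver {
        Element findElement(String locator);
    }

    /** Toy command-style API: each command takes the locator as an argument. */
    interface CommandApi {
        void type(String locator, String value);
        void click(String locator);
    }

    /** Emulates the command-style API by delegating to the object-based one. */
    static CommandApi commandsBackedBy(ObjectDriver driver) {
        return new CommandApi() {
            @Override public void type(String locator, String value) {
                driver.findElement(locator).sendKeys(value);
            }
            @Override public void click(String locator) {
                driver.findElement(locator).click();
            }
        };
    }

    /** Fake driver that records calls, so the sketch runs without a browser. */
    static ObjectDriver recordingDriver(List<String> log) {
        return locator -> new Element() {
            @Override public void sendKeys(String text) {
                log.add("sendKeys " + locator + " " + text);
            }
            @Override public void click() {
                log.add("click " + locator);
            }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        CommandApi legacyStyle = commandsBackedBy(recordingDriver(log));
        legacyStyle.type("name=q", "selenium");
        legacyStyle.click("name=btnG");
        System.out.println(log); // [sendKeys name=q selenium, click name=btnG]
    }
}
```

Existing tests written against the command-style interface keep working unchanged, while the driver underneath is swapped out, which is what makes a gradual migration possible.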

If you'd like to give WebDriver a try, it's as easy as downloading the zip files, unpacking them and putting the JARs on your CLASSPATH. For the Pythonistas out there, there's also a version of WebDriver for you, and a C# version is waiting in the wings. The project is hosted at http://webdriver.googlecode.com, and, like any project on Google Code, is Open Source (we're using the Apache 2 license). If you need help getting started, the project's wiki contains useful guides, and the WebDriver group is friendly and helpful (something which makes me feel very happy).

So that's WebDriver: a clean, fast framework for automated testing of webapps. We hope you like it as much as we do!

by Simon Stewart, Engineering Productivity Team