The Green Report | The transition from Touch to W3C Actions in Selenium

Mar 5th 2023 6 min read

easy

mobile

Last year, when version 8.0 of the Appium Java client was released, it introduced a couple of changes, the main one being the migration to Selenium 4.

With these changes, the client also deprecated its TouchAction and MultiTouchAction classes. Appium will fully drop support for these classes in a future release, and developers are advised to use W3C Actions instead.

Why?

The TouchActions API predates the W3C WebDriver standard. It was never standardized and had limited support across different browsers and platforms. Therefore, to ensure better compatibility and standardization across platforms, the W3C Actions API has been adopted as the standard API for performing user interactions in Selenium WebDriver.

W3C Actions API

The W3C Actions API is a standardized API for performing complex user interactions in web and mobile applications. It allows developers to perform various types of interactions such as mouse clicks, mouse movements, keyboard inputs, drag and drop, multi-touch gestures, and more.

The API is based on a sequence of actions that are defined using a builder pattern. The builder pattern allows developers to chain different actions together to create a sequence of actions that can be executed in a specific order. The sequence of actions is then executed using the perform() method.
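To make the "sequence of actions" idea more concrete, here is a minimal sketch, using only plain Java collections, of the kind of structure a single-tap pointer sequence is serialized into before it is sent to the server. The class and method names (TapPayloadSketch, tapPayload) are hypothetical and exist only for illustration; the field names follow the W3C WebDriver specification, and in real tests Selenium's Sequence class builds this structure for you.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Selenium or Appium.
final class TapPayloadSketch {

    // Describes one touch input source ("finger") with three pointer actions:
    // move to (x, y), press down, release.
    static Map<String, Object> tapPayload(int x, int y) {
        Map<String, Object> move = new LinkedHashMap<>();
        move.put("type", "pointerMove");
        move.put("duration", 0);          // milliseconds
        move.put("origin", "viewport");   // coordinates are relative to the viewport
        move.put("x", x);
        move.put("y", y);

        Map<String, Object> down = new LinkedHashMap<>();
        down.put("type", "pointerDown");
        down.put("button", 0);            // button 0 is the primary contact

        Map<String, Object> up = new LinkedHashMap<>();
        up.put("type", "pointerUp");
        up.put("button", 0);

        Map<String, Object> source = new LinkedHashMap<>();
        source.put("type", "pointer");
        source.put("id", "finger");
        source.put("parameters", Map.of("pointerType", "touch"));
        source.put("actions", List.of(move, down, up));
        return source;
    }
}
```

The order of the entries in the actions list is the order in which the server executes them, which is why the builder calls in the Java examples have to be added in the same order.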

How do we migrate?

If we access the documentation for Appium and navigate to the touch interactions section, we will discover brief examples for common interactions such as the single tap gesture:

                             
TouchActions action = new TouchActions(driver);
action.singleTap(element);
action.perform();

To migrate this method to be W3C compliant, we need to do the following:

locate the element we want to tap
create a PointerInput object that represents a touch input device
create a Sequence object that represents a sequence of pointer actions that will be executed
add a pointer action to the sequence that will move the pointer to the element we want to interact with
add a pointer action that simulates a touch down event on the element
add a pointer action that simulates a touch up event on the element
execute the sequence of pointer actions

Let's see how we can do this in Java.

Single tap code example

First, we will define a public method called performSingleTap():

                             
public void performSingleTap() {}

Next, we need to locate our element on the screen. We will use the getLocation() and getSize() methods from Selenium to calculate the center coordinates of the singleTapButton element:

                             
Point sourceLocation = singleTapButton.getLocation();
Dimension sourceSize = singleTapButton.getSize();
int centerX = sourceLocation.getX() + sourceSize.getWidth() / 2;
int centerY = sourceLocation.getY() + sourceSize.getHeight() / 2;
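Since every gesture in this post starts from an element's center, the same arithmetic can be pulled into a small helper. The sketch below is one possible way to do that; the GestureMath class and centerOf method are hypothetical names, and the helper takes plain ints so that it stays independent of Selenium.

```java
// Hypothetical helper class; not part of Selenium or Appium.
final class GestureMath {

    // Returns {centerX, centerY} for a rectangle described by its
    // top-left corner (x, y) and its width and height.
    // Integer division rounds down, exactly as in the inline version.
    static int[] centerOf(int x, int y, int width, int height) {
        return new int[] { x + width / 2, y + height / 2 };
    }
}
```

With an element at hand, it could be called as centerOf(location.getX(), location.getY(), size.getWidth(), size.getHeight()), where location and size come from getLocation() and getSize().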

Now we need our pointer input that will represent a touch input device:

                             
PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, "finger");

We will call our sequence object tap and associate it with the finger input device:

                             
Sequence tap = new Sequence(finger, 1);

What remains is to add the pointer actions. First, we move to our button:

                             
tap.addAction(finger.createPointerMove(Duration.ofMillis(0), PointerInput.Origin.viewport(), centerX, centerY));

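Then we add a touch down event on the button. For a touch pointer there is no real mouse button; the LEFT button value (0) simply stands for the finger making contact with the screen: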

                             
tap.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));

And one for the touch up event:

                             
tap.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));

Once all pointer actions are added, we can execute the tap sequence by calling perform() on the driver object. This method is declared in Selenium's Interactive interface, which RemoteWebDriver (and therefore AppiumDriver) implements; the plain WebDriver interface does not declare it. The List.of(tap) call creates a list containing the single Sequence object to be executed:

                             
driver.perform(List.of(tap));
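Putting the steps above together, the whole method might look roughly like the sketch below. It assumes the Selenium 4 Java bindings are on the classpath; the class name SingleTapSketch, the buildTap helper, and the choice to pass the driver and element in as parameters are illustrative assumptions rather than the layout used in the original project.

```java
import java.time.Duration;
import java.util.List;

import org.openqa.selenium.Dimension;
import org.openqa.selenium.Point;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.interactions.Interactive;
import org.openqa.selenium.interactions.PointerInput;
import org.openqa.selenium.interactions.Sequence;

// Hypothetical wrapper class for illustration.
final class SingleTapSketch {

    // Builds the move -> down -> up sequence for a tap at viewport coordinates (x, y).
    static Sequence buildTap(int x, int y) {
        PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, "finger");
        Sequence tap = new Sequence(finger, 1);
        tap.addAction(finger.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), x, y));
        tap.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
        tap.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
        return tap;
    }

    // Taps the center of the given element. The driver only needs to implement
    // Interactive, which RemoteWebDriver-based drivers do.
    static void performSingleTap(Interactive driver, WebElement element) {
        Point location = element.getLocation();
        Dimension size = element.getSize();
        int centerX = location.getX() + size.getWidth() / 2;
        int centerY = location.getY() + size.getHeight() / 2;
        driver.perform(List.of(buildTap(centerX, centerY)));
    }
}
```

Keeping the sequence construction separate from the perform() call is a design choice for this sketch: the sequence can be built and inspected without a running Appium session.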

Double tap code example

We have just seen the single tap example, and it can be inferred that the double tap example will be quite similar. The only difference, aside from repeating the single tap actions twice, is the inclusion of brief pauses between the pointer up and down actions.

When a user performs a double tap on a touch screen, there is usually a slight delay between the first tap and the second tap. The inclusion of small pauses between the pointer up and down actions in the code mimics this delay. Without these pauses, the double tap action may be executed too quickly, potentially leading to unexpected behavior or errors in the tested application.

                             
doubleTap.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
doubleTap.addAction(new Pause(finger, Duration.ofMillis(100)));
doubleTap.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
doubleTap.addAction(new Pause(finger, Duration.ofMillis(50)));
doubleTap.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
doubleTap.addAction(new Pause(finger, Duration.ofMillis(100)));
doubleTap.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
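The snippet above shows only the down/up part of the gesture. A fuller sketch, including the initial move to the target and the final perform() call, could look like the following. It assumes the Selenium 4 Java bindings are available; DoubleTapSketch and its method names are hypothetical, and the 100 ms and 50 ms pauses are simply the values used above, not required constants.

```java
import java.time.Duration;
import java.util.List;

import org.openqa.selenium.interactions.Interactive;
import org.openqa.selenium.interactions.Pause;
import org.openqa.selenium.interactions.PointerInput;
import org.openqa.selenium.interactions.Sequence;

// Hypothetical wrapper class for illustration.
final class DoubleTapSketch {

    // Builds: move, then two down/pause/up taps separated by a short pause.
    static Sequence buildDoubleTap(int x, int y) {
        PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, "finger");
        Sequence doubleTap = new Sequence(finger, 1);
        doubleTap.addAction(finger.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), x, y));
        for (int tap = 0; tap < 2; tap++) {
            doubleTap.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
            doubleTap.addAction(new Pause(finger, Duration.ofMillis(100))); // finger held down
            doubleTap.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
            if (tap == 0) {
                // Gap between the first and the second tap.
                doubleTap.addAction(new Pause(finger, Duration.ofMillis(50)));
            }
        }
        return doubleTap;
    }

    // Executes the double tap at viewport coordinates (x, y).
    static void performDoubleTap(Interactive driver, int x, int y) {
        driver.perform(List.of(buildDoubleTap(x, y)));
    }
}
```

If the application under test does not register the gesture as a double tap, the pause durations are the first values worth adjusting.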

Drag & Drop code example

Performing a drag-and-drop action is not that much different from the example above. This time, we need to locate the element we want to drag, and we also need to locate the destination area.

To do that, we can reuse the code from the above examples and calculate the center coordinates of the source and target elements.

This time, the action sequence is as follows: we move the pointer to the source element, perform a touch-down event on it, move the pointer (and the element with it) to the target location, and finally perform a touch-up event:

                             
dragNDrop.addAction(finger.createPointerMove(Duration.ofMillis(0), PointerInput.Origin.viewport(), centerX, centerY));
dragNDrop.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
dragNDrop.addAction(finger.createPointerMove(Duration.ofMillis(700), PointerInput.Origin.viewport(), centerX2, centerY2));
dragNDrop.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));

Scroll/Swipe code example

To perform a scroll/swipe action on the screen we first need to calculate the starting position. For our example we will use the middle of the screen:

                             
int startX = driver.manage().window().getSize().getWidth() / 2;
int startY = driver.manage().window().getSize().getHeight() / 2;

And we also need to calculate the ending position of the scroll/swipe. Since we will perform a vertical swipe we will only calculate the vertical coordinate. We will set it to be 20% of the screen height from the top:

                             
int endY = (int) (driver.manage().window().getSize().getHeight() * 0.2);
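The coordinate arithmetic for the swipe can also be isolated in a small, Selenium-free helper, which makes it easy to reuse with a different end position. This is only a sketch: SwipeMath and verticalSwipe are hypothetical names, and the screen width and height are passed in as plain ints (for example, the values returned by driver.manage().window().getSize()).

```java
// Hypothetical helper class; not part of Selenium or Appium.
final class SwipeMath {

    // Returns {startX, startY, endX, endY} for a vertical swipe that starts in
    // the middle of the screen and ends at endFraction of the screen height,
    // measured from the top (0.2 means 20% from the top).
    static int[] verticalSwipe(int screenWidth, int screenHeight, double endFraction) {
        if (endFraction < 0.0 || endFraction > 1.0) {
            throw new IllegalArgumentException("endFraction must be between 0 and 1");
        }
        int startX = screenWidth / 2;
        int startY = screenHeight / 2;
        int endY = (int) (screenHeight * endFraction);
        // The x coordinate does not change during a vertical swipe.
        return new int[] { startX, startY, startX, endY };
    }
}
```

An end fraction smaller than 0.5 moves the finger upward from the center, while a value larger than 0.5 moves it downward.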

The pointer input and sequence objects are pretty much the same as before:

                             
PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, "finger");
Sequence scroll = new Sequence(finger, 0);

The actions for this method are as follows: move the pointer to the starting position, simulate a touch-down event on the screen, move the pointer from the starting position to the ending position over 600 milliseconds, and perform a touch-up event:

                             
scroll.addAction(finger.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), startX, startY));
scroll.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
scroll.addAction(finger.createPointerMove(Duration.ofMillis(600), PointerInput.Origin.viewport(), startX, endY));
scroll.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
driver.perform(List.of(scroll));

How to test it out?

To test the functionality of the methods, I created a small Android application that is available alongside the code examples. The application has four buttons:

If you tap on the 'single tap' button, the text at the bottom of the main activity will be updated
If you double tap on the 'double tap' button, the text at the bottom of the main activity will be updated with a different text
If you tap on the 'drag and drop' button, you will be navigated to a new screen where you can drag an element to a white box to update the text at the bottom
If you tap the 'scroll' button, you will be directed to a new screen that features a scrollable list where you can verify if an element is visible on the screen or not.

[Figure: Android application for testing the interactions]

You can test this out on any app you have; this one is just in case you are looking for an app that covers all the examples above. 🙂

To sum up: to ensure better compatibility and standardization across platforms, the W3C Actions API has been adopted as the standard way of performing user interactions in Selenium WebDriver, and the TouchActions API has been deprecated in its favor.

As mentioned previously, you can find the code examples, tests, and Android app on our GitHub repository. Have fun!