The Green Report | WebdriverIO v9 with WebDriver BiDi

WebdriverIO v9 with WebDriver BiDi

Oct 6th 2024

With the release of WebdriverIO v9, browser automation takes a major leap forward thanks to the introduction of WebDriver BiDi (Bidirectional Protocol). This groundbreaking feature transforms how automation scripts interact with browsers, enabling real-time, two-way communication and opening up new possibilities for more dynamic and responsive testing.

Traditionally, WebDriver has used a request-response model: the script sends a command and waits for the reply, so the browser has no way to notify the script on its own. This limits the control and speed of certain browser interactions. With WebDriver BiDi, WebdriverIO users can subscribe to real-time events such as network requests, console logs, and DOM changes without the delays and overhead of constantly polling the browser. It also provides better support for automating modern browser features, such as shadow DOM and service workers, and for capturing client-side errors.

Why WebDriver BiDi is a Game-Changer for Automation

WebDriver BiDi introduces a new era in browser automation by fundamentally changing how automation tools communicate with browsers. Unlike the traditional WebDriver protocol, which relies on a one-way request-response pattern, WebDriver BiDi allows for two-way, real-time communication. This key enhancement opens up several powerful benefits:

Cross-Browser Consistency: One of the biggest challenges in browser automation is ensuring consistent behavior across different browsers. WebDriver BiDi aims to streamline cross-browser testing by providing a single, standardized protocol intended for all major browser engines (Chromium, Firefox, and WebKit), although the level of support still varies between browsers. As that support matures, automation scripts can interact more reliably across browsers and need fewer browser-specific workarounds.
Real-Time Event Listening: With WebDriver BiDi, automation scripts can listen for browser events such as console logs, network requests, and DOM changes in real-time. This real-time monitoring allows for more dynamic tests, where scripts can respond immediately to changes in the browser environment, making it ideal for debugging or complex, interactive web applications.
Advanced Debugging Capabilities: The two-way communication of WebDriver BiDi enhances debugging by providing direct access to the browser's internal state. Automation scripts can capture logs, track performance, and monitor network traffic on the fly, giving testers deeper insight into issues without needing third-party tools.
Improved Performance: By reducing the overhead of repeated requests and allowing scripts to receive data from the browser as it happens, WebDriver BiDi makes automation tests faster and more efficient. This is particularly beneficial for testing large or complex applications that rely on real-time interactions.
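
To make the difference between the two models concrete, here is a minimal sketch of how a client might route BiDi-style messages: responses are matched to the command that caused them, while events arrive unprompted and are handed to listeners. The `createBidiRouter` helper and the fake transport are assumptions for illustration and are not WebdriverIO's implementation; the message shapes (`id`, `method`, `params`, and a `type` of `success` or `event`) follow our reading of the WebDriver BiDi specification.

```javascript
// Sketch of a BiDi-style client. `createBidiRouter` and `send` are
// hypothetical names used only for this illustration.
function createBidiRouter(send) {
  let nextId = 0;
  const pending = new Map();   // command id -> { resolve, reject }
  const listeners = new Map(); // event name -> array of callbacks

  return {
    // Send a command and return a promise for its response.
    command(method, params = {}) {
      const id = ++nextId;
      const result = new Promise((resolve, reject) => pending.set(id, { resolve, reject }));
      send(JSON.stringify({ id, method, params }));
      return result;
    },
    // Register a callback for an event such as "log.entryAdded".
    on(eventName, callback) {
      if (!listeners.has(eventName)) listeners.set(eventName, []);
      listeners.get(eventName).push(callback);
    },
    // Feed every raw message received from the browser into this method.
    receive(raw) {
      const message = JSON.parse(raw);
      if (message.type === 'event') {
        for (const callback of listeners.get(message.method) ?? []) callback(message.params);
        return;
      }
      const entry = pending.get(message.id);
      if (!entry) return;
      pending.delete(message.id);
      if (message.type === 'success') entry.resolve(message.result);
      else entry.reject(new Error(message.error));
    },
  };
}

// Fake transport that only records what the client sends.
const sent = [];
const router = createBidiRouter((raw) => sent.push(JSON.parse(raw)));
const consoleTexts = [];
router.on('log.entryAdded', (entry) => consoleTexts.push(entry.text));
const subscribed = router.command('session.subscribe', { events: ['log.entryAdded'] });

// Simulate the browser: acknowledge the command, then push an event unprompted.
router.receive(JSON.stringify({ type: 'success', id: 1, result: {} }));
router.receive(JSON.stringify({ type: 'event', method: 'log.entryAdded', params: { text: 'hello' } }));
```

In a real WebdriverIO v9 session the framework handles this routing itself, so test code only deals with commands and listeners. The point of the sketch is that the second message reaches the script without the script having asked for it.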

BiDi-Enhanced Features in WebdriverIO v9

WebDriver BiDi powers several advanced features in WebdriverIO v9, enhancing the overall automation experience with real-time browser control and better cross-browser support. Let's explore some of the key BiDi-powered features that bring new capabilities to our automation scripts.

1. Custom Headers in URL Navigation

The url command has been expanded to allow passing custom headers during navigation. This can be especially useful when simulating user sessions or logging in automatically by setting authentication tokens.

                
await browser.url('https://example.com', {
  headers: {
    Authorization: 'Bearer your_token_here'
  }
});

In this example, the Authorization header is passed to the browser request, making it easier to automate tasks such as user authentication or session management.

2. Basic Authentication Automation

WebdriverIO v9 simplifies basic authentication by allowing credentials to be passed directly into the url command.

                
await browser.url('https://example.com/protected_page', {
    auth: {
        user: 'testUser',
        pass: 'testPassword'
    }
});
await expect($('h1=Welcome to the protected page')).toBeDisplayed();

With just a few lines, we can now automate scenarios that require basic authentication, reducing the need for manual login steps.
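
For readers less familiar with HTTP Basic authentication, the sketch below shows what the credentials look like once they are encoded for an `Authorization` header. How WebdriverIO delivers the credentials internally is not covered here; `basicAuthHeader` is a hypothetical helper of ours, and the encoding itself follows RFC 7617.

```javascript
// Build the value of an HTTP Basic `Authorization` header.
// `basicAuthHeader` is a hypothetical helper, not part of WebdriverIO.
function basicAuthHeader(user, pass) {
  // Basic auth is simply "user:pass" encoded as base64
  const credentials = Buffer.from(`${user}:${pass}`, 'utf8').toString('base64');
  return `Basic ${credentials}`;
}

const header = basicAuthHeader('testUser', 'testPassword');
console.log(header);
```

Because base64 is an encoding and not encryption, test credentials like these should only ever be sent over HTTPS.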

3. Improved Argument Serialization

Another enhancement driven by WebDriver BiDi is the improved argument serialization, which now enables more complex data types and structures to be passed between our automation scripts and the browser. Previously, passing objects, functions, or certain nested structures to browser-side JavaScript functions often led to limitations or required workarounds. WebDriver BiDi removes these barriers, allowing us to pass complex arguments more reliably.

                
const userInfo = {
  name: "Jane Doe",
  age: 28,
  address: { city: "Berlin", country: "Germany" },
};
                  
const info = await browser.execute(
  (user) =>
    `User: ${user.name}, Location: ${user.address.city}, ${user.address.country}`,
  userInfo
);
console.log(info);
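
A plain nested object like the one above also survives JSON-based serialization, so the gain is easier to see with richer types. The following is a simplified sketch of a typed serializer in the style of BiDi's value format, where each value carries an explicit `type` tag. The `toLocalValue` function is our own illustration, covers only a handful of types, and is not WebdriverIO's implementation.

```javascript
// Simplified, BiDi-style typed serializer (illustration only).
function toLocalValue(value) {
  if (value === null) return { type: 'null' };
  if (value === undefined) return { type: 'undefined' };
  if (typeof value === 'string') return { type: 'string', value };
  if (typeof value === 'number') return { type: 'number', value };
  if (typeof value === 'boolean') return { type: 'boolean', value };
  if (value instanceof Date) return { type: 'date', value: value.toISOString() };
  if (Array.isArray(value)) return { type: 'array', value: value.map(toLocalValue) };
  if (value instanceof Map) {
    // String keys stay plain; other keys are serialized like any value
    const entries = [...value].map(([k, v]) => [
      typeof k === 'string' ? k : toLocalValue(k),
      toLocalValue(v),
    ]);
    return { type: 'map', value: entries };
  }
  if (typeof value === 'object') {
    const entries = Object.entries(value).map(([k, v]) => [k, toLocalValue(v)]);
    return { type: 'object', value: entries };
  }
  throw new TypeError(`Unsupported value type: ${typeof value}`);
}

const userInfo = {
  name: 'Jane Doe',
  age: 28,
  address: { city: 'Berlin', country: 'Germany' },
};
const serialized = toLocalValue(userInfo);

// JSON drops the contents of a Map, while the typed format keeps them
const lossy = JSON.stringify(new Map([['city', 'Berlin']]));
const typed = toLocalValue(new Map([['city', 'Berlin']]));
console.log(lossy, JSON.stringify(typed));
```

The explicit type tags are what allow the receiving side to rebuild a `Map` or a `Date` as the original type, where JSON would have flattened it to a plain object or a string.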

4. Running Initialization Scripts

We can now inject JavaScript into the page before it fully loads using the onBeforeLoad parameter in the url command. This is useful for manipulating Web APIs or mocking data before the application interacts with them.

                
await browser.url('https://mywebsite.com/location-page', {
  onBeforeLoad(win) {
    // Replace the real API with a stub that always reports a fixed position
    win.navigator.geolocation.getCurrentPosition = (successCallback) => {
      const position = {
        coords: {
          latitude: 40.7128,
          longitude: -74.006,
          accuracy: 100,
        }
      };
      successCallback(position);
    };
  }
});

// Assumes the page prints the raw numbers; a value of -74.0060 would render as -74.006
await expect($('.location-lat')).toHaveText('Latitude: 40.7128');
await expect($('.location-lon')).toHaveText('Longitude: -74.006');

In this example, the geolocation.getCurrentPosition API is mocked to return a fixed location (the latitude and longitude of New York City).
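
The same kind of stub can be tried out without a browser by applying it to a stand-in object. This is a minimal sketch: `installFakeGeolocation` and `fakeWindow` are hypothetical names, and the plain object only mimics the small part of `window` that the stub touches.

```javascript
// Hypothetical helper that installs a geolocation stub on any window-like object.
function installFakeGeolocation(win, coords) {
  win.navigator.geolocation.getCurrentPosition = (successCallback) => {
    successCallback({ coords, timestamp: Date.now() });
  };
}

// Stand-in for the page's window object (an assumption for this sketch)
const fakeWindow = { navigator: { geolocation: {} } };
installFakeGeolocation(fakeWindow, { latitude: 40.7128, longitude: -74.006, accuracy: 100 });

// Application code would call the API exactly as it does with the real one
let received;
fakeWindow.navigator.geolocation.getCurrentPosition((position) => {
  received = position.coords;
});
console.log(received.latitude, received.longitude);
```

One caveat, as far as we understand it: a function passed to onBeforeLoad runs inside the browser, so it should be self-contained and not rely on variables from the surrounding test file the way this local helper relies on its `coords` argument.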

5. Add Initialization Script to All Contexts

The new addInitScript command allows us to inject a script that runs every time a new context is created, such as when a new iframe is loaded or a navigation occurs.

                
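// WebdriverIO passes an emit callback as the last argument of the init script.
// Values handed to emit() are delivered to the 'data' listener below.
const script = await browser.addInitScript((emit) => {
  document.addEventListener('click', (event) => {
    const clickedElement = event.target;
    emit(`${clickedElement.tagName}, ID: ${clickedElement.id}`);
  });
});

// Runs in the test process for every click reported from any context
script.on('data', (data) => {
  console.log(`User clicked: ${data}`); // logs clicked element information
});

In this example, the script listens for clicks on any element within the page and uses emit to send the tag name and ID of the clicked element back to the test, where the data listener logs them. A plain console.log inside the init script would only write to the browser console and would never reach the data listener.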

6. Cross-Browser Request Mocking

Previously limited to Chromium browsers, request mocking now works in every browser that supports WebDriver BiDi. This makes it easier to modify requests and responses.

                
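// Intercept every request whose URL matches the pattern
const mock = await browser.mock("**/products/**");

// Reply with a custom body; status code and headers go in the second argument
mock.respond(
  {
    products: [
      { id: 1, name: "Custom Product 1", price: "100" },
      { id: 2, name: "Custom Product 2", price: "150" },
    ],
  },
  {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
  }
);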
                  
// Assert that the mocked products are displayed correctly
await browser.url("https://example.com/shop");
await expect($(".product-1")).toHaveText("Custom Product 1");
await expect($(".product-2")).toHaveText("Custom Product 2");

In this code example, a mock is set up to intercept requests whose URL matches the **/products/** pattern and to respond with a custom list of products. The test then checks that the mocked product names are displayed on the page.

7. Shadow DOM Handling and Automatic Shadow Root Piercing

WebdriverIO v9 introduces seamless handling of Shadow DOM, including both open and closed shadow roots. We can now easily automate interactions with elements nested inside Shadow DOM structures.

                
<div class="app">
  <custom-button>
    <shadow-root>
      <button>Click me</button>
    </shadow-root>
  </custom-button>
  <div class="notification"></div>
</div>

With automatic shadow root piercing, interacting with shadow DOM elements becomes as easy as interacting with regular DOM elements, significantly improving test reliability.

                
const shadowButton = await browser.$('custom-button button');
await shadowButton.click();
await expect(browser.$('.notification')).toHaveText('Button clicked!');
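
To illustrate what "piercing" means, the sketch below searches a toy element tree made of plain objects, mirroring the markup above. A search that follows only `children` stops at the shadow boundary, while one that also descends into `shadowRoot` finds the button. The `findDeep` function and the object shapes are assumptions for illustration and do not represent WebdriverIO's actual lookup algorithm.

```javascript
// Depth-first search that also descends into shadow roots (illustration only).
function findDeep(node, predicate) {
  if (predicate(node)) return node;
  // Visit regular children first, then the shadow root if there is one
  const branches = [...(node.children ?? [])];
  if (node.shadowRoot) branches.push(node.shadowRoot);
  for (const branch of branches) {
    const match = findDeep(branch, predicate);
    if (match) return match;
  }
  return null;
}

// Search that ignores shadow roots, as a regular DOM query would
function findLight(node, predicate) {
  if (predicate(node)) return node;
  for (const child of node.children ?? []) {
    const match = findLight(child, predicate);
    if (match) return match;
  }
  return null;
}

// Toy version of the markup above, built from plain objects
const app = {
  tag: 'div',
  children: [
    {
      tag: 'custom-button',
      children: [],
      shadowRoot: { tag: '#shadow-root', children: [{ tag: 'button', text: 'Click me' }] },
    },
    { tag: 'div', className: 'notification', children: [] },
  ],
};

const isButton = (node) => node.tag === 'button';
const lightResult = findLight(app, isButton);  // stops at the shadow boundary
const piercedResult = findDeep(app, isButton); // reaches inside the shadow root
console.log(lightResult, piercedResult.text);
```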

8. Viewport Control and Mobile Emulation

In WebdriverIO v9, the setViewport and emulate commands make it easier to simulate mobile devices and control the viewport without affecting the browser window size.

                
await browser.emulate('device', 'iPhone 14 Pro Max');
const width = await browser.execute(() => window.innerWidth);
console.log(width); // outputs the viewport width for iPhone 14 Pro Max

This command gives us a straightforward way to check that our application renders correctly on mobile devices and to test mobile layouts.

9. Fake Timers for Clock Control

With the clock option in the emulate command, we can now simulate different times and control how time progresses in the browser.

                
const clock = await browser.emulate("clock", { now: new Date(2020, 0, 1) });
console.log(await browser.execute(() => new Date().getFullYear())); // outputs: 2020
await clock.tick(7200000); // advances time by two hours
console.log(await browser.execute(() => new Date().getHours())); // outputs: 2

This is useful for testing time-sensitive functionalities, such as countdowns or calendar applications.
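
The idea behind fake timers can be shown with a very small model: time stands still until the test advances it, and any timers that fall due in that window fire in order. The `FakeClock` class below is a hypothetical sketch of that idea and is not the implementation WebdriverIO uses in the browser.

```javascript
// Minimal fake clock: time only moves when tick() is called.
class FakeClock {
  constructor(startMs) {
    this.now = startMs;
    this.timers = []; // pending { fireAt, callback } entries
  }

  setTimeout(callback, delayMs) {
    this.timers.push({ fireAt: this.now + delayMs, callback });
  }

  tick(ms) {
    const target = this.now + ms;
    // Fire due timers in chronological order, moving `now` to each one
    for (;;) {
      const due = this.timers
        .filter((timer) => timer.fireAt <= target)
        .sort((a, b) => a.fireAt - b.fireAt)[0];
      if (!due) break;
      this.timers.splice(this.timers.indexOf(due), 1);
      this.now = due.fireAt;
      due.callback();
    }
    this.now = target;
  }
}

const HOUR = 60 * 60 * 1000;
const fakeClock = new FakeClock(Date.UTC(2020, 0, 1)); // midnight UTC
const fired = [];
fakeClock.setTimeout(() => fired.push('after one hour'), 1 * HOUR);
fakeClock.setTimeout(() => fired.push('after three hours'), 3 * HOUR);

fakeClock.tick(2 * HOUR); // advance two hours; only the first timer is due
const hoursAfterTick = new Date(fakeClock.now).getUTCHours();
console.log(fired, hoursAfterTick);
```

The sketch uses UTC so that the result does not depend on the machine's time zone. The WebdriverIO example above uses local time, so its output assumes the test process and the browser share a time zone.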

Looking Ahead: The Future of WebdriverIO with BiDi

The integration of WebDriver BiDi support into WebdriverIO heralds a transformative phase for this powerful testing framework. As we look ahead, several exciting possibilities emerge that promise to enhance the capabilities of WebdriverIO and streamline the automation testing process.

Unified Cross-Browser Testing: WebDriver BiDi simplifies cross-browser automation by offering one standardized protocol intended for all major browser engines (Chromium, Firefox, and WebKit). As browser support matures, this consistency should lead to more reliable test behavior across browsers and less reliance on browser-specific adjustments.
Real-Time Event Handling: With the capability to listen for live browser events such as console logs, network requests, and DOM updates, BiDi allows for dynamic testing workflows. Scripts can react instantly to changes in the browser environment, enabling more interactive and responsive test scenarios, particularly in debugging and complex web applications.
Enhanced Debugging Tools: The bidirectional nature of BiDi equips testers with deeper access to the browser's internal workings. Automation scripts can directly capture logs, monitor network traffic, and track performance in real-time, offering a more thorough understanding of issues without the need for external debugging tools.
Optimized Performance: By facilitating real-time data transfer from the browser and minimizing the back-and-forth of request handling, WebDriver BiDi accelerates test execution. This performance boost is particularly advantageous for large or dynamic applications that depend on immediate interactions.

Conclusion

The introduction of WebDriver BiDi in WebdriverIO v9 represents a significant advancement in browser automation, enabling more efficient, reliable, and responsive testing processes. With features like cross-browser consistency, real-time event handling, and enhanced debugging capabilities, BiDi empowers testers to create more dynamic and accurate test scenarios. As automation continues to evolve, embracing these new capabilities will be crucial for maintaining high-quality applications.

While this release brings exciting possibilities, it is essential to acknowledge that, like any major update, it may contain bugs or unintended issues. We encourage all testers to actively explore the new features and report any errors or inconsistencies they encounter to the WebdriverIO team. Your feedback is invaluable in helping improve the tool and ensuring it meets the needs of the testing community.