The Green Report | Flipping the Switch: Automating Feature Flag Testing

Flipping the Switch: Automating Feature Flag Testing

Dec 15th 2024 12 min read

medium

functional

Feature flags have become a powerful tool for enabling controlled feature rollouts, A/B testing, and experimentation. However, for QA automation engineers, features hidden behind these flags present unique challenges. How do you ensure comprehensive test coverage for functionality that might be disabled in one environment and active in another? In this blog post, we'll explore strategies for automating the testing of features behind feature flags, covering toggle-aware test setups, testing both enabled and disabled states, and programmatically managing flags to streamline our testing efforts.

What Are Feature Flags, and Why Do They Matter?

Feature flags, also known as feature toggles, let teams control the activation of specific features without deploying new code. By toggling features on or off dynamically, developers can introduce new functionality incrementally, conduct A/B tests, or manage experimental features across different environments.

At their core, feature flags function as conditionals within the codebase. For example, a feature flag might dictate whether users see a new homepage design or continue to interact with the existing one. These flags are often controlled via configuration files, APIs, or feature management platforms like LaunchDarkly or Flagsmith, making them highly versatile and easy to manage.
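
As a minimal sketch of the "flag as conditional" idea, the snippet below branches on a flag value to pick a homepage variant. The `flags` object, the `new_homepage` flag name, and the `chooseHomepage` function are hypothetical names used only for illustration; a real application would typically read the flag from a configuration file or a feature management platform.

```javascript
// Hypothetical in-memory flag store; real flags usually come from config or a flag service.
const flags = { new_homepage: false };

// Returns which homepage variant to render, based on the flag's current value.
function chooseHomepage(currentFlags) {
  return currentFlags.new_homepage ? 'new-homepage' : 'classic-homepage';
}

console.log(chooseHomepage(flags)); // flag off: the existing design
console.log(chooseHomepage({ ...flags, new_homepage: true })); // flag on: the new design
```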

Comparison of application functionality with a feature flag OFF (left) versus ON (right)

Why Do Feature Flags Matter?

Controlled Rollouts: Feature flags allow teams to release features to specific user groups or environments, reducing the risk of widespread failures.
A/B Testing: They make it possible to test multiple versions of a feature and measure user response before committing to a final implementation.
Rapid Experimentation: Developers can experiment with new ideas while maintaining the stability of the production environment.
Safe Rollbacks: If a feature causes issues, it can be turned off instantly without the need for a new deployment.

While feature flags are beneficial for development, they introduce complexity for QA teams. Testers must ensure that features behave as expected both when the flag is enabled and disabled. Additionally, feature flags can impact test reliability if their states aren't consistent across environments. Automation engineers, in particular, face the challenge of writing tests that adapt to flag status dynamically while ensuring complete coverage.

Setting Up Toggle-Aware Tests: Know When to Test

In a world of feature flags, blindly running automation tests without accounting for the flag's state can lead to inconsistent results and wasted test execution time. To tackle this, toggle-aware tests are essential. These tests dynamically adapt based on whether a feature is enabled or disabled, ensuring reliable results and focused coverage.

What Are Toggle-Aware Tests?

Toggle-aware tests are designed to check the status of a feature flag before executing a test. If the feature is enabled, the test proceeds to validate the new functionality. If the feature is disabled, the test either skips or verifies that the feature is absent.

This approach ensures that our tests stay relevant to the application's current state, saving resources and providing actionable insights.

Implementing Toggle-Aware Tests

Here's how to set up toggle-aware tests in a JavaScript testing framework like Playwright.

Step 1: Retrieve the Feature Flag Status

Use an API call, environment configuration, or mock data to determine whether the feature flag is active.

                
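async function getFeatureFlag(flagName) {
  // Placeholder endpoint; substitute your flag service's URL
  const url = `https://feature-flags/status?flag=${encodeURIComponent(flagName)}`;
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Failed to read feature flag: ${flagName}`);
  }
  const data = await response.json();
  return data.enabled; // Returns true or false
}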
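
Step 1 also mentions environment configuration as a source for the flag's status. Here is a minimal sketch of that variant, assuming flags are exposed as environment variables named `FEATURE_<FLAG_NAME>` with the values `true` or `false`. Both this naming convention and the `getFeatureFlagFromEnv` helper are assumptions for illustration, not part of any particular tool.

```javascript
// Reads a flag from environment variables, e.g. FEATURE_NEW_FEATURE=true.
// The FEATURE_ prefix and upper-casing are an assumed convention.
function getFeatureFlagFromEnv(flagName, env = process.env) {
  const key = `FEATURE_${flagName.toUpperCase()}`;
  const raw = env[key];
  if (raw === undefined) {
    return false; // Treat an unset flag as disabled
  }
  return raw.trim().toLowerCase() === 'true';
}
```

Treating an unset variable as disabled is a design choice; throwing an error instead may be preferable if a missing flag should fail the run loudly.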

Step 2: Conditional Test Execution

Once we have the flag's status, we dynamically decide whether to run the test.

                
const { test, expect } = require('@playwright/test');

test('New Feature Test', async ({ page }) => {
  const featureEnabled = await getFeatureFlag('new_feature');
  // Skip (and report why) when the flag is off; the rest of the test does not run
  test.skip(!featureEnabled, 'Feature is disabled, skipping the test.');

  await page.goto('https://example.com/new-feature');
  await page.getByTestId('new-feature-button').click();
  const message = await page.locator('#feature-result').textContent();
  expect(message).toBe('Feature works!');
});

Advantages of Toggle-Aware Tests

Reduced Test Failures: Prevents spurious failures caused by running tests for disabled features.
Efficient Test Execution: Focuses resources on valid test cases, skipping unnecessary ones.
Clear Reporting: Indicates whether a test was skipped due to the feature's status, improving traceability.

When to Use Toggle-Aware Tests

Toggle-aware tests are particularly useful in the following scenarios:

Dynamic Testing Environments: When feature flags are toggled frequently across environments (e.g., staging, production).
Partial Rollouts: When a feature is only enabled for specific user groups or regions.
Beta Testing: When testing experimental features that are not yet fully rolled out.

Two Sides of the Toggle: Testing Both Scenarios

Feature flags inherently create two distinct states for a feature: enabled and disabled. Thorough testing requires covering both scenarios to ensure the application behaves as expected regardless of the flag's state. This is critical for catching regressions, ensuring seamless functionality, and verifying that the feature is properly hidden when inactive.

Why Test Both Scenarios?

Feature Enabled: When the feature is active, tests should validate its functionality, interactions, and integration with other components.
Feature Disabled: When the feature is off, tests should ensure that the application remains stable, the feature is hidden or inaccessible, and there are no side effects.

To test both scenarios effectively, we can take one of two approaches:

Option 1: Parameterized Tests

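Parameterized tests allow us to dynamically toggle the flag's state during test execution, reducing redundancy in our code. The setFeatureFlag helper used here is defined later, in the section on automating feature flag updates. If the flag is shared server-side state, these tests should not run in parallel against the same environment, because one test's toggle would change the state another test is relying on.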

                
const { test, expect } = require("@playwright/test");

const scenarios = [
  { state: "enabled", flagValue: true },
  { state: "disabled", flagValue: false },
];
                    
scenarios.forEach(({ state, flagValue }) => {
  test(`Feature ${state}: Validate behavior`, async ({ page }) => {
    // Mock or programmatically set the feature flag
    await setFeatureFlag("new_feature", flagValue);
                    
    await page.goto("https://example.com");
                    
    if (flagValue) {
      // Validate functionality when the feature is enabled
      await page.getByTestId("new-feature-button").click();
      const result = await page.locator("#feature-result").textContent();
      expect(result).toBe("Feature works!");
    } else {
      // Validate absence or stability when the feature is disabled
      await expect(page.getByTestId("new-feature-button")).not.toBeVisible();
    }
  });
});
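
When a test depends on more than one flag, the scenario list can be generated rather than written by hand. The following is a rough sketch: `buildFlagScenarios` is a hypothetical helper, and `new_checkout` is a made-up second flag used only for illustration. The number of scenarios doubles with each flag added, so this is likely practical only for a handful of flags.

```javascript
// Builds every on/off combination for the given flag names.
function buildFlagScenarios(flagNames) {
  return flagNames.reduce(
    (scenarios, name) =>
      scenarios.flatMap((scenario) => [
        { ...scenario, [name]: true },
        { ...scenario, [name]: false },
      ]),
    [{}]
  );
}

// Two flags produce four combinations.
const flagScenarios = buildFlagScenarios(['new_feature', 'new_checkout']);
console.log(flagScenarios.length); // 4
```

Each generated object could then be looped over in the same way as the `scenarios` array above, setting every flag in the object before the test navigates to the page.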

Option 2: Separate Test Suites

Separate suites can be useful for environments where flag states are fixed (e.g., staging vs. production) or when we prefer distinct test runs.

                
const { test, expect } = require("@playwright/test");

test.describe("Feature Enabled", () => {
  test.beforeEach(async () => {
    await setFeatureFlag("new_feature", true);
  });
                      
  test("Validates new feature functionality", async ({ page }) => {
    await page.goto("https://example.com");
    await page.getByTestId("new-feature-button").click();
    const result = await page.locator("#feature-result").textContent();
    expect(result).toBe("Feature works!");
  });
});
                      
test.describe("Feature Disabled", () => {
  test.beforeEach(async () => {
    await setFeatureFlag("new_feature", false);
  });
                      
  test("Validates feature absence", async ({ page }) => {
    await page.goto("https://example.com");
    await expect(page.getByTestId("new-feature-button")).not.toBeVisible();
  });
});

Best Practices for Testing Both Scenarios:

Isolate Feature Flag Behavior: Ensure your tests only validate the impact of the feature flag without overlapping with other functionality.
Leverage Automation for Consistency: Use automation scripts to programmatically toggle feature flags in your test environment to ensure predictable conditions.
Use Assertions Generously: Validate both presence and absence of elements, functionality, and side effects in each scenario.

Automating Feature Flag Updates: Seamless Test Control

Manually toggling feature flags before running tests can be time-consuming and error-prone, especially when dealing with multiple environments or frequent test executions. Automating the process of managing feature flags ensures consistent test conditions and allows our automation suite to dynamically adjust as needed.

Why Automate Feature Flag Updates?

Consistency: Guarantees the desired flag state for every test execution.
Efficiency: Saves time by eliminating manual pre-test setup.
Scalability: Supports large test suites and multiple environments with minimal overhead.
Flexibility: Allows on-the-fly adjustments to feature flag states during test execution.

1. Use an API for Feature Management

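Many feature management tools provide APIs to programmatically update feature flags. Where such an API is available, this is usually the most direct method. The endpoint and payload below are placeholders; the real ones depend on the tool in use.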

                
async function setFeatureFlag(flagName, isEnabled) {
  const response = await fetch('https://feature-flags/update', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ flag: flagName, enabled: isEnabled })
  });
  if (!response.ok) {
    throw new Error(`Failed to update feature flag: ${flagName}`);
  }
}

2. Mock Feature Flags Locally

In some cases, mocking the feature flag's state locally in our test environment is a simpler alternative.

                
const { test, expect } = require('@playwright/test');

test('Test with mocked feature flag', async ({ page }) => {
  await page.addInitScript(() => {
    window.featureFlags = { new_feature: true }; // Mock feature flag
  });
                        
  await page.goto('https://example.com');
  await expect(page.getByTestId("new-feature-button")).toBeVisible();
});

This approach works well when the feature flag logic is implemented client-side and can be overridden during the test.
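
For the mock to have any effect, the application has to read its flags from the global that the test overrides. Here is a hypothetical sketch of what such a client-side check might look like. The `isFeatureEnabled` name and the shape of the `featureFlags` global are assumptions based on the mock above, not a known implementation.

```javascript
// Hypothetical client-side helper: reads flags from a global object such as window.
function isFeatureEnabled(flagName, globalObject = globalThis) {
  const flags = globalObject.featureFlags || {};
  return flags[flagName] === true;
}
```

If the application reassigns `window.featureFlags` during startup, the mocked value would be overwritten, so the mock is only reliable when the application reads the global rather than replacing it.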

3. Use Configuration Files

For environments where feature flags are controlled via configuration files, updating these files programmatically before tests run can ensure the correct state.

                
const fs = require('fs');
const { test } = require('@playwright/test');

function updateConfig(flagName, isEnabled) {
  const configPath = './config/feature-flags.json';
  const config = JSON.parse(fs.readFileSync(configPath, 'utf-8'));
  config[flagName] = isEnabled;
  fs.writeFileSync(configPath, JSON.stringify(config, null, 2));
}
                    
// Example usage in a test setup
test.beforeAll(() => {
  updateConfig('new_feature', true); // Enable the feature
});
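
Because this method edits a file on disk, it is worth restoring the original contents once the tests finish. Below is a minimal sketch of one way to do that, using a hypothetical `updateConfigWithRestore` variant that takes the config path as a parameter and returns a restore function. It assumes the application re-reads the file after it changes, which may require a restart depending on how the application loads its configuration.

```javascript
const fs = require('fs');

// Sets one flag in a JSON config file and returns a function that restores
// the file's original contents.
function updateConfigWithRestore(configPath, flagName, isEnabled) {
  const original = fs.readFileSync(configPath, 'utf-8');
  const config = JSON.parse(original);
  config[flagName] = isEnabled;
  fs.writeFileSync(configPath, JSON.stringify(config, null, 2));
  return () => fs.writeFileSync(configPath, original);
}
```

In a Playwright suite, the returned function could be kept in a variable set in `test.beforeAll` and called in `test.afterAll`.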

4. Updating Feature Flags via Local Storage

Some applications manage feature flags in the browser's local storage. In such cases, automation scripts can directly manipulate local storage to update a specific flag while preserving the state of others. This approach is quick and ensures minimal interference with the application state.

                
const { test, expect } = require('@playwright/test');

test('Update specific feature flag in local storage', async ({ page }) => {
  // Step 1: Navigate to the application
  await page.goto('https://example.com');
                    
  // Step 2: Retrieve and update the feature flag state
  await page.evaluate(() => {
    // Fetch all feature flags stored in local storage
    const flags = JSON.parse(localStorage.getItem('featureFlags') || '{}');
                    
    // Update the specific flag without altering others
    flags['new_feature'] = true; // Enable the 'new_feature' flag
                    
    // Save the updated flags back to local storage
    localStorage.setItem('featureFlags', JSON.stringify(flags));
  });
                    
  // Step 3: Reload the page to apply changes
  await page.reload();
                    
  // Step 4: Validate the updated flag's effect
  await expect(page.getByTestId("new-feature-button")).toBeVisible();
});
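
A possible alternative to the navigate, update, and reload sequence is to seed local storage before the application's scripts run, using Playwright's `page.addInitScript`. The sketch below assumes the same `featureFlags` local storage key as the example above; `seedFlagsInLocalStorage` is a hypothetical helper name.

```javascript
// Registers an init script that merges flag overrides into local storage
// before the application's own scripts run on each page load.
// `page` is expected to be a Playwright Page; `overrides` is a plain object.
async function seedFlagsInLocalStorage(page, overrides) {
  await page.addInitScript((flagOverrides) => {
    const stored = JSON.parse(localStorage.getItem('featureFlags') || '{}');
    localStorage.setItem(
      'featureFlags',
      JSON.stringify({ ...stored, ...flagOverrides })
    );
  }, overrides);
}
```

A test would call `await seedFlagsInLocalStorage(page, { new_feature: true })` before `page.goto(...)`. The init script runs on every navigation in that page, so the overrides are re-applied each time, which may not be wanted if the test also checks that the application itself changes the flag.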

Best Practices for Automating Feature Flags:

Centralize Flag Management: Create reusable utilities for managing feature flags across tests.
Handle Failures Gracefully: Include error handling in your automation scripts to ensure tests don't proceed with incorrect flag states.
Document Flag Dependencies: Clearly document which flags are required for each test to avoid confusion.
Clean Up After Tests: Reset feature flags to their default state to maintain consistency for subsequent tests.
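
The first and last practices above can be combined in one small utility. The following is a sketch under stated assumptions rather than a definitive implementation: `createFlagManager` is a hypothetical helper that wraps caller-supplied `getFlag` and `setFlag` functions, remembers each flag's original value the first time it is changed, and restores all of them on request.

```javascript
// Creates a small manager around caller-supplied getFlag/setFlag functions.
// It remembers each flag's original value the first time it is changed.
function createFlagManager({ getFlag, setFlag }) {
  const originals = new Map();

  return {
    // Records the original value (once per flag), then applies the new one.
    async set(flagName, value) {
      if (!originals.has(flagName)) {
        originals.set(flagName, await getFlag(flagName));
      }
      await setFlag(flagName, value);
    },
    // Puts every changed flag back to the value it had before the tests ran.
    async restoreAll() {
      for (const [flagName, value] of originals) {
        await setFlag(flagName, value);
      }
      originals.clear();
    },
  };
}
```

With the helpers from the earlier sections, this might be wired up as `createFlagManager({ getFlag: getFeatureFlag, setFlag: setFeatureFlag })`, calling `set` in `test.beforeEach` and `restoreAll` in `test.afterEach`.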

Conclusion

Feature flags are powerful tools for managing feature rollouts, experimentation, and application behavior. Incorporating feature flag awareness into our test automation ensures comprehensive coverage, reduces risks, and streamlines the testing process.

By setting up toggle-aware tests, covering both enabled and disabled scenarios, and automating flag updates through APIs, mocks, configuration files, or local storage, we can create a robust and efficient testing strategy. With these practices, feature flags become an asset, not an obstacle, in delivering high-quality software.