I recently came across the details of Cloudflare's September outage and couldn't help but think about how automated testing might have caught the issue before it reached production. There are several approaches that could help, including frontend performance monitoring, load testing the dashboard, and validating API call patterns. In this post, we'll focus on the second of these: load testing the dashboard.
In September, Cloudflare experienced a major outage that took down its dashboard and impacted several APIs. The root cause was a seemingly small frontend bug: a React useEffect hook that repeatedly triggered calls to the Tenant Service API. Instead of fetching data once, the dashboard ended up spamming the backend with requests, overloading a critical service and causing a chain reaction of failures.
This type of failure is often referred to as the Thundering Herd effect. In frontend applications, it happens when many users or sessions unintentionally generate excessive, repeated API calls at the same time. The result is a surge of traffic that the backend cannot handle gracefully. While the problem may look like a backend scaling issue, it often starts in the frontend.
Common triggers for this effect include misconfigured React hooks (as in Cloudflare's case), aggressive retry mechanisms, and user actions like frequent page refreshes that re-initiate API requests. Because dashboards and client-facing apps rely heavily on multiple APIs, even a minor frontend bug can quickly escalate into a backend outage.
The goal of this post is to show how frontend load testing can help detect these patterns early, before they ever reach production. By simulating real user sessions and tracking API call behavior, QA teams can spot the warning signs of a Thundering Herd long before customers feel the impact.
To understand how this kind of issue looks in practice, we built a small mock dashboard to reproduce the Cloudflare bug. The demo is a simple HTML page (index.html) that simulates a React component with a problematic useEffect dependency. Instead of running once, the hook gets triggered repeatedly because it depends on an object that is recreated on every render:
async useEffect_Organizations() {
  // BUG: object recreated on every render, so the effect runs repeatedly.
  // In real React, this object would sit in the useEffect dependency array,
  // changing identity on every render and re-triggering the effect.
  const problematicDependency = {
    timestamp: Date.now(),
    randomValue: Math.random()
  };

  this.rerenderCount++;
  console.log(`useEffect re-run #${this.rerenderCount} - fetching organizations...`);
  await this.fetchOrganizations();

  // Each fetch flips state, which causes another render, which re-runs the
  // effect; the loop is capped at 15 only to keep the demo bounded.
  if (this.rerenderCount < 15) {
    setTimeout(() => {
      this.setState({ loading: !this.state.loading });
    }, 100);
  }
}
Each re-render makes the dashboard call the /api/organizations endpoint again, multiplying requests far beyond what a normal session should generate.
The mock dashboard makes this behavior visible with two key elements: a re-render counter that increments every time the effect fires, and a live API call log that records each request sent to /api/organizations.
This setup closely mirrors the problem Cloudflare faced: a frontend bug that silently hammered the backend with repeated API requests. In the demo, you can actually watch the Thundering Herd effect emerge in real time as the API log fills with excessive calls. What seems like a harmless frontend issue—one misconfigured hook—quickly becomes a backend reliability problem.
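For reference, here is roughly what the same mistake looks like in real React code, together with the usual fix. This is a sketch rather than Cloudflare's actual code; the component name, endpoint, and props are illustrative:

import { useEffect, useState } from "react";

function OrganizationsPanel({ page }) {
  const [organizations, setOrganizations] = useState([]);

  // BUG: a fresh object is built on every render, so its identity changes
  // each time and the effect below re-runs after every render.
  const query = { page };

  useEffect(() => {
    fetch(`/api/organizations?page=${query.page}`)
      .then((res) => res.json())
      .then(setOrganizations); // updating state triggers another render,
  }, [query]);                 // which recreates `query` and re-fires the effect

  // FIX: depend on stable primitives instead, e.g.
  // useEffect(() => { /* fetch */ }, [page]);
  // or memoize the object with useMemo so its identity stays stable.

  return <ul>{organizations.map((org) => <li key={org.id}>{org.name}</li>)}</ul>;
}

With the unstable object in the dependency array, every state update leads to another render and another fetch, which is exactly the loop the mock dashboard reproduces.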
Catching this kind of bug requires more than just unit tests. We need a way to simulate real user sessions, track API usage, and flag abnormal patterns. This is where k6, a popular open-source load testing tool, comes in handy.
I created a trimmed test script that focuses specifically on detecting excessive calls to the /api/organizations endpoint. Here's the core structure:
import http from "k6/http";
import { check, sleep } from "k6";
import { Counter, Rate, Trend } from "k6/metrics";

// Where the mock dashboard is served; adjust to your local setup.
const baseUrl = __ENV.BASE_URL || "http://localhost:3000";

// Custom metrics for spotting Thundering Herd behavior
const organizationsApiCalls = new Counter("organizations_api_calls_total");
const excessiveCallsRate = new Rate("users_with_excessive_calls");
const apiCallsPerUser = new Trend("api_calls_per_user");

export const options = {
  stages: [
    { duration: "30s", target: 5 },  // ramp up to 5 virtual users
    { duration: "1m", target: 20 },  // hold a heavier load of 20 users
    { duration: "30s", target: 0 },  // ramp back down
  ],
  thresholds: {
    organizations_api_calls_total: ["count<1500"], // total calls across the run
    users_with_excessive_calls: ["rate<0.1"],      // fewer than 10% of users flagged
    api_calls_per_user: ["avg<5"],                 // a session should need only a handful of calls
    http_req_duration: ["p(95)<2000"],             // 95% of requests complete in under 2s
  },
};
Let's break down the key metrics that make this script effective. organizations_api_calls_total counts every request sent to /api/organizations across the whole run, api_calls_per_user tracks how many of those calls each simulated session makes, and users_with_excessive_calls records the fraction of sessions that blow past a reasonable per-user budget. The thresholds then turn these metrics into pass/fail gates: if the counts, averages, or rates drift beyond what a healthy dashboard should produce, k6 fails the test.
In the test's main function, we simulate a user session and measure how many times the /organizations endpoint is hit:
export default function () {
  let userApiCalls = { organizations: 0, total: 0 };

  // Replay the API traffic of a single dashboard session. With the bug
  // enabled, the broken useEffect fires ~20 requests per session; a healthy
  // session only needs a few.
  const callsPerSession = __ENV.BUG_ENABLED === "true" ? 20 : 3;

  for (let i = 0; i < callsPerSession; i++) {
    const res = http.get(`${baseUrl}/api/organizations`, {
      tags: { name: "organizations_api" },
    });
    userApiCalls.organizations++;
    userApiCalls.total++;

    check(res, {
      "organizations API responds": (r) => r.status === 200,
    });

    sleep(0.1);
  }

  organizationsApiCalls.add(userApiCalls.organizations);
  apiCallsPerUser.add(userApiCalls.organizations);

  // Flag this user as "excessive" if the session made more than 10 calls.
  excessiveCallsRate.add(userApiCalls.organizations > 10);
}
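Optionally, you can append k6's handleSummary hook to the same script to persist the end-of-test metrics, which makes it easy for a CI job to archive a run or diff a baseline run against a buggy one. The file name here is arbitrary:

export function handleSummary(data) {
  // Write the final metrics to disk so CI can archive or compare runs.
  return {
    "thundering-herd-summary.json": JSON.stringify(data, null, 2),
  };
}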
When this script is run against the buggy dashboard, the metrics quickly reveal abnormal patterns: api_calls_per_user averages far above the allowed 5, the share of sessions flagged by users_with_excessive_calls shoots past 10%, and the organizations_api_calls_total counter races toward (and typically past) its cap, so k6 marks the run as failed.
In other words, k6 transforms the hidden frontend bug into a visible performance signal. Instead of waiting for the backend to tip over in production, we can catch the Thundering Herd effect in a controlled test environment.
With the script ready, running the load test is straightforward. From your terminal, simply execute:
k6 run -e BUG_ENABLED=true .\dashboardBugDetection.js
This will spin up virtual users that replay the buggy dashboard's API traffic and track how often the /api/organizations endpoint is called.
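To get a baseline for comparison, run the same script with the flag turned off, so each virtual user behaves like a healthy session:
k6 run -e BUG_ENABLED=false .\dashboardBugDetection.js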
What to Expect
By comparing normal vs buggy runs, it becomes obvious how a seemingly minor frontend issue can generate a Thundering Herd effect. With the bug enabled, the api_calls_per_user and users_with_excessive_calls thresholds are breached and k6 reports the run as failed; with the bug disabled, the same load stays comfortably within limits. In production, that same pattern overwhelms the backend long before infrastructure alarms go off.
The key lesson from Cloudflare's outage is clear: frontend load testing belongs in every QA strategy. Dashboards and control panels touch multiple APIs, and a single misconfigured hook can trigger a flood of requests that overwhelms the backend. By simulating user sessions and enforcing thresholds on API call patterns, QA teams can spot these issues long before they reach production.
While we focused on frontend load testing in this post, there are other powerful approaches worth exploring. For example, API call pattern validation can confirm that each user action triggers the expected number of requests, not dozens. React component testing can catch dependency array mistakes in useEffect. End-to-end dashboard tests can monitor network calls during real user flows, while observability checks—such as request tagging—can make it easier to distinguish retries from new calls.
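As an illustration of the end-to-end idea, a browser test can count the network requests a single page load produces and fail if the number looks like a herd. Here's a minimal sketch assuming Playwright Test and a locally served copy of the mock dashboard; the URL and the limit are placeholders:

const { test, expect } = require("@playwright/test");

test("dashboard does not hammer /api/organizations", async ({ page }) => {
  let organizationCalls = 0;

  // Count every request the page makes to the organizations endpoint.
  page.on("request", (request) => {
    if (request.url().includes("/api/organizations")) {
      organizationCalls++;
    }
  });

  await page.goto("http://localhost:3000/index.html");
  await page.waitForLoadState("networkidle");

  // A healthy session needs one or two calls, not dozens.
  expect(organizationCalls).toBeLessThanOrEqual(3);
});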
Together, these practices form a safety net. They ensure that what looks like a harmless UI bug doesn't escalate into a full-blown outage.
The full code examples for the mock dashboard and k6 test are available on our GitHub page, so you can experiment with these approaches yourself.