If you're running Playwright on a free CI tier, you've probably stared at a 15-minute test run wondering if there's a better way. Most teams assume parallelism requires a premium CI plan, a paid testing platform, or a complex setup. It doesn't. Playwright ships with a built-in --shard flag that splits your test suite into independent chunks, each runnable as its own CLI command. It's free, deterministic, and almost nobody talks about it.
The --shard flag takes the format --shard=X/N, where N is the total number of shards you want and X is the one you're currently running. So --shard=1/3 means "run the first third of the test suite", --shard=2/3 the second, and so on. Playwright splits tests deterministically by file, so the same shard will always contain the same tests across runs, which matters when you're chasing a flaky failure. You can even preview the split with npx playwright test --shard=2/3 --list, which prints the tests assigned to that shard without running them.
One thing worth clarifying: sharding is not the same as --workers. Workers control how many tests run in parallel within a single process on one machine. Sharding distributes tests across completely separate processes or machines, with no shared state or coordination between them. You can actually combine both: run 3 shards, each with 4 workers, and get multiplicative throughput without any extra tooling.
Before jumping to full parallelism, there's a simpler use case worth knowing: running shards one after another on a single runner. It won't cut your total runtime, but it gives you isolated reports per shard, which makes failure triage significantly easier. Instead of scanning one giant report for related failures, you can immediately see which slice of the suite is hurting.
Here's a basic shell script that does it:
for i in 1 2 3; do
  PLAYWRIGHT_BLOB_OUTPUT_DIR=blob-report-$i npx playwright test --shard=$i/3 --reporter=blob
done
Each shard writes its own blob report to a separate folder. If shard 2 fails, you re-run only that shard while the other results stay intact. This pattern is also useful during local debugging when you want to narrow down which group of test files contains a problem, without running the full suite every time.
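To make that targeted re-run a one-liner, a tiny wrapper can rebuild the exact command for any shard. This is a sketch: shard_cmd is a hypothetical helper name, and the folder naming follows the blob-report-$i convention from the loop above. It prints the command rather than running it, so you can sanity-check before executing:

```shell
# Hypothetical helper: print the exact command for one shard, so a
# failing slice can be re-run on its own. The blob-report-$i folder
# naming matches the loop above.
shard_cmd() {
  # $1 = shard index, $2 = total shard count
  printf 'PLAYWRIGHT_BLOB_OUTPUT_DIR=blob-report-%s npx playwright test --shard=%s/%s --reporter=blob\n' "$1" "$1" "$2"
}

shard_cmd 2 3
# -> PLAYWRIGHT_BLOB_OUTPUT_DIR=blob-report-2 npx playwright test --shard=2/3 --reporter=blob
```

Pipe the output into your shell (or copy-paste it) once you're happy with what it will do.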
Running shards separately means you end up with multiple reports, which isn't ideal for reviewing results or sharing with the team. This is where most guides stop, but Playwright actually ships a built-in CLI command to handle exactly this: merge-reports.
Once each shard has finished and written its blob report, collect the blob files into a single folder and run merge-reports on it. The command expects one directory of blob reports, not a list of directories, so gather the zips first:

mkdir -p all-blob-reports
cp blob-report-*/*.zip all-blob-reports/
npx playwright merge-reports --reporter=html all-blob-reports
The output is a single HTML report covering all shards, with attachments, screenshots, and traces (when tracing is enabled) intact, so nothing is lost in the merge. If you want JSON output instead for piping into other tooling, swap --reporter=html for --reporter=json and the same merge logic applies.
One practical tip: name your blob output folders consistently (like blob-report-$i in the script above) so the merge step stays predictable and scriptable without manual adjustments between runs.
This is where sharding goes from useful to genuinely fast. GitHub Actions free tier allows multiple concurrent jobs, and combining that with a matrix strategy lets you run all shards in parallel without paying for anything extra. Here's what that looks like:
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test --shard=${{ matrix.shard }}/3 --reporter=blob
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: blob-report-${{ matrix.shard }}
          path: blob-report
  merge-reports:
    if: ${{ !cancelled() }}
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - uses: actions/download-artifact@v4
        with:
          pattern: blob-report-*
          merge-multiple: true
          path: all-blob-reports
      - run: npx playwright merge-reports --reporter=html all-blob-reports
      - uses: actions/upload-artifact@v4
        with:
          name: html-report
          path: playwright-report
Each shard runs as its own job concurrently, uploads its blob report as an artifact, and a final merge job waits for all three before combining them into one HTML report. If you're on GitLab CI, Bitbucket Pipelines, or CircleCI, the same pattern applies using their equivalent matrix and artifact features. The Playwright CLI commands stay identical regardless of the CI platform.
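For GitLab CI, the equivalent lever is the parallel keyword, which spawns N copies of a job and exposes CI_NODE_INDEX (1-based) and CI_NODE_TOTAL to each. The following is a minimal sketch, not a drop-in config: the job name, base image, and artifact settings are illustrative assumptions.

```yaml
# .gitlab-ci.yml sketch (illustrative): `parallel: 3` runs three copies
# of this job, each with its own CI_NODE_INDEX and CI_NODE_TOTAL.
test:
  image: node:20          # assumed base image; use whatever your project needs
  parallel: 3
  script:
    - npm ci
    - npx playwright install --with-deps
    - npx playwright test --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL --reporter=blob
  artifacts:
    when: always          # keep blob reports even when a shard fails
    paths:
      - blob-report
```

A downstream job can then download all blob-report artifacts and run the same merge-reports command shown earlier.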
Sharding solves a specific problem, and it's worth being honest about where it falls short. If your test suite is slow because of one or two heavyweight tests, splitting the suite into shards won't help much. Those slow tests will still dominate whichever shard they land in. In that case, look at --workers first and consider whether those tests can be optimized or parallelized internally.
Sharding also assumes your tests are stateless and independent. If multiple shards hit the same database, the same test user account, or any shared external resource with side effects, you'll likely see conflicts that are hard to debug because they only appear in parallel runs. Before sharding, it's worth auditing your fixtures and setup steps to confirm each test can run in isolation without relying on state left behind by another.
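One cheap isolation pattern is to derive shared-resource names from the shard index, so parallel shards physically cannot collide. This is a hypothetical sketch: TEST_DB and the naming scheme are assumptions, something your own fixtures would read instead of a hardcoded connection string.

```shell
# Hypothetical: give each shard its own scratch database name, derived
# from the shard index the CI job already knows.
SHARD=2   # e.g. the matrix.shard value on GitHub Actions
export TEST_DB="myapp_test_shard_${SHARD}"

# Test fixtures read TEST_DB instead of a fixed database name, so
# shard 1 and shard 2 never touch the same data.
echo "$TEST_DB"
# -> myapp_test_shard_2
```

The same trick works for test user accounts, temp directories, or queue names: anything stateful gets a per-shard suffix.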
Think of sharding as a scaling tool, not a fix for underlying test design problems. Get the independence right first, then shard.
Playwright's --shard flag gives you genuine test distribution without any additional tooling or paid services. Running shards sequentially on a single runner improves triage; running them as parallel matrix jobs cuts your actual wall-clock time. The merge-reports command ties it all together into one clean report your whole team can review. The only real prerequisite is a test suite written with isolation in mind. If that's already in place, sharding is a one-afternoon setup with compounding returns every time your suite grows.