Advanced Patterns

Once your basic pipeline is working, advanced patterns help it scale. As test suites grow, you need parallelism to keep feedback fast. As deployments become more frequent, you need gates and rollback strategies to maintain stability.

Test Parallelism with Sharding

When your E2E suite takes 20+ minutes, sharding splits it across multiple machines. Each machine runs a subset of the tests, so wall-clock time drops roughly in proportion to the number of shards, minus the fixed setup cost that every machine pays.
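Every shard still has to check out the code and install dependencies, so the speedup is a little smaller than the shard count suggests. The following is a rough sketch of that arithmetic, not a measurement; the function name and the 2-minute setup figure are assumptions made for illustration.

```javascript
// Rough wall-clock estimate for a sharded run.
// Assumptions: every shard pays the same fixed setup cost (checkout, npm ci,
// browser install) and the tests split perfectly evenly across shards.
function estimateShardedMinutes(suiteMinutes, shards, setupMinutes) {
  if (!Number.isInteger(shards) || shards < 1) {
    throw new RangeError('shards must be a positive integer');
  }
  return setupMinutes + suiteMinutes / shards;
}

// Hypothetical numbers: a 20-minute suite with 2 minutes of setup per job.
for (const shards of [1, 2, 4, 8]) {
  const minutes = estimateShardedMinutes(20, shards, 2);
  console.log(`${shards} shard(s): ~${minutes.toFixed(1)} min`);
}
```

Under these assumed numbers, going from 4 to 8 shards saves only 2.5 minutes while doubling the machine time, which is why adding shards eventually stops paying off.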

Playwright Sharding

e2e-tests:
  name: E2E Tests (${{ matrix.shard }})
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      shard: [1/4, 2/4, 3/4, 4/4]
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
        cache: 'npm'
    - run: npm ci
    - run: npx playwright install --with-deps
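    # The blob reporter writes to blob-report/, the format that merge-reports expects.
    # Note: --reporter on the command line overrides the reporter list in playwright.config.ts.
    - run: npx playwright test --shard=${{ matrix.shard }} --reporter=blob
    - uses: actions/upload-artifact@v4
      if: always()
      with:
        name: playwright-report-${{ strategy.job-index }}
        path: blob-report/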

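Four machines each run about one-quarter of the test suite, so a 20-minute suite finishes in roughly 5 minutes plus setup time. By default Playwright assigns whole test files to shards; with fullyParallel enabled it can split at the level of individual tests. The split is not weighted by how long each test takes, so some shards may finish later than others.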
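To make the idea of a shard concrete, here is a minimal sketch of splitting a list of test files into near-equal groups. It is illustrative only and is not Playwright's internal algorithm; the function name and the file names are hypothetical.

```javascript
// Illustrative only: NOT Playwright's internal algorithm.
// Splits a sorted list of test files into `total` contiguous groups of
// near-equal size and returns the group for a 1-based shard index.
function filesForShard(files, index, total) {
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError('total must be a positive integer');
  }
  if (!Number.isInteger(index) || index < 1 || index > total) {
    throw new RangeError(`shard index must be between 1 and ${total}`);
  }
  const sorted = [...files].sort();
  const start = Math.floor(((index - 1) * sorted.length) / total);
  const end = Math.floor((index * sorted.length) / total);
  return sorted.slice(start, end);
}

// Hypothetical test files.
const files = [
  'auth.spec.ts',
  'cart.spec.ts',
  'checkout.spec.ts',
  'profile.spec.ts',
  'search.spec.ts',
];
for (let i = 1; i <= 4; i++) {
  console.log(`shard ${i}/4:`, filesForShard(files, i, 4));
}
```

A split like this balances the number of files, not their duration. If one file holds most of the slow tests, its shard becomes the bottleneck, so it can help to break up very large spec files.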

Merging Shard Reports

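After all shards complete, merge their reports into one. The merge-reports command operates on blob reports, so each shard must run with the blob reporter: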

merge-reports:
  name: Merge E2E Reports
  needs: e2e-tests
  if: always()
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
        cache: 'npm'
    - run: npm ci

    - uses: actions/download-artifact@v4
      with:
        pattern: playwright-report-*
        path: all-reports/
        merge-multiple: true

    - run: npx playwright merge-reports --reporter=html ./all-reports
    - uses: actions/upload-artifact@v4
      with:
        name: merged-playwright-report
        path: playwright-report/

Jest Sharding

Jest (version 28 and later) also supports sharding natively through the --shard=index/count flag:

unit-tests:
  strategy:
    matrix:
      shard: [1, 2, 3]
  steps:
    - run: npx jest --ci --shard=${{ matrix.shard }}/3

Flaky Test Detection

Flaky tests are tests that pass and fail intermittently without code changes. They destroy trust in the pipeline. Detecting them requires tracking test results over time.
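One way to track results over time is to record every test execution together with the commit it ran against, then look for tests that both passed and failed on the same commit. The sketch below assumes a hypothetical record shape of { test, commit, passed }; how you collect and store those records is up to your pipeline.

```javascript
// history: one entry per test execution, collected from past CI runs.
// The record shape { test, commit, passed } is an assumption for this sketch.
function findFlakyTests(history) {
  const byTest = new Map(); // test name -> Map(commit -> Set of outcomes)
  for (const { test, commit, passed } of history) {
    if (!byTest.has(test)) byTest.set(test, new Map());
    const byCommit = byTest.get(test);
    if (!byCommit.has(commit)) byCommit.set(commit, new Set());
    byCommit.get(commit).add(Boolean(passed));
  }

  const flaky = [];
  for (const [test, byCommit] of byTest) {
    // Both outcomes on the same commit means the result changed while the code did not.
    const mixed = [...byCommit.values()].some(outcomes => outcomes.size === 2);
    if (mixed) flaky.push(test);
  }
  return flaky.sort();
}

// Hypothetical history: "login" is flaky, "cart" failed only after a code change.
const history = [
  { test: 'login', commit: 'a1', passed: true },
  { test: 'login', commit: 'a1', passed: false },
  { test: 'cart', commit: 'a1', passed: true },
  { test: 'cart', commit: 'b2', passed: false },
  { test: 'search', commit: 'b2', passed: true },
];
console.log(findFlakyTests(history));
```

Comparing outcomes per commit keeps genuine regressions (a test that starts failing after a change) out of the flaky list.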

Approach 1: Retry and Flag

Configure retries and check which tests needed them:

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: 2, // rerun a failed test up to two more times
  reporter: [
    ['html'],
    ['json', { outputFile: 'results.json' }],
  ],
});

Parse the JSON results in CI to find tests that passed only on retry:

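// scripts/check-flaky.js
// results.json is written to the project root, one level above scripts/.
const results = require('../results.json');

// describe() blocks appear as nested suites, so collect specs recursively.
const collectSpecs = suite => [
  ...(suite.specs ?? []),
  ...(suite.suites ?? []).flatMap(collectSpecs),
];

// Playwright reports a test as 'flaky' when it failed first and passed on retry.
const flaky = results.suites
  .flatMap(collectSpecs)
  .filter(spec => spec.tests.some(t => t.status === 'flaky'));

if (flaky.length > 0) {
  console.log('Flaky tests detected:');
  flaky.forEach(f => console.log(`  - ${f.title}`));
  // Post to Slack, create an issue, etc.
}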

Approach 2: Quarantine

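Mark known-flaky tests so they no longer block merges. With test.fixme, Playwright does not run the test but still lists it as skipped in the report, which keeps it visible:

import { test } from '@playwright/test';

test.describe('quarantined', () => {
  test.fixme('flaky navigation test', async ({ page }) => {
    // This test is skipped but tracked
  });
});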

Review quarantined tests weekly and either fix them or delete them.

Test Result Reporting

Make test results visible without opening the raw logs:

GitHub Job Summary

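Write results to $GITHUB_STEP_SUMMARY for a markdown summary on the Actions run page. The rows below are hardcoded for illustration; in a real pipeline, generate them from your test output: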

- name: Report results
  if: always()
  run: |
    echo "## Test Results" >> $GITHUB_STEP_SUMMARY
    echo "" >> $GITHUB_STEP_SUMMARY
    echo "| Suite | Status | Duration |" >> $GITHUB_STEP_SUMMARY
    echo "|-------|--------|----------|" >> $GITHUB_STEP_SUMMARY
    echo "| Unit | ✅ Pass | 45s |" >> $GITHUB_STEP_SUMMARY
    echo "| Integration | ✅ Pass | 2m 10s |" >> $GITHUB_STEP_SUMMARY
    echo "| E2E | ❌ Fail | 8m 30s |" >> $GITHUB_STEP_SUMMARY
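Instead of hardcoding rows, a small Node script can build the same table from data. This is a minimal sketch: the row shape { suite, passed, duration } and the sample values are assumptions, and you would fill them from your own test output.

```javascript
const fs = require('fs');

// Builds the same markdown table as the echo commands, from an array of rows.
// The row shape { suite, passed, duration } is an assumption for this sketch.
function renderSummary(rows) {
  const lines = [
    '## Test Results',
    '',
    '| Suite | Status | Duration |',
    '|-------|--------|----------|',
    ...rows.map(
      r => `| ${r.suite} | ${r.passed ? '✅ Pass' : '❌ Fail'} | ${r.duration} |`
    ),
  ];
  return lines.join('\n') + '\n';
}

// Hypothetical results; in CI these would come from the test reporters.
const rows = [
  { suite: 'Unit', passed: true, duration: '45s' },
  { suite: 'Integration', passed: true, duration: '2m 10s' },
  { suite: 'E2E', passed: false, duration: '8m 30s' },
];

const markdown = renderSummary(rows);

// GITHUB_STEP_SUMMARY is set by GitHub Actions; print to stdout when run locally.
if (process.env.GITHUB_STEP_SUMMARY) {
  fs.appendFileSync(process.env.GITHUB_STEP_SUMMARY, markdown);
} else {
  process.stdout.write(markdown);
}
```

Run it from a step with if: always() so the summary is written even when a test step fails.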

JUnit Reports

Many CI tools parse JUnit XML. Configure Jest and Playwright to output JUnit:

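# Jest (requires the jest-junit package: npm install --save-dev jest-junit)
npx jest --ci --reporters=default --reporters=jest-junit

# Playwright (prints the XML to stdout unless an output file is set)
PLAYWRIGHT_JUNIT_OUTPUT_NAME=results.xml npx playwright test --reporter=junit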

Deployment Gates

Deployment gates add checks between your test pipeline and production deployment:

deploy-staging:
  needs: e2e-tests
  runs-on: ubuntu-latest
  environment: staging
  steps:
    - uses: actions/checkout@v4
    - run: ./deploy.sh staging

smoke-tests:
  needs: deploy-staging
  runs-on: ubuntu-latest
  steps:
    # Each job starts on a fresh runner, so check out and install again
    - uses: actions/checkout@v4
    - run: npm ci
    - run: npx playwright install --with-deps
    - run: npx playwright test --project=smoke --config=playwright.smoke.config.ts
      env:
        BASE_URL: https://staging.example.com

deploy-production:
  needs: smoke-tests
  runs-on: ubuntu-latest
  environment:
    name: production
    url: https://example.com
  steps:
    - uses: actions/checkout@v4
    - run: ./deploy.sh production

The environment key in GitHub Actions enables environment-specific rules:

  • Required reviewers: Specific people must approve the deployment.
  • Wait timer: A delay before deployment starts (useful for scheduled releases).
  • Branch restrictions: Only the main branch can deploy to production.

Configure environments in the repository under Settings > Environments.

Rollback Strategies

Even with quality gates, bugs reach production. A rollback strategy minimizes damage:

Instant Rollback

Keep the previous deployment available for instant revert:

deploy-production:
  steps:
    - name: Deploy new version
      run: |
        # Tag current deployment for rollback
        ./deploy.sh tag-current as-previous
        # Deploy new version
        ./deploy.sh production

    - name: Post-deploy smoke test
      run: npx playwright test --project=smoke
      env:
        BASE_URL: https://example.com

    - name: Rollback on failure
      if: failure()
      run: ./deploy.sh rollback-to-previous
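The control flow of these steps can be sketched as a small function. This is an illustration under stated assumptions: deploy, smokeTest, and rollback are hypothetical callbacks that throw on failure, standing in for the deploy script and the smoke suite.

```javascript
// deploy, smokeTest, and rollback are hypothetical callbacks that throw on
// failure. They stand in for ./deploy.sh production, the smoke suite, and
// ./deploy.sh rollback-to-previous.
function deployWithRollback({ deploy, smokeTest, rollback }) {
  try {
    deploy();
    smokeTest();
    return { status: 'deployed' };
  } catch (error) {
    // Mirrors `if: failure()`: runs when any earlier step failed.
    rollback();
    return { status: 'rolled-back', reason: error.message };
  }
}

// Demo with stubs: the smoke test fails, so the rollback runs.
const calls = [];
const result = deployWithRollback({
  deploy: () => { calls.push('deploy'); },
  smokeTest: () => {
    calls.push('smoke');
    throw new Error('checkout page returned 500');
  },
  rollback: () => { calls.push('rollback'); },
});
console.log(calls.join(' -> '), result);
```

Note that the rollback also runs when the deploy step itself fails, which matches how if: failure() behaves in the workflow.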

Canary Deployments

Route a small percentage of traffic to the new version. Monitor error rates, and if they spike, roll back automatically:

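- name: Canary deploy (10% traffic)
  run: ./deploy.sh canary --percent=10

# The check script must exit non-zero when the error rate exceeds the threshold
- name: Monitor for 5 minutes
  run: ./scripts/check-error-rate.sh --threshold=1 --duration=300

- name: Full deploy
  run: ./deploy.sh production --percent=100

- name: Roll back if monitoring or the full deploy failed
  if: failure()
  run: ./deploy.sh rollback-to-previous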
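The decision inside a script like check-error-rate.sh can be as simple as comparing the worst observed error rate against the threshold. The sketch below is one possible version of that logic, assuming the threshold is a percentage and that samples are collected elsewhere; the function name and sample values are hypothetical.

```javascript
// samples: error-rate percentages observed during the monitoring window.
// Assumption: the threshold is a percentage, so 1 means 1% of requests failing.
function canaryVerdict(samples, thresholdPercent) {
  // No data means the canary cannot be judged healthy, so fail safe.
  if (samples.length === 0) return 'rollback';
  const worst = Math.max(...samples);
  return worst > thresholdPercent ? 'rollback' : 'promote';
}

// Hypothetical samples taken during the monitoring window.
console.log(canaryVerdict([0.2, 0.4, 0.3], 1)); // stays under 1%
console.log(canaryVerdict([0.2, 2.5, 0.3], 1)); // one spike above 1%
```

Judging by the worst sample is deliberately strict. An average over the window would tolerate short spikes, which may or may not be what you want for your service.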

Key Takeaways

  • Shard tests across multiple machines to cut E2E suite time roughly in proportion to the shard count.
  • Track flaky tests with retry detection and quarantine persistent offenders.
  • Write test summaries to $GITHUB_STEP_SUMMARY so results appear on the workflow run page.
  • Use GitHub environment protection rules as deployment gates with required reviewers.
  • Always have a rollback strategy. Automated rollback on smoke test failure is a dependable default.