Once your basic pipeline is working, advanced patterns help it scale. As test suites grow, you need parallelism to keep feedback fast. As deployments become more frequent, you need gates and rollback strategies to maintain stability.
Test Parallelism with Sharding
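When your E2E suite takes 20+ minutes, sharding splits it across multiple machines. Each machine runs a subset of the tests, so wall-clock time drops roughly in proportion to the shard count, less the fixed setup each machine repeats (checkout, dependency install, browser download).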
Playwright Sharding
    e2e-tests:
      name: E2E Tests (${{ matrix.shard }})
      runs-on: ubuntu-latest
      strategy:
        fail-fast: false
        matrix:
          shard: [1/4, 2/4, 3/4, 4/4]
      steps:
        - uses: actions/checkout@v4
        - uses: actions/setup-node@v4
          with:
            node-version: 20
            cache: 'npm'
        - run: npm ci
        - run: npx playwright install --with-deps
        # The blob reporter writes to blob-report/, the input format merge-reports expects
        - run: npx playwright test --shard=${{ matrix.shard }} --reporter=blob
        - uses: actions/upload-artifact@v4
          if: always()
          with:
            name: playwright-report-${{ strategy.job-index }}
            path: blob-report/

Four machines each run about one-quarter of the test suite, so a 20-minute suite finishes in roughly 5 minutes plus setup time. By default Playwright assigns whole test files to shards, so the split is only even when tests are spread evenly across files; with fullyParallel enabled it balances individual tests instead.
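To make the file-level split concrete, here is a minimal TypeScript sketch of dividing a list of test files into contiguous shards. It is an illustration only, not Playwright's actual implementation; `shardFiles` and the spec file names are hypothetical.

```typescript
// Illustrative only: splits test files into `total` contiguous groups and
// returns the group for the 1-based shard `index` (as in --shard=index/total).
function shardFiles(files: string[], index: number, total: number): string[] {
  if (!Number.isInteger(index) || !Number.isInteger(total) || index < 1 || index > total) {
    throw new RangeError(`invalid shard ${index}/${total}`);
  }
  const sorted = [...files].sort(); // same order on every machine
  const start = Math.floor(((index - 1) * sorted.length) / total);
  const end = Math.floor((index * sorted.length) / total);
  return sorted.slice(start, end);
}

// Hypothetical suite of six spec files split across four machines.
const specFiles = [
  "auth.spec.ts", "cart.spec.ts", "checkout.spec.ts",
  "profile.spec.ts", "search.spec.ts", "settings.spec.ts",
];
const shards = [1, 2, 3, 4].map(i => shardFiles(specFiles, i, 4));
```

Every file lands in exactly one shard, but the shard sizes differ (1, 2, 1, and 2 files here). This is why a few very large spec files can leave one machine running much longer than the others.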
Merging Shard Reports
After all shards complete, merge the reports into one:
    merge-reports:
      name: Merge E2E Reports
      needs: e2e-tests
      if: always()
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v4
        - uses: actions/setup-node@v4
          with:
            node-version: 20
            cache: 'npm'
        - run: npm ci
        - uses: actions/download-artifact@v4
          with:
            pattern: playwright-report-*
            path: all-reports/
            merge-multiple: true
        - run: npx playwright merge-reports --reporter=html ./all-reports
        - uses: actions/upload-artifact@v4
          with:
            name: merged-playwright-report
            path: playwright-report/

Jest Sharding
Jest also supports sharding natively:
    unit-tests:
      runs-on: ubuntu-latest
      strategy:
        matrix:
          shard: [1, 2, 3]
      steps:
        # Checkout, Node setup, and npm ci steps omitted for brevity
        - run: npx jest --ci --shard=${{ matrix.shard }}/3

Flaky Test Detection
Flaky tests are tests that pass and fail intermittently without code changes. They destroy trust in the pipeline. Detecting them requires tracking test results over time.
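Tracking results over time can be as simple as recording pass/fail per test for every run and flagging tests that show both outcomes on the same commit. A minimal sketch, assuming a hypothetical `RunRecord` shape that your CI would have to populate (it is not a Playwright or Jest type):

```typescript
// Hypothetical record of one test execution, collected by your own CI tooling.
interface RunRecord {
  test: string;    // test title
  commit: string;  // commit SHA the run executed against
  passed: boolean;
}

// A test counts as flaky if, for at least one commit, it both passed and failed.
function findFlakyTests(history: RunRecord[]): string[] {
  const seen: Record<string, { test: string; pass: boolean; fail: boolean }> = {};
  for (const run of history) {
    const key = `${run.commit}\n${run.test}`;
    const entry = seen[key] ?? (seen[key] = { test: run.test, pass: false, fail: false });
    if (run.passed) entry.pass = true;
    else entry.fail = true;
  }
  const flaky: string[] = [];
  for (const key of Object.keys(seen)) {
    const entry = seen[key];
    if (entry.pass && entry.fail && flaky.indexOf(entry.test) === -1) flaky.push(entry.test);
  }
  return flaky.sort();
}

const history: RunRecord[] = [
  { test: "login works", commit: "abc123", passed: true },
  { test: "login works", commit: "abc123", passed: false }, // same code, different result
  { test: "cart total", commit: "abc123", passed: false },
  { test: "cart total", commit: "def456", passed: true },   // changed after a code change
];
const flakyTests = findFlakyTests(history);
```

Grouping by commit matters: "cart total" failed and later passed, but on different commits, so a code change explains the difference and the test is not flagged.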
Approach 1: Retry and Flag
Configure retries and check which tests needed them:
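    // playwright.config.ts
    import { defineConfig } from '@playwright/test';

    export default defineConfig({
      retries: 2, // Re-run a failing test up to twice before reporting it as failed
      reporter: [
        ['html'],
        ['json', { outputFile: 'results.json' }],
      ],
    });

Parse the JSON results in CI to find tests that passed only on retry: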
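    // scripts/check-flaky.js
    // Assumes results.json is written at the project root, one level above scripts/
    const results = require('../results.json');
    // Describe blocks appear as nested suites, so collect specs recursively
    const collectSpecs = suite => [
      ...(suite.specs ?? []),
      ...(suite.suites ?? []).flatMap(collectSpecs),
    ];
    // Playwright reports status 'flaky' when a test failed, then passed on retry
    const flaky = results.suites
      .flatMap(collectSpecs)
      .filter(spec => spec.tests.some(t => t.status === 'flaky'));
    if (flaky.length > 0) {
      console.log('Flaky tests detected:');
      flaky.forEach(f => console.log(`  - ${f.title}`));
      // Post to Slack, create an issue, etc.
    }

Approach 2: Quarantine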
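Move known-flaky tests into a quarantine group so they no longer block merges. With test.fixme, Playwright skips the test but still lists it in the report as needing a fix:

    import { test } from '@playwright/test';

    test.describe('quarantined', () => {
      test.fixme('flaky navigation test', async ({ page }) => {
        // Skipped at run time, but still reported so it is not forgotten
      });
    });

Review quarantined tests weekly and either fix them or delete them.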
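If quarantined tests should keep running without blocking merges, the pipeline has to separate their failures from everyone else's. One possible sketch, assuming a hypothetical quarantine list maintained by the team and a simplified result shape (neither comes from Playwright):

```typescript
// Simplified, hypothetical result shape; adapt it to your reporter's output.
interface TestOutcome {
  title: string;
  passed: boolean;
}

interface GateDecision {
  blockMerge: boolean;            // true if any non-quarantined test failed
  blockingFailures: string[];     // must be fixed before merging
  quarantinedFailures: string[];  // reported for review, but not merge-blocking
}

function evaluateWithQuarantine(outcomes: TestOutcome[], quarantined: string[]): GateDecision {
  const failed = outcomes.filter(o => !o.passed).map(o => o.title);
  const blockingFailures = failed.filter(t => quarantined.indexOf(t) === -1);
  const quarantinedFailures = failed.filter(t => quarantined.indexOf(t) !== -1);
  return { blockMerge: blockingFailures.length > 0, blockingFailures, quarantinedFailures };
}

const decision = evaluateWithQuarantine(
  [
    { title: "flaky navigation test", passed: false },
    { title: "checkout completes", passed: true },
  ],
  ["flaky navigation test"],
);
```

Here the only failure is quarantined, so `decision.blockMerge` is `false` while the failure still shows up in `quarantinedFailures` for the weekly review.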
Test Result Reporting
Make test results visible directly in pull requests:
GitHub Job Summary
Write results to $GITHUB_STEP_SUMMARY for a markdown summary on the Actions run page:
    - name: Report results
      if: always()
      run: |
        echo "## Test Results" >> $GITHUB_STEP_SUMMARY
        echo "" >> $GITHUB_STEP_SUMMARY
        echo "| Suite | Status | Duration |" >> $GITHUB_STEP_SUMMARY
        echo "|-------|--------|----------|" >> $GITHUB_STEP_SUMMARY
        echo "| Unit | ✅ Pass | 45s |" >> $GITHUB_STEP_SUMMARY
        echo "| Integration | ✅ Pass | 2m 10s |" >> $GITHUB_STEP_SUMMARY
        echo "| E2E | ❌ Fail | 8m 30s |" >> $GITHUB_STEP_SUMMARY

JUnit Reports
Many CI tools parse JUnit XML. Configure Jest and Playwright to output JUnit:
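    # Jest (requires the jest-junit package: npm install --save-dev jest-junit)
    npx jest --ci --reporters=default --reporters=jest-junit
    # Playwright (prints XML to stdout unless an output file is set)
    PLAYWRIGHT_JUNIT_OUTPUT_NAME=results.xml npx playwright test --reporter=junit

Deployment Gates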
Deployment gates add checks between your test pipeline and production deployment:
    deploy-staging:
      needs: e2e-tests
      runs-on: ubuntu-latest
      environment: staging
      steps:
        - run: ./deploy.sh staging

    smoke-tests:
      needs: deploy-staging
      runs-on: ubuntu-latest
      steps:
        - run: npx playwright test --project=smoke --config=playwright.smoke.config.ts
          env:
            BASE_URL: https://staging.example.com

    deploy-production:
      needs: smoke-tests
      runs-on: ubuntu-latest
      environment:
        name: production
        url: https://example.com
      steps:
        - run: ./deploy.sh production

The environment key in GitHub Actions enables environment-specific rules:
- Required reviewers: Specific people must approve the deployment.
- Wait timer: A delay before deployment starts (useful for scheduled releases).
- Branch restrictions: Only the main branch can deploy to production.
Configure environments in Settings > Environments.
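The three rules combine as a conjunction: a deployment proceeds only when every configured rule is satisfied. The sketch below models that logic in TypeScript purely for illustration; the types and `canDeploy` are hypothetical and do not reflect GitHub's internal implementation or API.

```typescript
// Hypothetical model of environment protection rules.
interface EnvironmentRules {
  requiredReviewers: string[]; // approval from any one of these is enough
  waitMinutes: number;         // delay after the job is queued
  allowedBranches: string[];   // branches permitted to deploy
}

interface DeployRequest {
  branch: string;
  approvedBy: string[];
  minutesSinceQueued: number;
}

function canDeploy(rules: EnvironmentRules, req: DeployRequest): { allowed: boolean; reasons: string[] } {
  const reasons: string[] = [];
  if (rules.allowedBranches.indexOf(req.branch) === -1) {
    reasons.push(`branch ${req.branch} is not allowed`);
  }
  const approved = rules.requiredReviewers.length === 0 ||
    rules.requiredReviewers.some(r => req.approvedBy.indexOf(r) !== -1);
  if (!approved) reasons.push("waiting for a required reviewer");
  if (req.minutesSinceQueued < rules.waitMinutes) reasons.push("wait timer has not elapsed");
  return { allowed: reasons.length === 0, reasons };
}

// Example rules and requests; the reviewer name is made up.
const productionRules: EnvironmentRules = {
  requiredReviewers: ["alice"],
  waitMinutes: 10,
  allowedBranches: ["main"],
};
const fromFeature = canDeploy(productionRules, { branch: "feature/x", approvedBy: ["alice"], minutesSinceQueued: 15 });
const fromMain = canDeploy(productionRules, { branch: "main", approvedBy: ["alice"], minutesSinceQueued: 15 });
```

In this model `fromFeature` is rejected because of the branch restriction alone, while `fromMain` passes all three checks. Returning the list of unmet rules rather than a bare boolean makes it easier to see why a deployment is waiting.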
Rollback Strategies
Even with quality gates, bugs reach production. A rollback strategy minimizes damage:
Instant Rollback
Keep the previous deployment available for instant revert:
    deploy-production:
      steps:
        - name: Deploy new version
          run: |
            # Tag current deployment for rollback
            ./deploy.sh tag-current as-previous
            # Deploy new version
            ./deploy.sh production
        - name: Post-deploy smoke test
          run: npx playwright test --project=smoke
          env:
            BASE_URL: https://example.com
        - name: Rollback on failure
          if: failure()
          run: ./deploy.sh rollback-to-previous

Canary Deployments
Route a small percentage of traffic to the new version. Monitor error rates, and if they spike, roll back automatically:
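    - name: Canary deploy (10% traffic)
      run: ./deploy.sh canary --percent=10
    - name: Monitor for 5 minutes
      run: ./scripts/check-error-rate.sh --threshold=1 --duration=300
    - name: Full deploy
      run: ./deploy.sh production --percent=100
    # Runs only if an earlier step failed, e.g. the error-rate check
    - name: Rollback on failure
      if: failure()
      run: ./deploy.sh rollback-to-previous

Key Takeaways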
- Shard tests across multiple machines to cut E2E suite time proportionally.
- Track flaky tests with retry detection and quarantine persistent offenders.
- Write test summaries to $GITHUB_STEP_SUMMARY for visibility on the workflow run page.
- Use GitHub environment protection rules as deployment gates with required reviewers.
- Always have a rollback strategy — automated rollback on smoke test failure is the safest pattern.