Your report stops at data. QAI starts there.
AI-powered CI failure analysis

Your CI tells you what failed. QAI tells you why.

Engineering teams waste hours every week debugging CI failures. QAI cuts that time by giving you root cause, history, and a fix — directly on the pull request.

No credit card required.

Not ready to connect your real pipeline? Fork useqai/demo-shop →
github.com · PR #28 · QAI Agent
QAI Agent · Test Intelligence
17 tests · main ← feature/checkout-redesign
High Risk · Do not merge — investigate failures first
9 passed · 8 failed · 3 clusters
cart.spec.ts › added item appears in cart · RCA · UI changed 85%
✦ AI Analysis
The price locator matches multiple elements (paragraph and span both showing $54.95), causing a strict mode violation. Use getByRole or filter by font-semibold.
/ · 200 · 0ms
/cart · 200 · 12ms
POST /api/cart · 201 · 38ms
GET /api/price · 500 · 204ms
✕ Test failed · strict mode
✦ AI Fix Suggestion
- page.locator('text=$54.95')
+ page.getByRole('paragraph')
+   .filter({ hasText: '$54.95' })
✦ Ask QAI
@qai-agent why did this test fail?
QAI The price element is duplicated in the DOM — a paragraph and a span both render $54.95. Playwright's strict mode rejects ambiguous locators.
One step to add to any CI workflow · 60 seconds to first result.
uses: useqai/qai-agent@v1
// How it works

From CI failure to merge decision.

QAI runs after your tests. No setup beyond one workflow step.

01 — Ingests

CI fails. QAI reads.

Runs on every test suite. Playwright, Jest, pytest, Maven, Go — anything emitting JUnit XML.

71ea1f369 fail
b31514c23 fail
8d64ccce1 fail
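
To make "anything emitting JUnit XML" concrete, here is a minimal sketch of pulling the failed tests out of a report. It is not QAI's implementation. The `FailedTest` shape and the function names are hypothetical, and the sketch assumes the common layout in which each `<testcase>` carries `classname` and `name` attributes and a failing case has a `<failure>` or `<error>` child with a `message` attribute.

```typescript
// Hypothetical shape for one failed test; not QAI's actual data model.
interface FailedTest {
  suite: string;   // classname attribute of <testcase>
  name: string;    // name attribute of <testcase>
  message: string; // message attribute of <failure> or <error>
}

// Read one attribute value out of an opening tag's attribute string.
function attr(attrs: string, key: string): string {
  const m = new RegExp(`\\b${key}="([^"]*)"`).exec(attrs);
  return m ? m[1] : "";
}

// Collect every <testcase> that contains a <failure> or <error> child.
// Regex matching keeps the sketch dependency-free; a real XML parser is
// the safer choice for production reports (CDATA, entities, odd nesting).
function failedTests(junitXml: string): FailedTest[] {
  const failed: FailedTest[] = [];
  const testcase = /<testcase\b([^>]*?)(?:\/>|>([\s\S]*?)<\/testcase>)/g;
  let m: RegExpExecArray | null;
  while ((m = testcase.exec(junitXml)) !== null) {
    const body = m[2] || ""; // empty for self-closing (passing) test cases
    const problem = /<(?:failure|error)\b([^>]*)>/.exec(body);
    if (!problem) continue;  // passed or skipped
    failed.push({
      suite: attr(m[1], "classname"),
      name: attr(m[1], "name"),
      message: attr(problem[1], "message"),
    });
  }
  return failed;
}
```

Passing test cases are skipped, so the result holds only what needs explaining.
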
02 — Clusters

Groups by root cause.

UI_CHANGED · TIMING_FLAKE · BACKEND_ERROR · TEST_BUG — with confidence scores from Playwright traces.

cart.spec · checkout · search · login · nav → UI · FLAKE
03 — Verdicts

Posts on the PR.

Risk score · merge recommendation · fix suggestion. In the place reviewers already look. No new tab.

PR #28 · High Risk
0.60 RISK
Do not merge
8 / 17 failed · 4 clusters
✦ Fix suggestion ready · View →
// Features

Everything your team needs to ship confidently.

Clustering & root cause

30 failing tests don't mean 30 problems. QAI groups failures by root cause, then explains exactly why, with confidence scores from Playwright traces.

checkout.spec.ts × 4 · TIMING_FLAKE
product-catalog.spec.ts · UI_CHANGED
cart.spec.ts × 2 · TEST_BUG
✦ AI Analysis
beforeEach times out — checkout page not ready before assertions run.
TIMING_FLAKE · 75% confidence
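
The idea behind "30 failing tests, far fewer problems" can be sketched as grouping failures by a normalized error signature. This is a deliberately naive illustration, not how QAI clusters. QAI is described above as working from Playwright traces, while this sketch looks only at error text, and the `Failure` shape and both function names are assumptions.

```typescript
// Hypothetical input shape: one entry per failing test.
interface Failure {
  test: string;  // e.g. "checkout.spec.ts › beforeEach"
  error: string; // raw error message from the report
}

// Replace run-specific details so that equivalent errors compare equal.
function signature(error: string): string {
  return error
    .toLowerCase()
    .replace(/(["'`]).*?\1/g, "<str>") // quoted selectors, URLs, values
    .replace(/\d+(\.\d+)?/g, "<n>")    // timeouts, counts, prices
    .replace(/\s+/g, " ")
    .trim();
}

// Group test names under the signature of the error they failed with.
function clusterFailures(failures: Failure[]): Map<string, string[]> {
  const clusters = new Map<string, string[]>();
  for (const f of failures) {
    const key = signature(f.error);
    const tests = clusters.get(key) || [];
    tests.push(f.test);
    clusters.set(key, tests);
  }
  return clusters;
}
```

Two timeouts that differ only in selector and duration land in one cluster, while a strict mode violation gets its own.
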

AI fix suggestions

Exact code fix for each failing test — posted directly on the PR. Full context from traces, not generic advice.

checkout.spec.ts › beforeEach · 70% confidence
test.beforeEach(async ({ page }) => {
   await page.goto('/checkout');
+  await page.waitForLoadState('networkidle');
+  await expect(page.getByPlaceholder(/DEM010/i))
+    .toBeVisible();
});

Trends & flakiness

See fail rate over time, flakiness leaderboard, cluster history. Know what's been broken for 3 weeks before your standup.

Fail rate per run (last 30)
Mar 28 → Apr 25 · ↗ Improving

Slack alerts

High-risk PRs fire an alert to your team channel instantly. Risk score, failure count, cluster breakdown, dashboard link.

QAI Agent APP
🔴 High Risk PR detected
PR #78 · Risk: high (0.60)
20/70 tests failed · 4 clusters
Rec: Do not merge — investigate first
View in QAI Platform →

Ask QAI in your PR

Comment @qai-agent on any PR. Get historical context, severity, frequency data, and fix suggestions — right where you work.

@qai-agent is there an increase in failures vs previous PRs?
Yes — dramatic increase. Compared to previous PRs:
  • 11 failed tests this run
  • 10 unique failure patterns · 20 occurrences each
  • All failures first seen today — not pre-existing
  • 0% flaky score → consistent, reproducible

Automated rules

Define conditions — occurrences, fail rate, flakiness score, RCA cause — and fire any action automatically when a cluster matches.

When cluster_occurrences ≥ 5 AND TIMING_FLAKE
🐙 GitHub issue · sent/opened
💬 Slack message · sent/opened
📧 Email · sent/opened
🔗 Webhook · fired
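
A rule like the one above could be modeled as a set of optional conditions plus a list of actions. The sketch below is illustrative only. The cause names come from this page, but every type, field, and function name is hypothetical and does not reflect QAI's actual rule schema.

```typescript
// Cause labels as listed on this page; everything else here is hypothetical.
type Cause = "UI_CHANGED" | "TIMING_FLAKE" | "BACKEND_ERROR" | "TEST_BUG";
type Action = "github_issue" | "slack" | "email" | "webhook";

interface Cluster {
  occurrences: number;
  failRate: number;  // 0..1
  flakiness: number; // 0..1
  cause: Cause;
}

// Every condition is optional; an omitted condition always passes.
interface Rule {
  minOccurrences?: number;
  minFailRate?: number;
  minFlakiness?: number;
  cause?: Cause;
  actions: Action[];
}

// A rule matches only when all of its stated conditions hold (logical AND).
function matches(rule: Rule, c: Cluster): boolean {
  return (
    (rule.minOccurrences === undefined || c.occurrences >= rule.minOccurrences) &&
    (rule.minFailRate === undefined || c.failRate >= rule.minFailRate) &&
    (rule.minFlakiness === undefined || c.flakiness >= rule.minFlakiness) &&
    (rule.cause === undefined || c.cause === rule.cause)
  );
}

// Collect the actions of every matching rule, without duplicates.
function actionsFor(rules: Rule[], c: Cluster): Action[] {
  const fired: Action[] = [];
  for (const rule of rules) {
    if (!matches(rule, c)) continue;
    for (const action of rule.actions) {
      if (fired.indexOf(action) === -1) fired.push(action);
    }
  }
  return fired;
}
```

Expressed this way, "cluster_occurrences ≥ 5 AND TIMING_FLAKE" becomes `{ minOccurrences: 5, cause: "TIMING_FLAKE", actions: [...] }`.
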
// Coverage

Do you know which parts of your app you actually trust?

Not which lines ran — which pages your tests visited, which API calls they made, which paths they actually walked. QAI maps it automatically from Playwright traces.

Coverage Tree

Every page, every path — green vs red. See exactly which flows your tests walk end to end.

Confidence Map

Cell size = test volume, color = risk level. Spot coverage gaps and danger zones at a glance.

Zero instrumentation

Parsed from Playwright traces automatically. Upload and see results — no code changes, no config.

561 unique paths · 77% behavioral coverage · Playwright only
Legend: ≥85% pass · 50–84% · <50%
Pass rate by route: / (0 paths) · signup 70% · login 73% · repos 75% · setup 75% · admin 69% · settings (0 paths) · :id 80% · pulls 88% · clusters 83% · behavioral-cover… 25% · predictions 100% · tests 98% · trends 100% · :slug 74% · runs (0 paths) · :id 74% · :id 95% · api-keys 76% · team 73% · notifications 80% · :slug 75% · rules 70%
// ROI Calculator

How much is broken CI costing you?

Developers on your team
3 devs
CI runs per week
20 runs
Minutes debugging per failure
20m
Average developer hourly rate
$100/hr
* Based on a 60% reduction in debugging time. Adjust the sliders to match your team.
Time wasted on CI
26h / mo
Cost of wasted time
$2,600/mo
QAI cost
$29/mo
Pays for itself in
0.6 days
Estimated ROI with QAI
54×
Estimated monthly savings: $1,560/mo
Start saving — free for 7 days →
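
The savings, ROI, and payback figures above can be reproduced with a few lines of arithmetic. The sketch below takes the wasted hours as an input, because the page does not state how the 26h figure is derived from the sliders. It also assumes a 30-day month for the payback period, and all names are hypothetical.

```typescript
// Inputs taken from the calculator's displayed values; names are hypothetical.
interface RoiInputs {
  hoursWastedPerMonth: number; // 26 in the example above
  hourlyRate: number;          // USD per hour, 100 in the example
  reduction: number;           // fraction of debugging time saved, 0.6
  planCostPerMonth: number;    // USD per month, 29
}

interface RoiEstimate {
  wastedCost: number;  // USD per month lost to CI debugging
  savings: number;     // USD per month recovered
  roi: number;         // savings as a multiple of the plan cost
  paybackDays: number; // days until savings cover the plan cost
}

function estimateRoi(i: RoiInputs): RoiEstimate {
  const wastedCost = i.hoursWastedPerMonth * i.hourlyRate;
  const savings = wastedCost * i.reduction;
  const roi = savings / i.planCostPerMonth;
  // Assumption: a 30-day month when converting to days.
  const paybackDays = (i.planCostPerMonth / savings) * 30;
  return { wastedCost, savings, roi, paybackDays };
}

const example = estimateRoi({
  hoursWastedPerMonth: 26,
  hourlyRate: 100,
  reduction: 0.6,
  planCostPerMonth: 29,
});
```

With the example inputs, `example.wastedCost` is 2,600, the savings come to about 1,560, the ROI rounds to 54, and the payback period rounds to 0.6 days, matching the figures shown.
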
// Ask QAI

Your PR has a new reviewer.

Ask anything about your failures. QAI answers using your full history — not just the current run.

📊 Historical context

“How long has this been failing?” — First seen March 18th. 11 occurrences. Never flaky — consistently broken.

📉 Trend comparison

“Is this worse than previous PRs?” — Previous: 7 occurrences. This PR: 20. Nearly 3× increase. Breaking change introduced.

🧰 Fix on demand

“What's the fastest fix?” — Exact code generated. One click to apply. No dashboard required.

github.com · PR #31 · comment thread
@tomerli · just now
@qai-agent is there an increase in failures compared to previous PRs?
QAI Agent · bot · 4s
Yes, there's a dramatic increase in failures compared to previous PRs.
Previous PRs: 7 occurrences of these patterns.
This PR: 20 occurrences each — nearly 3× increase.
UI_CHANGED failures (4 tests) — 85% confidence
TIMING_FLAKE failures (4 tests) — 70% confidence
0% flaky score — consistent, reproducible failures
Verdict: Block merge — systematic failures introduced by this PR.
// Get started

Ship green. Stay green.

Drop QAI into any GitHub Actions workflow. The next failed run gets clustered, explained, and resolved before standup.

# Add after your test step
- name: QAI Agent
  uses: useqai/qai-agent@v1
  if: always()
  with:
    junit-path: 'test-results/results.xml'
— or —
Not ready to connect your real pipeline? Fork demo-shop → QAI is already wired up. See what it does first.