Data integrity

Every number on CricketStudio is re-derived from ball-by-ball data and checked against an independent source before it ships. Here's how.

Data sources. Three licensed feeds, all ball-by-ball: Licensed live feed (IPL 2026 live) · Cricsheet CC BY 3.0 (IPL historical: 18 seasons, 2007/08–2025) · Cricsheet CC BY 3.0 (MLC: 2023–2026). Total corpus: 1,317 matches · 312,309 deliveries. Phase A (IPL 2026 season) is complete as of June 2026; the integrity checks below apply across all three feeds.
Sample-size floors (publicly disclosed). Batting claims require ≥30 balls faced. Bowling claims require ≥15 balls bowled. Below these floors a claim is excluded from production. Every page displays the sample size that backs each claim.
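
The floors above can be expressed as a small filter. This is a minimal sketch, not the production code: the `Claim` record, its field names, and the `publishable` helper are hypothetical and exist only for illustration. Only the two thresholds (30 balls faced, 15 balls bowled) come from the text.

```python
from dataclasses import dataclass

# Floors as stated in the text above.
MIN_BALLS_FACED = 30   # batting claims
MIN_BALLS_BOWLED = 15  # bowling claims


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; field names are illustrative assumptions."""
    player: str
    kind: str          # "batting" or "bowling"
    sample_balls: int  # balls faced (batting) or balls bowled (bowling)


def meets_floor(claim: Claim) -> bool:
    """Return True if the claim's sample reaches the disclosed floor."""
    if claim.kind == "batting":
        return claim.sample_balls >= MIN_BALLS_FACED
    if claim.kind == "bowling":
        return claim.sample_balls >= MIN_BALLS_BOWLED
    raise ValueError(f"unknown claim kind: {claim.kind!r}")


def publishable(claims: list[Claim]) -> list[Claim]:
    """Keep only claims whose sample size meets the floor for their kind."""
    return [c for c in claims if meets_floor(c)]
```

Raising on an unknown claim kind, rather than silently passing it through, keeps a new claim type from bypassing the floor by accident.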

Live check status

These checks re-run on every build and on every scheduled quality-gates cron job.

Check | Status | Detail
Run closure (ball-walk vs official scoreboard) | warn | 127/146 innings reconcile exactly; 19 fall short by a total of 76 runs (upstream feed gap, not a computation error).
Legal-ball closure (deliveries vs overs) | pass | All 146 innings: legal deliveries match the official over count.
Ball-event vocabulary (closed enum) | pass | Every ball-event name matches a known, classified shape.
Statistical plausibility (strike rate / economy) | pass | No implausible aggregates across the player corpus.

What we check

Run closure. For every completed innings we add up the runs off the bat (from the ball-by-ball stream) plus extras (from the official scoreboard) and confirm the sum equals the scoreboard total. A shortfall is reported as a warning, because it points to deliveries missing from the upstream feed. If our figure ever exceeds the official total, which is the signature of extras being mis-credited to a batter, the build fails.
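
The run-closure rule can be sketched as a three-way classification. The function name, the `Closure` enum, and the input shapes below are assumptions made for illustration; the mapping of a shortfall to a warning and an excess to a build failure follows the behavior described on this page.

```python
from enum import Enum


class Closure(Enum):
    EXACT = "exact"  # derived total equals the scoreboard total
    SHORT = "short"  # derived total is lower: warn (upstream feed gap)
    OVER = "over"    # derived total is higher: fail the build


def run_closure(bat_runs_per_ball, scoreboard_extras, scoreboard_total):
    """Classify one innings and return (status, gap).

    bat_runs_per_ball: runs off the bat for each delivery in the stream.
    scoreboard_extras: extras as reported by the official scoreboard.
    gap is scoreboard_total minus the derived total, so it is positive
    when the ball-by-ball stream is short.
    """
    derived = sum(bat_runs_per_ball) + scoreboard_extras
    gap = scoreboard_total - derived
    if gap == 0:
        return Closure.EXACT, gap
    if gap > 0:
        return Closure.SHORT, gap
    return Closure.OVER, gap
```

For example, `run_closure([1, 4, 0], 3, 14)` derives 8 runs against an official 14, so it reports `Closure.SHORT` with a gap of 6.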

Legal-ball closure. The count of legal deliveries we walk must equal the number of balls implied by the over count on the official scoreboard (six per completed over, plus any balls in a partial over). This catches wides or no-balls being mistakenly counted as legal balls, which would distort strike rates and economy.
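
A minimal sketch of this check follows. It assumes the scoreboard reports overs in the usual "overs.balls" notation (so "19.4" means 19 overs and 4 balls) and that each delivery is a dict with optional `wides` and `noballs` counts. Both shapes, and all function names, are illustrative assumptions rather than the feed's actual schema.

```python
BALLS_PER_OVER = 6


def overs_to_balls(overs: str) -> int:
    """Convert scoreboard overs such as "19.4" into a legal-ball count."""
    whole, _, part = overs.partition(".")
    balls = int(part) if part else 0
    if not 0 <= balls < BALLS_PER_OVER:
        raise ValueError(f"invalid ball component in overs: {overs!r}")
    return int(whole) * BALLS_PER_OVER + balls


def count_legal(deliveries) -> int:
    """Count deliveries that are neither wides nor no-balls."""
    return sum(
        1 for d in deliveries
        if not d.get("wides") and not d.get("noballs")
    )


def legal_ball_closure(deliveries, overs: str) -> bool:
    """True when the walked legal balls match the official over count."""
    return count_legal(deliveries) == overs_to_balls(overs)
```

Note that the decimal in "19.4" is a ball count, not a fraction, so parsing it as a float and multiplying by six would give the wrong answer.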

Closed ball-event vocabulary. Every ball-event type the feed emits is matched against a known, classified set. A new, unrecognized event shape is treated as a potential bug and flagged before it can silently corrupt an aggregate.
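
One way to picture a closed vocabulary is a fixed set plus a scan that reports anything outside it. The event names in `KNOWN_EVENTS` below are hypothetical placeholders, not the feed's real vocabulary, and `unknown_events` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical event names; the real feed's vocabulary is not shown here.
KNOWN_EVENTS = frozenset({
    "run", "boundary", "wide", "no_ball", "bye", "leg_bye", "wicket",
})


def unknown_events(events) -> Counter:
    """Count every event name that falls outside the known set."""
    return Counter(e for e in events if e not in KNOWN_EVENTS)


def vocabulary_is_closed(events) -> bool:
    """True when every event name is already known and classified."""
    return not unknown_events(events)
```

Returning counts rather than a bare boolean makes the flag actionable: a reviewer can see which unrecognized shape appeared and how often.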

Statistical plausibility. Aggregates outside cricket's physical range (e.g. an implausible strike rate or economy over a real sample) are flagged for review.
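
As a rough sketch, a plausibility check computes the standard aggregates (strike rate is runs per 100 balls faced; economy is runs conceded per six-ball over) and compares them with upper bounds. The bounds below are placeholder assumptions chosen for illustration, not the thresholds this site uses, and the function names are hypothetical.

```python
# Placeholder bounds (assumptions, not the site's actual thresholds).
MAX_STRIKE_RATE = 400.0  # runs per 100 balls
MAX_ECONOMY = 36.0       # runs per over


def strike_rate(runs: int, balls_faced: int) -> float:
    """Runs scored per 100 balls faced."""
    return 100.0 * runs / balls_faced


def economy(runs_conceded: int, balls_bowled: int) -> float:
    """Runs conceded per six-ball over."""
    return 6.0 * runs_conceded / balls_bowled


def is_plausible_batting(runs: int, balls_faced: int) -> bool:
    """Reject negative runs, empty samples, and out-of-range strike rates."""
    if balls_faced <= 0 or runs < 0:
        return False
    return strike_rate(runs, balls_faced) <= MAX_STRIKE_RATE


def is_plausible_bowling(runs_conceded: int, balls_bowled: int) -> bool:
    """Reject negative runs, empty samples, and out-of-range economy."""
    if balls_bowled <= 0 or runs_conceded < 0:
        return False
    return economy(runs_conceded, balls_bowled) <= MAX_ECONOMY
```

For instance, 45 runs off 30 balls is a strike rate of 150.0, and 24 runs conceded from 18 balls is an economy of 8.0; both pass under these placeholder bounds.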

Honest about the source

Across 146 completed innings, 127 reconcile to the official total exactly, and 19 fall short by 76 runs in total (an average of 4 per affected innings) because the upstream ball-by-ball feed itself omitted a handful of deliveries. We surface that gap rather than papering over it. No innings over-credits a batter.

This is the same data the operator dashboard reads; there is no separate marketing number. Methodology overview: /about. Ball-event vocabulary and plausibility: 0 high-severity issues, 0 flagged for review.