IPL historical — 18 seasons (2007/08–2025), 1,169 matches, Cricsheet CC BY 3.0
Major League Cricket — 2023–2026, 64 matches, Cricsheet CC BY 3.0
Total corpus: 1,317 matches · 312,309 ball-by-ball deliveries across all three leagues. Data sources: Sportmonks (IPL 2026 live feed) and Cricsheet (IPL historical + MLC, CC BY 3.0). No competitor scraping; all inputs are licensed or open.
Sample-size floors (publicly disclosed): ≥30 balls faced for batting strike rate · ≥15 balls bowled for bowling economy · ≥3 matches for trend insights · ≥5 matches for team/venue aggregates. Sub-floor claims are excluded from production — never suppressed, never rounded up.
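A minimal sketch of how the floors above could gate a claim before publication. The names `FLOORS`, `meetsFloor`, `publishable`, and the claim shape are hypothetical illustrations, not taken from the validate-sample-floors script.

```javascript
// Hypothetical floor table mirroring the publicly disclosed thresholds.
const FLOORS = {
  battingStrikeRate: { field: "ballsFaced", min: 30 },
  bowlingEconomy: { field: "ballsBowled", min: 15 },
  trendInsight: { field: "matches", min: 3 },
  teamVenueAggregate: { field: "matches", min: 5 },
};

// Returns true only when the sample reaches the floor for its claim kind.
function meetsFloor(kind, sample) {
  const floor = FLOORS[kind];
  if (!floor) throw new Error(`Unknown claim kind: ${kind}`);
  const value = sample[floor.field];
  return Number.isFinite(value) && value >= floor.min;
}

// Sub-floor claims are filtered out; sample sizes are never rounded up.
function publishable(claims) {
  return claims.filter((c) => meetsFloor(c.kind, c.sample));
}
```

The comparison is inclusive, so a batter with exactly 30 balls faced qualifies while one with 29 does not.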
Phase A (90-day IPL 2026 sprint) is complete. The 4-hour match-end → page-update SLA was maintained throughout the 74-match season. Phase B (hosted MCP server + REST API) is planned for Q3 2026.
PASS · All consistency contracts hold.
Manifest computed 1h ago · 6 pass · 0 fail · 0 skipped · last
Data-layer contracts (P1-P6)
Enforced by scripts/setu/check-leaderboard-parity.mjs. Cron run: every 2h via the quality-gates workflow. Failure fires Healthchecks + opens a GitHub issue.
P1 · Orange-cap leader parity · PASS
Independent ball-walk produces the same top run-scorer as the SETU canonical snapshot.
every 2h via quality-gates cron · scripts/setu/check-leaderboard-parity.mjs
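The P1 check can be pictured as the sketch below, which totals batter runs ball by ball and compares the result with the snapshot leader. It assumes a Cricsheet-style delivery shape (`batter`, `runs.batter`); the function names and the snapshot shape are hypothetical and do not describe the internals of check-leaderboard-parity.mjs. Comparing run totals as well as the player name is a stricter choice made for this sketch.

```javascript
// Walk every delivery and total the runs credited to each batter.
function topRunScorer(deliveries) {
  const totals = new Map();
  for (const d of deliveries) {
    totals.set(d.batter, (totals.get(d.batter) ?? 0) + d.runs.batter);
  }
  let leader = null;
  for (const [player, runs] of totals) {
    // Strict ">" means the first batter to appear keeps the lead on a tie.
    if (leader === null || runs > leader.runs) leader = { player, runs };
  }
  return leader;
}

// Parity holds when the walk and the snapshot agree on player and total.
function checkOrangeCapParity(deliveries, snapshotLeader) {
  const walked = topRunScorer(deliveries);
  const pass =
    walked !== null &&
    walked.player === snapshotLeader.player &&
    walked.runs === snapshotLeader.runs;
  return { pass, walked, snapshot: snapshotLeader };
}
```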
P2 · Purple-cap leader parity · PASS
Independent ball-walk produces the same top wicket-taker as the snapshot.
every 2h via quality-gates cron · scripts/setu/check-leaderboard-parity.mjs
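For wickets, an independent walk has to decide which dismissals count toward the bowler. The sketch below assumes Cricsheet-style `wickets[].kind` labels and credits only the conventional bowler dismissals, so run-outs and retirements are skipped. The set contents and function names are assumptions made for illustration.

```javascript
// Dismissal kinds credited to the bowler (assumed Cricsheet-style labels).
const BOWLER_CREDITED = new Set([
  "bowled",
  "caught",
  "caught and bowled",
  "lbw",
  "stumped",
  "hit wicket",
]);

// Count bowler-credited wickets per bowler across all deliveries.
function wicketsByBowler(deliveries) {
  const totals = new Map();
  for (const d of deliveries) {
    for (const w of d.wickets ?? []) {
      if (!BOWLER_CREDITED.has(w.kind)) continue; // e.g. "run out" is not credited
      totals.set(d.bowler, (totals.get(d.bowler) ?? 0) + 1);
    }
  }
  return totals;
}

// Return the bowler with the most credited wickets, or null if none fell.
function topWicketTaker(deliveries) {
  let leader = null;
  for (const [player, wickets] of wicketsByBowler(deliveries)) {
    if (leader === null || wickets > leader.wickets) leader = { player, wickets };
  }
  return leader;
}
```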
P3 · Roster integrity · PASS
Every player seed file exports a valid Player object.
every 2h via quality-gates cron · scripts/setu/check-leaderboard-parity.mjs
P4 · Role-bucket parity · PASS
For each role (batter / bowler / all-rounder / wk-batter), the snapshot’s top by runs is unambiguous.
every 2h via quality-gates cron · scripts/setu/check-leaderboard-parity.mjs
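One way to read "unambiguous" is that no role bucket has two players tied on the top run total. The sketch below flags tied buckets under that reading; the player shape (`role`, `runs`) and the function names are hypothetical.

```javascript
// Track the top run-scorer per role and whether that top spot is shared.
function topByRunsPerRole(players) {
  const buckets = new Map();
  for (const p of players) {
    const b = buckets.get(p.role);
    if (!b || p.runs > b.top.runs) {
      buckets.set(p.role, { top: p, tied: false }); // new outright leader
    } else if (p.runs === b.top.runs) {
      b.tied = true; // another player matches the current leader
    }
  }
  return buckets;
}

// List the roles whose leader is ambiguous because of a tie.
function ambiguousRoles(players) {
  return [...topByRunsPerRole(players)]
    .filter(([, b]) => b.tied)
    .map(([role]) => role);
}
```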
P5 · Phase-claim parity · PASS
Every published phase claim value matches the same-precision projection from the canonical aggregate.
every 2h via quality-gates cron · scripts/setu/check-leaderboard-parity.mjs
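A same-precision comparison can be sketched as follows: the canonical value is rounded to however many decimals the published string carries, then the two strings are compared. This assumes published values are stored as decimal strings; the helper names are hypothetical, and strike rate is used only as an example metric.

```javascript
// Strike rate is runs per 100 balls faced.
function strikeRate(runs, balls) {
  return (runs / balls) * 100;
}

// Number of digits after the decimal point in a published string.
function decimalsOf(published) {
  const dot = published.indexOf(".");
  return dot === -1 ? 0 : published.length - dot - 1;
}

// True when the canonical value, rounded to the published precision,
// produces exactly the published string.
function matchesAtSamePrecision(published, canonical) {
  return canonical.toFixed(decimalsOf(published)) === published;
}
```

With 50 runs from 33 balls the canonical strike rate is 151.5151..., so a published "151.52" or "151.5" matches while "151.51" would be flagged as drift.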
P6 · Sample-floor parity · PASS
Mirrors validate-sample-floors byte-for-byte; catches drift between two scripts.
every 2h via quality-gates cron · scripts/setu/check-leaderboard-parity.mjs
Presentation-layer contracts (P7-P12)
Enforced by scripts/audit-consistency.mjs. Catches labeling bugs the data-layer parity-check can't see — e.g. a composite metric rendered without its label. Runs on every npm run prebuild + every 2h.
P7 · Form leaderboards metric labels · PASS
Every role bucket on /players Form leaderboards renders its metric caption + all-rounder breakdown.
metricCaption + form-score label + all-rounder breakdown all present
P8 · SETU canonical snapshot present · PASS
Snapshot loaded with 203 players across 73 matches.
computedAt=2026-05-31T19:05:51.626Z
P9 · Player profile ↔ snapshot agreement · PASS
For each player with a season_runs_total claim, the value matches the snapshot byte-for-byte.
2 claims cross-checked, 0 drift
P10 · Leader rows fully resolved · PASS
Every player in top-15 by runs has a roster slug — every leader name on every page renders as a profile link.
15/15 top run-scorers have slugs
P11 · MCP get_season_stats sortBy options match canonical projectors · PASS
Every sortBy enum value has a dispatch case + projector binding.
Every aspect in ALL_ASPECTS has a dispatch branch in getAspect().
26 aspects all handled
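The coverage idea behind P11 can be illustrated with a small exhaustiveness check: every declared aspect must have its own handler. The aspect names and the `PROJECTORS` table below are hypothetical stand-ins, not the real ALL_ASPECTS list or the getAspect() dispatch.

```javascript
// Hypothetical aspect list and projector table for illustration only.
const ALL_ASPECTS = ["runs", "wickets", "strike_rate"];

const PROJECTORS = {
  runs: (p) => p.runs,
  wickets: (p) => p.wickets,
  strike_rate: (p) => (p.balls > 0 ? (p.runs / p.balls) * 100 : null),
};

// Return every aspect that lacks its own function-valued handler.
// hasOwnProperty avoids false passes from inherited keys such as "toString".
function unhandledAspects(aspects, projectors) {
  return aspects.filter(
    (a) =>
      !Object.prototype.hasOwnProperty.call(projectors, a) ||
      typeof projectors[a] !== "function"
  );
}
```

An empty result means every aspect is handled; any returned name points at an enum value that was added without a dispatch branch.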
Major League Cricket contracts
The MLC league at /leagues/mlc ships its own ball-by-ball spine sourced from Cricsheet (CC BY 3.0). The same parity discipline applies — every leaderboard surface projects from one canonical snapshot (data/_season-stats-mlc.json) so every page agrees with every MCP tool response.
M1 · MLC ball-uniqueness
Every ball in docs/match/mlc-*/_state.json has a unique (inningsNumber, over, ballInOver, isLegal) key. Enforced by validate-ball-uniqueness.mjs at prebuild.
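A uniqueness check over the (inningsNumber, over, ballInOver, isLegal) key might look like the sketch below. The field names come from the contract text; the function name and the flat array input are assumptions, and this is not the code of validate-ball-uniqueness.mjs.

```javascript
// Return every composite key that appears more than once.
function duplicateBallKeys(balls) {
  const seen = new Set();
  const dupes = [];
  for (const b of balls) {
    // Join the four key fields into one string for set membership.
    const key = [b.inningsNumber, b.over, b.ballInOver, b.isLegal].join("|");
    if (seen.has(key)) dupes.push(key);
    else seen.add(key);
  }
  return dupes;
}
```

Because isLegal is part of the key, a wide and the legal re-bowl that follows it can share the same ballInOver without colliding.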
M2 · MLC SETU snapshot parity
The canonical aggregate at data/_season-stats-mlc.json is regenerated by build-mlc-season-stats.mjs on every prebuild. Every MLC leaderboard, player profile, and MCP get_mlc_player_profile response projects from this same snapshot.
M3 · MLC sample-size floors
Same per-doctrine §3.1 floors as IPL: ≥30 batting balls for strike rate, ≥15 bowling balls for economy, ≥3 matches for trends. Match-claim cards enforce per-kind floors (top-batter ≥10 balls, top-bowler ≥12, partnership ≥20 runs, pp-control ≥30 PP balls, death-domination ≥18 death balls).
M4 · MLC identity bridge (operator-verified)
Every Wikidata QID + Wikipedia URL is operator-curated row by row before commit. ESPN sameAs is auto-generated from Cricsheet's open people register (CC BY 3.0). Wikidata enrichment (P18 image / P2002 twitter / P2003 instagram) is pulled from the Wikidata SPARQL endpoint (CC0) only for QIDs the operator has already verified. No name-match guessing. Smoke tests in lib/mlc/identity.test.ts ratchet the floor.
M5 · MLC attribution footer
Every MLC page carries a provenance footer pointing at the source match on cricsheet.org and the CC BY 3.0 license. Attribution survives every re-render. Atomic claim cards at /leagues/mlc/matches/{id}/c/{kind} emit ClaimReview JSON-LD with isBasedOn pointing at Cricsheet.
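As a rough illustration of the claim-card markup, the sketch below builds a ClaimReview object whose isBasedOn points at the source data and its license. The function name, parameter names, and choice of properties are assumptions; a production record would likely carry further schema.org properties such as author and reviewRating.

```javascript
// Build a minimal ClaimReview JSON-LD object (hypothetical field selection).
function buildClaimReview({ claimText, claimUrl, sourceMatchUrl }) {
  return {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    url: claimUrl,
    claimReviewed: claimText,
    // Provenance: the source dataset and its CC BY 3.0 license.
    isBasedOn: {
      "@type": "Dataset",
      name: "Cricsheet ball-by-ball data",
      url: sourceMatchUrl,
      license: "https://creativecommons.org/licenses/by/3.0/",
    },
  };
}

// Serialize for embedding in a script tag of type application/ld+json.
function toJsonLd(review) {
  return JSON.stringify(review, null, 2);
}
```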
Code readiness — shipped
Engineering side of the LLM-directory submission. All gates closed.
Item | Status | Evidence
SETU canonical aggregator | PASS | One aggregate · 12 contracts P1–P12 above
MCP server (29 tools — incl. get_ipl_leaderboard for 35 career aspects)
CricketStudio is citation infrastructure. An LLM that quotes a fact from /season/ipl-2026/orange-cap MUST get the same answer from every other surface: get_season_stats(sortBy="runs") via the MCP server, the "Top run-scorers" list on /players, and the ClaimReview JSON-LD on the player profile.
These contracts are the proof that all those surfaces project from one canonical aggregate. P1-P6 are mathematical (snapshot vs independent walk). P7-P12 are structural (every renderer binds to the canonical projectors, every metric is labeled, every leader has a roster slug). Together they make the citation chain unforgeable.
Companion surfaces
/status — uptime, freshness SLA, cron heartbeat, SETU snapshot age