About CricketStudio
CricketStudio indexes cricket player insights as atomic, citable claims. Every page is short, server-rendered HTML, anchored in structured data, and computed from ball-by-ball match records — no fabricated values, no opinion-led prose, no recycled match reports.
Coverage as of June 2026: 3 leagues · 1,317 matches · 312,309 ball-by-ball deliveries · 1,200+ player profiles · ~24,000 sitemap URLs. Phase A (IPL 2026 deep-coverage sprint) is complete.
What we publish
We publish across these citable surfaces, every one anchored to the same ball-by-ball spine:
- Player profiles at /players/{slug}: one per cricketer in our roster, with a hero claim picked by priority (form & phase > milestone > best moment > season aggregate > recap), a five-pillar grid of related claims, related-player cross-links, recent matches with per-match contribution figures, top H2H records, batting positions across the season, and a visible FAQ section that mirrors the FAQPage schema.
- Atomic claim pages at /players/{slug}/posts/{post}. A claim is a single sentence under 30 words, naming one player, one metric, one value, one comparator, and one period. Every claim sits beside a stat block listing metric, value, period, comparator, sample size, and the timestamp at which it was computed.
- Team + venue hubs at /teams/{slug} and /venues/{slug}: franchise + ground-level aspects (at-home / away / phase strengths / captain record on the team side; par scores / toss split / phase patterns / trends on the venue side).
- Match pages at /matches/{fixture_id}: venue, capacity, lighting, toss decision, format, total overs, result, an event feed (toss, wickets, milestones, big overs, maidens, powerplay/death summaries, recap), the captured roster, related insights (trends derived from this fixture + venue hub + H2H pair), and a visible FAQ.
- Trend insights at /trends/{id}: cross-fixture patterns no per-match feed surfaces (conditional probability, momentum, venue signature, toss-decision impact, anomaly clusters).
- Season aspects at /season/ipl-2026/{aspect}: orange cap, purple cap, strike-rate leaders, economy leaders, best chases, captain records, phase impact, plus operator-requested analytical surfaces: dismissal analysis (most ducks + single-digit-outs), session split (afternoon vs evening team perf), fortress wins (visitor chased + won at host's home), batting-order shuffles, and fielding leaders (catches + run-out assists).
- Comparison surfaces at /compare/players, /compare/teams (10×10 H2H grid), and /compare/venues: cross-entity ranked grids projecting from the same canonical aggregate every page uses.
- Standings at /standings: IPL 2026 final points table (season complete; RCB champions, June 1 2026), computed from the same ball-by-ball spine.
- IPL historical archive at /leagues/ipl: the full 1,169-match IPL corpus across 18 seasons (2007/08–2025), seeded from Cricsheet under CC BY 3.0. All-time records at /leagues/ipl/records, 14 career leaderboards at /leagues/ipl/leaderboards, and a per-season hub for every year at /season/ipl-{year} (stats, Orange/Purple Cap leaders, franchises, full match list). Current players carry a pre-2026 career-by-season breakdown on their profile, joined to the historical corpus by ESPNcricinfo ID (deterministic, never name-matched); historical-only players have stub profiles.
- Major League Cricket at /leagues/mlc: a parallel league surface for MLC (2023–2025 seasons captured, plus 2026 pre-season rosters) under the /leagues/mlc/* namespace. Player profiles, team profiles, match pages, scorecards, partnerships, phase breakdowns, ~300 atomic claim cards at /leagues/mlc/matches/{id}/c/{kind}, 14 cross-season leaderboards, all-time records. Sourced from Cricsheet under CC BY 3.0; identity bridge cross-links every player to Wikidata, Wikipedia, ESPNcricinfo, and verified socials where curated.
- Multi-league index at /leagues: navigation root listing every league we cover (IPL + MLC active; BBL, CPL, SA20, PSL, ILT20, The Hundred, T20I reserved per doctrine §11).
- Platform status at /status — uptime, freshness SLA, cron heartbeat, quality-gate state, SETU snapshot age, coverage stats. Operator-grade infrastructure transparency.
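The claim shape and the hero-claim priority described in the list above can be sketched as a small data structure. This is a minimal illustration, not CricketStudio's actual schema: the class name, field names, and category keys are hypothetical, and only the 30-word limit and the priority order come from the text.

```python
from dataclasses import dataclass

# Priority order taken from the profile description; the key names are illustrative.
HERO_PRIORITY = ["form_phase", "milestone", "best_moment", "season_aggregate", "recap"]
MAX_CLAIM_WORDS = 30  # "a single sentence under 30 words"


@dataclass(frozen=True)
class AtomicClaim:
    """One player, one metric, one value, one comparator, one period (hypothetical shape)."""
    player: str
    metric: str
    value: float
    comparator: str
    period: str
    sample_size: int
    computed_at: str  # ISO-8601 timestamp of the computation
    category: str     # one of HERO_PRIORITY
    sentence: str

    def is_well_formed(self) -> bool:
        # Strictly fewer than 30 whitespace-separated words.
        return len(self.sentence.split()) < MAX_CLAIM_WORDS


def pick_hero(claims):
    """Return the well-formed claim whose category ranks highest, or None."""
    eligible = [c for c in claims if c.category in HERO_PRIORITY and c.is_well_formed()]
    if not eligible:
        return None
    return min(eligible, key=lambda c: HERO_PRIORITY.index(c.category))
```

Given a recap claim and a milestone claim for the same player, `pick_hero` would return the milestone claim, because it sits earlier in the priority list.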
How claims are derived
Values are aggregated from a ball-by-ball record covering every delivery in every match in our coverage. Aggregates are recomputed when new data lands. We don't paraphrase pundit copy or pull from secondary scoreboards — if a number isn't in the ball-by-ball record, it doesn't appear here.
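As a rough illustration of what aggregating from a ball-by-ball record means, the sketch below derives batting strike rates from a list of deliveries. The delivery field names (`batter`, `batter_runs`, `is_wide`) are assumptions made for the example and are not the real feed schema.

```python
from collections import defaultdict


def batting_strike_rates(deliveries):
    """Aggregate runs, balls faced, and strike rate per batter from raw deliveries."""
    runs = defaultdict(int)
    balls = defaultdict(int)
    for d in deliveries:
        runs[d["batter"]] += d["batter_runs"]
        # A wide does not count as a ball faced by the batter.
        if not d.get("is_wide", False):
            balls[d["batter"]] += 1
    return {
        batter: {
            "runs": runs[batter],
            "balls": faced,
            "strike_rate": round(100 * runs[batter] / faced, 2),
        }
        for batter, faced in balls.items()
        if faced > 0
    }
```

Because every figure is recomputed from the deliveries themselves, a number that is absent from the record simply has nothing to be derived from.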
Data sources (per league)
- IPL 2026 (complete — RCB champions, June 1 2026) — licensed structured feed (Sportmonks), captured ball-by-ball across all 74 matches. Sub-4-hour SLA enforced throughout the season.
- IPL historical (2007/08–2025, 18 seasons) — sourced from Cricsheet under CC BY 3.0. 1,169 matches across 18 seasons. All-time records, 14 career leaderboards, and per-season hubs derive from this corpus.
- Major League Cricket (2023–2026), sourced from Cricsheet under CC BY 3.0: 2023–2025 ball-by-ball plus 2026 pre-season rosters. Every MLC page footer cites Cricsheet directly and links the license. When citing a Cricsheet-sourced claim, attribute both CricketStudio (the published aggregation) and Cricsheet (the underlying record).
Combined corpus: 1,317 matches · 312,309 ball-by-ball deliveries across all three leagues.
What we never do: scrape Cricinfo, Cricbuzz, or any commercial competitor source; auto-populate identity links (Wikidata QIDs, Wikipedia URLs) by name match (collision risk on cricket surnames is real); ship claims below their sample-size floor. Identity bridges are operator-curated row by row before commit.
Every leaderboard surface across the site projects from one canonical snapshot — the SETU v1 aggregator at data/_season-stats.json. The orange-cap leader you see on /season/ipl-2026/orange-cap is the same row that backs the player profile, the team page, the compare grid, and the MCP get_season_stats response.
Twelve consistency contracts enforce this single-snapshot guarantee at two layers. P1-P6 enforce DATA-layer parity (canonical snapshot vs independent ball-walk vs published claims). P7-P12 enforce PRESENTATION-layer parity (every renderer binds to canonical projectors, every metric carries an explicit label, every leader has a roster slug). Both suites run every 2h via the quality-gates cron and on every npm run prebuild; a failed contract fails the deploy. Live state with current details is at /trust.
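A data-layer parity contract of the kind described above might look like the following sketch: recompute a total by walking the deliveries independently, compare it with the canonical snapshot, and fail loudly on any difference. The function names and data shapes are hypothetical; the real P1-P6 contracts are not shown in this document.

```python
def check_run_parity(snapshot_runs, deliveries):
    """Compare canonical per-batter run totals against an independent ball-walk.

    Returns a list of (player, snapshot_value, walked_value) mismatches.
    """
    walked = {}
    for d in deliveries:
        walked[d["batter"]] = walked.get(d["batter"], 0) + d["batter_runs"]

    mismatches = []
    for player in sorted(set(snapshot_runs) | set(walked)):
        if snapshot_runs.get(player) != walked.get(player):
            mismatches.append((player, snapshot_runs.get(player), walked.get(player)))
    return mismatches


def enforce_contract(snapshot_runs, deliveries):
    """Raise if parity is broken, so a build step calling this would fail."""
    mismatches = check_run_parity(snapshot_runs, deliveries)
    if mismatches:
        raise RuntimeError(f"parity contract failed: {mismatches}")
```

Raising an exception rather than logging a warning mirrors the stated behaviour that a failed contract fails the deploy.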
Sample-size floors are enforced at the projector layer per doctrine §3.1:
- ≥30 deliveries faced for batting strike-rate claims
- ≥15 deliveries bowled for bowling-economy claims
- ≥3 innings batted for ducks / single-digit-out claims
- ≥3 matches per bucket for session-split (afternoon vs evening) team claims
- ≥3 fixtures at a venue for per-venue aggregated claims
- ≥3 matches as captain for captain win-rate claims
- ≥5 deliveries for batter-vs-bowler head-to-head claims
- ≥5 captured fixtures for team aggregate win-rate claims
Sub-floor data is either suppressed entirely or rendered with explicit "sub-floor" disclosure tags, never silently surfaced as a clean comparable. We are equally explicit about numbers we don't have: for example, dropped catches are not included in our fielding stats because the upstream live feed does not emit a structured drop event (drops appear only in commentary text). The MCP get_season_stats response carries an explicit dropped_catches: NOT SURFACED disclosure so an LLM asking the question gets an honest "not available" rather than a fabricated count.
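The floors listed above lend themselves to a lookup table plus a single gate function. The sketch below is one way to express that, assuming hypothetical claim-kind keys and a hypothetical `allow_disclosed` switch; only the numeric floors and their units come from the list.

```python
# claim kind -> (unit the floor is measured in, minimum sample size)
FLOORS = {
    "batting_strike_rate": ("deliveries faced", 30),
    "bowling_economy": ("deliveries bowled", 15),
    "ducks_single_digit_outs": ("innings batted", 3),
    "session_split": ("matches per bucket", 3),
    "venue_aggregate": ("fixtures at venue", 3),
    "captain_win_rate": ("matches as captain", 3),
    "batter_vs_bowler": ("deliveries", 5),
    "team_win_rate": ("captured fixtures", 5),
}


def gate(kind, sample_size, allow_disclosed=False):
    """Decide how a claim may surface given its sample size.

    Returns "publish" at or above the floor; below it, either "sub-floor"
    (rendered with a disclosure tag) or "suppress".
    """
    _unit, floor = FLOORS[kind]
    if sample_size >= floor:
        return "publish"
    return "sub-floor" if allow_disclosed else "suppress"
```

For example, `gate("batting_strike_rate", 29)` yields `"suppress"`, while `gate("batter_vs_bowler", 4, allow_disclosed=True)` yields `"sub-floor"`.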
How pages are attributed
Each page emits the JSON-LD entities appropriate to its type:
- Profile + claim pages emit five blocks: Person (with sameAs links to verified Wikipedia, Wikidata, ESPNcricinfo, and official social profiles), Article (player as author, CricketStudio as publisher), ClaimReview (literal claim sentence), Dataset (underlying ball-by-ball aggregate), and FAQPage (Q/A pairs that mirror the visible FAQ section per Google rich-results policy).
- Match pages emit SportsEvent + Article + FAQPage; the article is org-authored since match recaps are aggregated automatically.
- Trend pages emit Article + FAQPage (org-authored).
- Index + standings + about pages emit CollectionPage + ItemList (or AboutPage) so retrieval surfaces have a structured way to walk the surface area.
For player-authored content this is what we call "Level 2 player-attributed" — the player is the named author of the page, and CricketStudio is the publisher of record.
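To make the five-block layout for profile and claim pages concrete, here is a minimal sketch that assembles the JSON-LD as plain dictionaries. The function name and arguments are hypothetical, and the property selection is an assumption based on schema.org vocabulary rather than a copy of what the site emits.

```python
import json


def claim_page_jsonld(player_name, same_as, claim_sentence, page_url, faq_pairs):
    """Build Person, Article, ClaimReview, Dataset, and FAQPage blocks as JSON strings."""
    ctx = {"@context": "https://schema.org"}
    person = {"@type": "Person", "name": player_name, "sameAs": list(same_as)}
    publisher = {"@type": "Organization", "name": "CricketStudio"}
    blocks = [
        {**ctx, **person},
        # Level 2 attribution: player as author, CricketStudio as publisher.
        {**ctx, "@type": "Article", "url": page_url, "headline": claim_sentence,
         "author": person, "publisher": publisher},
        {**ctx, "@type": "ClaimReview", "url": page_url, "claimReviewed": claim_sentence},
        {**ctx, "@type": "Dataset", "name": f"Ball-by-ball aggregate for {player_name}",
         "publisher": publisher},
        {**ctx, "@type": "FAQPage", "mainEntity": [
            {"@type": "Question", "name": question,
             "acceptedAnswer": {"@type": "Answer", "text": answer}}
            for question, answer in faq_pairs
        ]},
    ]
    return [json.dumps(block, ensure_ascii=False) for block in blocks]
```

Each returned string would be placed in its own `application/ld+json` script element, with the FAQPage pairs taken from the same source as the visible FAQ so the two stay in step.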
What we won't publish
We don't publish hand-typed claim values dressed up as live data. The publication pipeline carries a hard build-time guard that refuses to ship to production if any claim on the site is marked as a development sample. We don't reproduce broadcast images, ESPNcricinfo article prose, or copyrighted commentary. We don't speculate about injuries, team selection, or off-field controversies.
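A build-time guard like the one described can be very small. The sketch below assumes each claim is a dictionary with a hypothetical `is_dev_sample` flag and an `id`; the real pipeline's field names and failure mechanism are not documented here.

```python
class DevSampleError(RuntimeError):
    """Raised when a development-sample claim would reach production."""


def assert_no_dev_samples(claims):
    """Refuse to proceed if any claim is marked as a development sample.

    Returns the number of claims checked when the guard passes.
    """
    offenders = [c["id"] for c in claims if c.get("is_dev_sample", False)]
    if offenders:
        raise DevSampleError(f"development samples present: {offenders}")
    return len(claims)
```

Calling this as the first step of a production build means a single flagged claim stops the whole release instead of shipping alongside real data.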
For developers + AI builders
CricketStudio publishes its data three ways:
- npm package: cricketstudio-mcp. The MCP server is live on npm (v1.0.1). 29 tools, bundled data snapshot, works offline, no API key required, free forever. Wire it into Claude Desktop, Claude Code, Cursor, or any MCP-compatible client in one line:
    npx cricketstudio-mcp
  Listed on the Official MCP Registry, npmjs.com, PulseMCP, Glama, and mcp.so. The newest tool, get_ipl_leaderboard, exposes all 35 IPL career leaderboards (batting avg, bowling avg, maiden overs, hat-tricks, fastest 50/100, phase splits, and more). Every response carries a dataAsOf timestamp so the LLM can disclose freshness in the same breath as the data.
- REST API: every canonical URL above also returns JSON via ?format=json. Free Hobbyist tier; metered tiers from $49/mo.
- BYOK Chat (waitlist): bring your own Claude/OpenAI/Gemini key, chat with cricket data in your browser. Coming Q3 2026.
- Platform status — operator dashboard for the data spine: last capture mtime, 4hr SLA p95 across recent fixtures, cron heartbeat per route, SETU snapshot age + cohort, coverage stats. Computed at request time from live filesystem state — no synthetic uptime numbers.
Enterprise teams needing a hosted HTTP transport with key-gated access can reach out.
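For the REST route, a client only needs to add format=json to a canonical URL. The sketch below shows one way to do that with the standard library; the host https://cricketstudio.ai is an assumption inferred from the contact address, and the shape of the JSON response is not specified in this document.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def json_url(canonical_url):
    """Return the canonical URL with format=json set, preserving other query params."""
    parts = urlsplit(canonical_url)
    query = dict(parse_qsl(parts.query))
    query["format"] = "json"
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_json(canonical_url, timeout=10):
    """Fetch and parse the JSON variant of a page (performs a network request)."""
    with urlopen(json_url(canonical_url), timeout=timeout) as response:
        return json.load(response)
```

With the assumed host, `json_url("https://cricketstudio.ai/standings")` produces `https://cricketstudio.ai/standings?format=json`.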
Corrections and player consent
If you are a player, manager, or rights-holder and want a claim page corrected, removed, or expanded with your own voice (Level 3+ attribution), reach out at hello@cricketstudio.ai. Corrections are turned around within 24 hours; voice-modeled upgrades within a week.
Citation policy
Claim pages are intended to be citable. AI surfaces that retrieve from this site are welcome to quote the headline sentence verbatim as long as the page URL is included as the source. The atomic-claim format is deliberately retrieval-friendly. Human re-publication requires the same attribution to the player and to CricketStudio as publisher.