A comprehensive deep-dive into Indian Premier League deliveries data — uncovering scoring patterns, phase dynamics, and strategic insights through statistical modelling and machine learning.
This project applies the full data science pipeline — from raw cleaning to statistical inference — on the IPL deliveries dataset. It answers questions teams, analysts, and fans have always debated: which phase of an innings matters most? Do death overs really score more? Who are the true match-winners?
Every chart, model, and test is backed by the actual ball-by-ball data, making the conclusions statistically grounded rather than anecdotal.
Ball-by-ball IPL delivery records spanning multiple seasons. Every row represents a single delivery — capturing runs scored, wickets, extras, dismissal types, batsmen, bowlers, and match context.
Individual deliveries across all matched IPL games
Columns: match_id, inning, over, ball, batter, bowler, runs, extras, wickets & more
Super overs excluded to keep stats clean and accurate
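A minimal sketch of the cleaning step, assuming super overs are stored as innings 3 and above and that `over` is 1-indexed (some releases of the dataset number overs 0 to 19). The helper names `phase_of` and `regular_innings` are hypothetical, the 7–15 "Middle" range is inferred from the phases named in this document, and the rows are made-up toy data.

```python
def phase_of(over: int) -> str:
    """Map a 1-indexed over number to an innings phase."""
    if 1 <= over <= 6:
        return "Powerplay"
    if 7 <= over <= 15:
        return "Middle"
    if 16 <= over <= 20:
        return "Death"
    raise ValueError(f"over out of range: {over}")


def regular_innings(rows):
    """Keep deliveries from innings 1 and 2 only (assumes super overs are inning >= 3)."""
    return [r for r in rows if r["inning"] in (1, 2)]


# Toy rows, not real IPL data
rows = [
    {"match_id": 1, "inning": 1, "over": 3, "runs": 4},
    {"match_id": 1, "inning": 2, "over": 18, "runs": 6},
    {"match_id": 1, "inning": 3, "over": 1, "runs": 2},  # super-over delivery
]

clean = regular_innings(rows)
phases = [phase_of(r["over"]) for r in clean]
```

If the dataset numbers overs from 0, shift the boundaries in `phase_of` down by one before using it.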
Overs 16–20 average 16.77 runs/over vs 15.40 in the Powerplay, a statistically significant difference under Welch's t-test (p < 0.001; the reported p-value rounds to zero at six decimal places). The final five overs are the most explosive phase of an IPL innings.
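For reference, Welch's t-test compares two means without assuming equal variances. The sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom with the standard library only. The per-over run samples are invented for illustration, and treating each over as one observation is an assumption about how the test was set up.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n)
    se2_a, se2_b = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Made-up runs-per-over samples, not taken from the dataset
death = [12, 9, 15, 11, 14, 10]
powerplay = [8, 7, 10, 9, 6, 8]

t_stat, dof = welch_t(death, powerplay)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Turning the statistic into a p-value needs the t-distribution CDF. With SciPy installed, `scipy.stats.ttest_ind(death, powerplay, equal_var=False)` performs the same test and returns the p-value directly.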
Kohli leads all IPL batsmen with 8,004 total runs, nearly 1,300 ahead of S Dhawan. He also leads in boundaries (979), so he tops the dataset on both run volume and boundary count.
Of all 13,000+ wickets in the dataset, 62% are catches. Bowled accounts for just 17%. In T20 cricket, field placement and inducing edges produce far more wickets than clean-bowling batsmen.
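A share like the 62% above can be tallied as follows. This is a sketch that assumes each delivery carries a dismissal label that is empty or missing when no wicket fell. The function name `dismissal_shares` and the sample labels are hypothetical.

```python
from collections import Counter


def dismissal_shares(kinds):
    """Return each dismissal type's share of all wickets, most common first."""
    counts = Counter(k for k in kinds if k)  # skip deliveries with no wicket
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.most_common()}


# Toy labels, one per delivery; None or "" means no wicket fell
kinds = ["caught", None, "bowled", "caught", "", "lbw", "caught", "run out"]

shares = dismissal_shares(kinds)
for kind, share in shares.items():
    print(f"{kind}: {share:.0%}")
```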
Nearly 60% of all runs scored in the dataset come from 4s and 6s. Teams with boundary-hitting specialists have a structural advantage that running between the wickets alone cannot offset.
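The boundary share is a ratio of two sums. The sketch below assumes a list of runs credited to the batter on each delivery and counts every 4 and 6 as a boundary, which would also include the rare all-run four. The values are toy data.

```python
def boundary_share(batter_runs):
    """Fraction of batter runs that came from deliveries scored as 4 or 6."""
    total = sum(batter_runs)
    from_boundaries = sum(r for r in batter_runs if r in (4, 6))
    return from_boundaries / total if total else 0.0


# Toy sequence of runs off the bat, one entry per delivery
runs = [0, 1, 4, 0, 6, 2, 1, 4, 0, 1]

share = boundary_share(runs)
print(f"{share:.1%} of runs came from boundaries")
```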
Overs 7–8 consistently show a run-rate dip as new batsmen settle after the Powerplay. This is the prime window for economical spin bowling. YS Chahal, a spinner, leads all IPL bowlers in the dataset with 213 wickets.
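A dip like this shows up when runs are averaged per over number. The sketch below is one way to do that, assuming `runs` holds the total runs conceded on each delivery and that an over is identified by `(match_id, inning, over)`. The rows are toy data and `avg_runs_by_over` is a hypothetical helper.

```python
from collections import defaultdict


def avg_runs_by_over(rows):
    """Average runs scored in each over number across all innings where it was bowled."""
    totals = defaultdict(int)   # over number -> total runs
    innings = defaultdict(set)  # over number -> {(match_id, inning)} that bowled it
    for r in rows:
        totals[r["over"]] += r["runs"]
        innings[r["over"]].add((r["match_id"], r["inning"]))
    return {o: totals[o] / len(innings[o]) for o in sorted(totals)}


# Toy deliveries, not real IPL data
rows = [
    {"match_id": 1, "inning": 1, "over": 6, "runs": 4},
    {"match_id": 1, "inning": 1, "over": 6, "runs": 6},
    {"match_id": 2, "inning": 1, "over": 6, "runs": 1},
    {"match_id": 2, "inning": 1, "over": 6, "runs": 1},
    {"match_id": 1, "inning": 1, "over": 7, "runs": 1},
    {"match_id": 1, "inning": 1, "over": 7, "runs": 0},
]

by_over = avg_runs_by_over(rows)
```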
The scatter of top-30 batsmen shows a strong positive trend between balls faced and total runs (Pearson r ≈ 0.98 for batsman vs total runs). Consistency and longevity at the crease are the clearest predictors of overall output.
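The Pearson coefficient behind a trend like this can be computed directly from its definition. The text does not make clear which pair of columns the reported 0.98 refers to, so this sketch simply uses balls faced and total runs per batsman, with invented career totals.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up career totals for five hypothetical batsmen
balls_faced = [1200, 2500, 3100, 4800, 6000]
total_runs = [1500, 3300, 4000, 6500, 7900]

r = pearson_r(balls_faced, total_runs)
print(f"Pearson r = {r:.3f}")
```

A high r here is partly mechanical, since a batsman cannot score without facing deliveries, so it says more about longevity than about scoring rate.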
Deploy attacking openers under fielding restrictions. Target the boundary: the Powerplay averages 15.40 runs/over and sets the run-rate foundation for the innings.
Bowl your best swing/seam bowlers with the new ball. Target tight lines; early breakthroughs dramatically shift the match momentum.
Consolidate and rotate strike. Avoid reckless shots during this highest wicket-rate phase — preserve your power hitters for the death.
Introduce your best spinners. Build dot-ball pressure and exploit the transition period — this phase offers the highest wicket probability.
Save your cleanest boundary hitters. At 16.77 avg runs/over, maximising this window with specialist finishers is the single biggest scoring lever.
Invest in yorker specialists. Slower balls and wide yorkers suppress boundaries. Avoid overpitched deliveries that miss the yorker length; they cost matches.
62% of wickets are catches. Prioritise attacking field placements and edge-inducing bowling over line-and-length containment.
With 59.9% of runs from 4s & 6s, teams should specifically recruit boundary-hitting specialists — not just high-average batsmen.
Run rate rises ~0.12 runs per over as innings progress. Save your best bowlers for overs 15+ where the scoring spike accelerates.
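A per-over trend such as the ~0.12 figure is the slope of a least-squares line through average runs by over number. The sketch below shows that calculation on synthetic averages built with a known slope of 0.1. It is not the project's fitted model, and `ols_slope` is a hypothetical helper.

```python
def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den


# Synthetic averages: a straight line with slope 0.1, not real data
overs = list(range(1, 21))
avg_runs = [7.0 + 0.1 * o for o in overs]

slope = ols_slope(overs, avg_runs)
print(f"run rate changes by {slope:.2f} runs per over")
```

A single slope hides the shape described above (a mid-innings dip followed by a late spike), so fitting each phase separately may be more informative.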