@@ -5,7 +5,7 @@ sidebar: false

{{< figure src="/images/content_images/cs/ipl-stadium.png"
caption="**IPLT20, the biggest Cricket Festival in India**"
- alt="Indian Premier League Cricket"
+ alt="Indian Premier League Cricket cup and stadium"
attr="*(Image credits: IPLT20 (cup and logo) & Akash Yadav (stadium))*"
attrlink="https://unsplash.com/@aksh1802" >}}

@@ -22,150 +22,159 @@ with the young and the old alike, connecting billions in India unlike any other
Cricket enjoys lots of media attention. There is a significant amount of
[money](https://www.statista.com/topics/4543/indian-premier-league-ipl/) and
fame at stake. Over the last several years, technology has literally been a game
- changer. Audience are spoilt for choice with zillions of streaming media, tournaments,
- affordable access to mobile based live cricket watching and more.
+ changer. Audiences are spoilt for choice with streaming media, tournaments,
+ affordable access to mobile based live cricket watching, and more.

The Indian Premier League (IPL) is a professional Twenty20 cricket
- league, founded by the Board of Control for Cricket India (BCCI) in 2008. IPL is
- one of the most attended cricketing events in the world, valued at [$6.7
- billion](https://en.wikipedia.org/wiki/Indian_Premier_League) in 2019.
+ league, founded in 2008. It is one of the most attended cricketing events in
+ the world, valued at [$6.7 billion](https://en.wikipedia.org/wiki/Indian_Premier_League)
+ in 2019.

Cricket is a game of numbers - the runs scored by a batsman, the wickets taken
- by a bowler, or the matches won by a cricket team, the number of times a batsman
+ by a bowler, the matches won by a cricket team, the number of times a batsman
responds in a certain way to a kind of bowling attack, etc. The capability to
- dig into cricketing numbers for both improving performance as well as to study
- the business opportunity, overall market and economics of cricket via powerful
+ dig into cricketing numbers for both improving performance and studying
+ the business opportunities, overall market and economics of cricket via powerful
analytics tools, powered by numerical computing software such as NumPy, is a big
- deal. Cricket analytics provide interesting insights into the game and
- predictive intelligence regarding the game outcome.
+ deal. Cricket analytics provides interesting insights into the game and
+ predictive intelligence regarding game outcomes.

- Today, there are rich and almost infinite troves of cricket games records and
- statistics available, for e.g., [ESPN
- cricinfo](https://stats.espncricinfo.com/ci/engine/stats/index.html),
+ Today, there are rich and almost infinite troves of cricket game records and
+ statistics available, e.g., [ESPN
+ cricinfo](https://stats.espncricinfo.com/ci/engine/stats/index.html) and
[cricsheet](https://cricsheet.org). These and several such cricket databases
have been used for [cricket
analysis](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
- using the latest machine learning and prediction modelling algorithms. There are
- media and entertainment platforms along with professional sporting bodies
- associated with the game that use technology and analytics for determining key metrics for improving match winning
- chances:
+ using the latest machine learning and predictive modelling algorithms.
+ Media and entertainment platforms along with professional sports bodies
+ associated with the game use technology and analytics for determining key
+ metrics for improving match winning chances:

* batting performance moving average,
* score forecasting,
- * gaining insights into fitness and performance of a player against different oppositions,
+ * gaining insights into fitness and performance of a player against different opposition,
* player contribution to wins and losses for making strategic decisions on team composition
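
As an illustration of the first metric in the list above, a batting performance moving average can be sketched with NumPy as follows. This is a minimal sketch, not code from the case study: the `moving_average` helper, the window size and the `runs` values are all hypothetical.

```python
import numpy as np


def moving_average(scores, window):
    """Return the simple moving average of `scores` over `window` innings."""
    scores = np.asarray(scores, dtype=float)
    if window < 1 or window > scores.size:
        raise ValueError("window must be between 1 and the number of innings")
    # A uniform kernel turns the convolution into an unweighted mean.
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the full window fits.
    return np.convolve(scores, kernel, mode="valid")


# Hypothetical runs scored by one batsman in eight consecutive innings.
runs = [12, 45, 30, 0, 78, 51, 9, 63]
form = moving_average(runs, window=3)
print(form)
```

Each entry of `form` averages three consecutive innings, so a run of eight innings yields six values. A longer window smooths out one-off scores at the cost of reacting more slowly to a change in form.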

{{< figure src="/images/content_images/cs/cricket-pitch.png"
- class="csfigcaption" caption="**Cricket Pitch, the focal point in the field**"
- alt="cricket pitch" align="middle" attr="(Image Credits: IPLPaper)"
- attrlink="http://debarghyadas.com/files/IPLpaper.pdf" >}}
+ class="csfigcaption"
+ caption="**Cricket Pitch, the focal point in the field**"
+ alt="A cricket pitch with bowler and batsmen"
+ align="middle"
+ attr="*(Image Credits: Debarghya Das)*"
+ attrlink="http://debarghyadas.com/files/IPLpaper.pdf" >}}

### Key Data Analytics Objectives

- * Sports data analytics are used not only in cricket but several [other
- sports](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx) for
- improving the overall team performance and maximize chances of wins.
- * Real-time data analytics can help in gaining insights even while the game is
- on for changing tactics by the team and by associated businesses for economic
- benefits and growth.
+ * Sports data analytics are used not only in cricket but many [other
+ sports](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx) for
+ improving the overall team performance and maximizing winning chances.
+ * Real-time data analytics can help in gaining insights even during the game
+ for changing tactics by the team and by associated businesses for economic
+ benefits and growth.
* Besides historical analysis, predictive models are
- harnessed to determine the possible match outcomes that require significant
- number crunching and data science know-how, visualization tools and capability
- to include newer observations in the analysis.
+ harnessed to determine the possible match outcomes that require significant
+ number crunching and data science know-how, visualization tools and capability
+ to include newer observations in the analysis.

{{< figure src="/images/content_images/cs/player-pose-estimator.png"
- class="fig-center" alt="pose estimator" caption="**Cricket Pose Estimator**"
- attr="(Image Credits: AI for Cricket)"
- attrlink="https://www.youtube.com/watch?v=nH5W7tQUSrI" >}}
+ class="fig-center"
+ alt="pose estimator"
+ caption="**Cricket Pose Estimator**"
+ attr="*(Image Credits: connect.vin)*"
+ attrlink="https://connect.vin/2019/05/ai-for-cricket-batsman-pose-analysis/" >}}

### The Challenges

* **Data Cleaning and preprocessing**

- There are several public and proprietary sources of cricket data, latter are
- mostly with the broadcasting corporations that hold rights to various seasons
- and matches played. IPL has expanded cricket beyond the classic test match
- format to a much larger scale. The number of matches played every season across
- various formats has increased and so has the data, the algorithms, newer
- technologies and simulation models. Real time video analysis requires field
- mapping, player tracking, ball tracking, player shot analysis and several other
- aspects involved in how the ball is delivered, its angle, spin, velocity and
- trajectory.
+ There are several public and proprietary sources of cricket data. The
+ latter are mostly owned by broadcasting corporations that hold rights to
+ various seasons and matches played. IPL has expanded cricket beyond the
+ classic test match format to a much larger scale. The number of matches
+ played every season across various formats has increased and so has the data,
+ the algorithms, newer technologies and simulation models. Real time video
+ analysis requires field mapping, player tracking, ball tracking, player shot
+ analysis and several other aspects involved in how the ball is delivered, its
+ angle, spin, velocity and trajectory.

* **Data Representation**

- One of the most trickiest of challenges with sports analytics is getting the
- right data representation. What this means is getting the raw data in a form
- such that it can be laid out for comparison and building models. If the initial
- representation itself is incorrect, everything that follows is akin to fitting
- noise to detect a signal. In cricket, just like any other sport, there can be a
- large number of variables related to tracking various numbers of players on the
- field, their attributes, the ball and several possibilities of potential
- actions. The complexity of data analytics and representation is directly
- proportional to the kind of predictive questions that are put forth during
- analysis and are highly dependent on data representation and the model. Things
- get even more challenging in terms of computation, data comparisons when dynamic
- cricket play predictions are sought such as what would have happened if the
- batsman had hit the ball at a different angle or velocity?
+ One of the trickiest challenges with sports analytics is getting the
+ right data representation. What this means is getting the raw data in a form
+ such that it can be laid out for comparison and building models. If the
+ initial representation itself is incorrect, everything that follows is akin
+ to fitting noise to detect a signal. In cricket, just like any other sport,
+ there can be a large number of variables related to tracking various numbers
+ of players on the field, their attributes, the ball and several possibilities
+ of potential actions. The complexity of data analytics and representation is
+ directly proportional to the kind of predictive questions that are put forth
+ during analysis and are highly dependent on data representation and the
+ model. Things get even more challenging in terms of computation, data
+ comparisons when dynamic cricket play predictions are sought such as what
+ would have happened if the batsman had hit the ball at a different angle or
+ velocity?

* **Predictive Analytics Complexity**

- In cricket, some of the decision making is based on questions such as how
- often a batsman plays a kind of shot if the ball delivery is of a certain type,
- or say how does a bowler change his line and length if the batsman responds to
- his delivery in a certain way, or how often does a team decide to bat after
- winning a toss or switches the batting order, etc. This kind of predictive
- analytics queries requires highly granular dataset availability and the
- capability to synthesize data and create generative models that are highly
- accurate.
+ In cricket, some of the decision making is based on questions such as "how
+ often does a batsman play a certain kind of shot if the ball delivery is of a
+ particular type", or "how does a bowler change his line and length if the
+ batsman responds to his delivery in a certain way".
+ This kind of predictive analytics query requires highly granular dataset
+ availability and the capability to synthesize data and create generative
+ models that are highly accurate.
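
A question such as "how often does a batsman play a certain kind of shot if the ball delivery is of a particular type" amounts to a conditional relative frequency over ball-by-ball records. The sketch below assumes such records are available as two aligned label arrays; the `shot_frequencies` helper and all labels are made up for illustration and do not come from any of the datasets mentioned above.

```python
import numpy as np

# Hypothetical ball-by-ball records for one batsman (labels are invented).
deliveries = np.array(
    ["short", "full", "short", "good", "full", "short", "good", "short"]
)
shots = np.array(
    ["pull", "drive", "pull", "defend", "drive", "hook", "drive", "pull"]
)


def shot_frequencies(deliveries, shots, delivery_type):
    """Estimate P(shot | delivery_type) as relative frequencies."""
    # Boolean mask selecting only the balls of the requested type.
    mask = deliveries == delivery_type
    if not mask.any():
        return {}
    labels, counts = np.unique(shots[mask], return_counts=True)
    total = int(mask.sum())
    return {str(label): float(count) / total for label, count in zip(labels, counts)}


short_ball_response = shot_frequencies(deliveries, shots, "short")
print(short_ball_response)
```

With only a handful of balls per delivery type these frequencies are very noisy, which is one reason such queries need highly granular datasets.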

## NumPy’s Role in Cricket Analytics

- Sports Analytics is a thriving field. It utilizes several Python based software
- including NumPy, Matplotlib, Sci-kit learn, PyCharm, Jupyter and
- [others](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx) in
- addition to latest machine learning and AI techniques.
-
- NumPy, the standard numerical analysis package for Python, has been utilized
+ Sports Analytics is a thriving field. Many researchers and companies
+ [use NumPy](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx)
+ and other PyData packages like Scikit-learn, SciPy, Matplotlib, and Jupyter,
+ in addition to the latest machine learning and AI techniques. NumPy has been used
for various kinds of cricket related sporting analytics such as:

- * **Data Correlation:** Data graphing and [visualization](https://towardsdatascience.com/advanced-sports-visualization-with-pandas-matplotlib-and-seaborn-9c16df80a81b) provides useful insights into
- relationship between various datasets,
- [causation](https://amplitude.com/blog/2017/01/19/causation-correlation),
- [stratification](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996805/) for
- tactical analysis. NumPy offers the core foundation for several such analyses.
-
- * **Statistical Analysis:** NumPy Compute processing helps estimate the
- statistical significance of observational data or match events in the context
- of various player and game tactics, estimating the game outcome by comparison
- with a generative or static model.
-
- * **Data Visualization:** NumPy is used as the primary [numerical compute
- processing](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
- engine for the cricket datasets. Pandas used as the data processing and I/O.
- Matplotlib used as the basic visualization for players. Seaborn package used
- as the modern visualization for Toss related analysis as well as for team and
- player insights.
-
- {{< figure src="/images/content_images/cs/cricket-stats.png" class="fig-center"
- alt="role of numpy" caption="**IPL Data Analytics**" attr="(Source: mc.ai)"
- attrlink="https://mc.ai/predicting-the-outcome-of-cricket-matches-using-ai/" >}}
+ * **Data Correlation:** data graphing and [visualization](https://towardsdatascience.com/advanced-sports-visualization-with-pandas-matplotlib-and-seaborn-9c16df80a81b)
+ provides useful insights into the relationship between various datasets;
+ [causal analysis](https://amplitude.com/blog/2017/01/19/causation-correlation),
+ and [big data approaches](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996805/)
+ are used for tactical analysis. NumPy offers the core foundation for several
+ such analyses.
+
+ * **Statistical Analysis:** NumPy's numerical capabilities help estimate the
+ statistical significance of observational data or match events in the context
+ of various player and game tactics, estimating the game outcome by comparison
+ with a generative or static model.
+
+ * **Data Visualization:** NumPy is used as the primary
+ [numerical computing](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
+ engine for cricket datasets. Pandas is used for statistics and I/O.
+ Matplotlib is used as the basic visualization for players, with Seaborn used
+ as the modern visualization package for Toss related analysis as well as for
+ team and player insights.
+
+ {{< figure src="/images/content_images/cs/cricket-stats.png"
+ class="fig-center"
+ alt="The role of NumPy in cricket analytics - toss and match winners"
+ caption="**IPL toss data analysis**"
+ attr="*(Image credits: mc.ai)*"
+ attrlink="https://mc.ai/predicting-the-outcome-of-cricket-matches-using-ai/" >}}
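
The correlation and statistical significance items above can be made concrete with a small toss analysis. The sketch below is one possible approach under stated assumptions, not the method used in the linked analyses: the ten match results are invented, and the permutation test is a generic technique chosen here because it needs nothing beyond NumPy.

```python
import numpy as np

# Hypothetical results for one team over ten matches (1 = yes, 0 = no).
won_toss = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
won_match = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

# Win rates conditioned on the toss outcome.
win_rate_after_toss_win = won_match[won_toss == 1].mean()
win_rate_after_toss_loss = won_match[won_toss == 0].mean()

# Pearson correlation between the two binary indicators.
r = np.corrcoef(won_toss, won_match)[0, 1]

# Permutation test: shuffle match results to see how often a gap at least
# this large arises when the toss and the result are unrelated.
rng = np.random.default_rng(0)
observed = win_rate_after_toss_win - win_rate_after_toss_loss
n_perm = 10_000
diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(won_match)
    diffs[i] = shuffled[won_toss == 1].mean() - shuffled[won_toss == 0].mean()
# Small tolerance so that ties with the observed gap are counted.
p_value = np.mean(np.abs(diffs) >= abs(observed) - 1e-12)

print(win_rate_after_toss_win, win_rate_after_toss_loss, r, p_value)
```

With a sample this small the p-value says little; the same steps apply unchanged to a full season of toss and result data.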

## Summary

Sports Analytics have changed the way professional games are played, especially
- regarding decision making which was not so long ago, primarily done based on
- “gut” feeling or adherence to past traditions. It is no secret that the Indian
+ regarding decision making which was until recently primarily done based on
+ “gut feeling” or adherence to past traditions. It is no secret that the Indian
cricket teams rely heavily on data analytics to decide their strategy for an
- upcoming match or fine tune their tactics during the game. NumPy forms the
- solid foundation used by several Python based softwares to provide higher level
+ upcoming match or fine tune their tactics during the game. NumPy forms a
+ solid foundation for a large set of Python packages which provide higher level
functions related to data analytics, machine learning and AI algorithms. These
- software are widely deployed to gain real-time insights that help in decision
+ packages are widely deployed to gain real-time insights that help in decision
making for game-changing outcomes, both on field as well as to draw inferences
and drive business around the game of cricket. Finding out the hidden
parameters, patterns and attributes that lead to the outcome of a cricket match
helps the stakeholders to take notice of game insights that are otherwise hidden
in numbers and statistics.

- {{< figure src="/images/content_images/cs/numpy_ca_benefits.png" class="fig-center"
- alt="numpy benefits" caption="**Key NumPy Capabilities utilized**" >}}
+ {{< figure src="/images/content_images/cs/numpy_ca_benefits.png"
+ class="fig-center"
+ alt="Diagram showing benefits of using NumPy for cricket analytics"
+ caption="**Key NumPy Capabilities utilized**" >}}