Commit 4c7c7be

A copy-edit of the cricket analytics case study.
Also fixes a number of issues with image credits, and changes the styling of those credits in captions. Alt text is also improved.
1 parent 6e58d92 commit 4c7c7be

1 file changed: +107 -98 lines changed

content/en/case-studies/cricket-analytics.md

+107-98
@@ -5,7 +5,7 @@ sidebar: false
55

66
{{< figure src="/images/content_images/cs/ipl-stadium.png"
77
caption="**IPLT20, the biggest Cricket Festival in India**"
8-
alt="Indian Premier League Cricket"
8+
alt="Indian Premier League Cricket cup and stadium"
99
attr="*(Image credits: IPLT20 (cup and logo) & Akash Yadav (stadium))*"
1010
attrlink="https://unsplash.com/@aksh1802" >}}
1111

@@ -22,150 +22,159 @@ with the young and the old alike, connecting billions in India unlike any other
2222
Cricket enjoys lots of media attention. There is a significant amount of
2323
[money](https://www.statista.com/topics/4543/indian-premier-league-ipl/) and
2424
fame at stake. Over the last several years, technology has literally been a game
25-
changer. Audience are spoilt for choice with zillions of streaming media, tournaments,
26-
affordable access to mobile based live cricket watching and more.
25+
changer. Audiences are spoilt for choice with streaming media, tournaments,
26+
affordable access to mobile based live cricket watching, and more.
2727

2828
The Indian Premier League (IPL) is a professional Twenty20 cricket
29-
league, founded by the Board of Control for Cricket India (BCCI) in 2008. IPL is
30-
one of the most attended cricketing events in the world, valued at [$6.7
31-
billion](https://en.wikipedia.org/wiki/Indian_Premier_League) in 2019.
29+
league, founded in 2008. It is one of the most attended cricketing events in
30+
the world, valued at [$6.7 billion](https://en.wikipedia.org/wiki/Indian_Premier_League)
31+
in 2019.
3232

3333
Cricket is a game of numbers - the runs scored by a batsman, the wickets taken
34-
by a bowler, or the matches won by a cricket team, the number of times a batsman
34+
by a bowler, the matches won by a cricket team, the number of times a batsman
3535
responds in a certain way to a kind of bowling attack, etc. The capability to
36-
dig into cricketing numbers for both improving performance as well as to study
37-
the business opportunity, overall market and economics of cricket via powerful
36+
dig into cricketing numbers for both improving performance and studying
37+
the business opportunities, overall market and economics of cricket via powerful
3838
analytics tools, powered by numerical computing software such as NumPy, is a big
39-
deal. Cricket analytics provide interesting insights into the game and
40-
predictive intelligence regarding the game outcome.
39+
deal. Cricket analytics provides interesting insights into the game and
40+
predictive intelligence regarding game outcomes.
4141

42-
Today, there are rich and almost infinite troves of cricket games records and
43-
statistics available, for e.g., [ESPN
44-
cricinfo](https://stats.espncricinfo.com/ci/engine/stats/index.html),
42+
Today, there are rich and almost infinite troves of cricket game records and
43+
statistics available, e.g., [ESPN
44+
cricinfo](https://stats.espncricinfo.com/ci/engine/stats/index.html) and
4545
[cricsheet](https://cricsheet.org). These and several such cricket databases
4646
have been used for [cricket
4747
analysis](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
48-
using the latest machine learning and prediction modelling algorithms. There are
49-
media and entertainment platforms along with professional sporting bodies
50-
associated with the game that use technology and analytics for determining key metrics for improving match winning
51-
chances:
48+
using the latest machine learning and predictive modelling algorithms.
49+
Media and entertainment platforms along with professional sports bodies
50+
associated with the game use technology and analytics for determining key
51+
metrics for improving match winning chances:
5252

5353
* batting performance moving average,
5454
* score forecasting,
55-
* gaining insights into fitness and performance of a player against different oppositions,
55+
* gaining insights into fitness and performance of a player against different opposition,
5656
* player contribution to wins and losses for making strategic decisions on team composition
5757

5858
{{< figure src="/images/content_images/cs/cricket-pitch.png"
59-
class="csfigcaption" caption="**Cricket Pitch, the focal point in the field**"
60-
alt="cricket pitch" align="middle" attr="(Image Credits: IPLPaper)"
61-
attrlink="http://debarghyadas.com/files/IPLpaper.pdf" >}}
59+
class="csfigcaption"
60+
caption="**Cricket Pitch, the focal point in the field**"
61+
alt="A cricket pitch with bowler and batsmen"
62+
align="middle"
63+
attr="*(Image Credits: Debarghya Das)*"
64+
attrlink="http://debarghyadas.com/files/IPLpaper.pdf" >}}
6265

6366
### Key Data Analytics Objectives
6467

65-
* Sports data analytics are used not only in cricket but several [other
66-
sports](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx) for
67-
improving the overall team performance and maximize chances of wins.
68-
* Real-time data analytics can help in gaining insights even while the game is
69-
on for changing tactics by the team and by associated businesses for economic
70-
benefits and growth.
68+
* Sports data analytics are used not only in cricket but many [other
69+
sports](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx) for
70+
improving the overall team performance and maximize winning chances.
71+
* Real-time data analytics can help in gaining insights even during the game
72+
for changing tactics by the team and by associated businesses for economic
73+
benefits and growth.
7174
* Besides historical analysis, predictive models are
72-
harnessed to determine the possible match outcomes that require significant
73-
number crunching and data science know-how, visualization tools and capability
74-
to include newer observations in the analysis.
75+
harnessed to determine the possible match outcomes that require significant
76+
number crunching and data science know-how, visualization tools and capability
77+
to include newer observations in the analysis.
7578

7679
{{< figure src="/images/content_images/cs/player-pose-estimator.png"
77-
class="fig-center" alt="pose estimator" caption="**Cricket Pose Estimator**"
78-
attr="(Image Credits: AI for Cricket)"
79-
attrlink="https://www.youtube.com/watch?v=nH5W7tQUSrI" >}}
80+
class="fig-center"
81+
alt="pose estimator"
82+
caption="**Cricket Pose Estimator**"
83+
attr="*(Image Credits: connect.vin)*"
84+
attrlink="https://connect.vin/2019/05/ai-for-cricket-batsman-pose-analysis/" >}}
8085

8186
### The Challenges
8287

8388
* **Data Cleaning and preprocessing**
8489

85-
There are several public and proprietary sources of cricket data, latter are
86-
mostly with the broadcasting corporations that hold rights to various seasons
87-
and matches played. IPL has expanded cricket beyond the classic test match
88-
format to a much larger scale. The number of matches played every season across
89-
various formats has increased and so has the data, the algorithms, newer
90-
technologies and simulation models. Real time video analysis requires field
91-
mapping, player tracking, ball tracking, player shot analysis and several other
92-
aspects involved in how the ball is delivered, its angle, spin, velocity and
93-
trajectory.
90+
There are several public and proprietary sources of cricket data. The
91+
latter are mostly owned by broadcasting corporations that hold rights to
92+
various seasons and matches played. IPL has expanded cricket beyond the
93+
classic test match format to a much larger scale. The number of matches
94+
played every season across various formats has increased and so has the data,
95+
the algorithms, newer technologies and simulation models. Real time video
96+
analysis requires field mapping, player tracking, ball tracking, player shot
97+
analysis and several other aspects involved in how the ball is delivered, its
98+
angle, spin, velocity and trajectory.
9499

95100
* **Data Representation**
96101

97-
One of the most trickiest of challenges with sports analytics is getting the
98-
right data representation. What this means is getting the raw data in a form
99-
such that it can be laid out for comparison and building models. If the initial
100-
representation itself is incorrect, everything that follows is akin to fitting
101-
noise to detect a signal. In cricket, just like any other sport, there can be a
102-
large number of variables related to tracking various numbers of players on the
103-
field, their attributes, the ball and several possibilities of potential
104-
actions. The complexity of data analytics and representation is directly
105-
proportional to the kind of predictive questions that are put forth during
106-
analysis and are highly dependent on data representation and the model. Things
107-
get even more challenging in terms of computation, data comparisons when dynamic
108-
cricket play predictions are sought such as what would have happened if the
109-
batsman had hit the ball at a different angle or velocity?
102+
One of the most tricky challenges with sports analytics is getting the
103+
right data representation. What this means is getting the raw data in a form
104+
such that it can be laid out for comparison and building models. If the
105+
initial representation itself is incorrect, everything that follows is akin
106+
to fitting noise to detect a signal. In cricket, just like any other sport,
107+
there can be a large number of variables related to tracking various numbers
108+
of players on the field, their attributes, the ball and several possibilities
109+
of potential actions. The complexity of data analytics and representation is
110+
directly proportional to the kind of predictive questions that are put forth
111+
during analysis and are highly dependent on data representation and the
112+
model. Things get even more challenging in terms of computation, data
113+
comparisons when dynamic cricket play predictions are sought such as what
114+
would have happened if the batsman had hit the ball at a different angle or
115+
velocity?
110116

111117
* **Predictive Analytics Complexity**
112118

113-
In cricket, some of the decision making is based on questions such as how
114-
often a batsman plays a kind of shot if the ball delivery is of a certain type,
115-
or say how does a bowler change his line and length if the batsman responds to
116-
his delivery in a certain way, or how often does a team decide to bat after
117-
winning a toss or switches the batting order, etc. This kind of predictive
118-
analytics queries requires highly granular dataset availability and the
119-
capability to synthesize data and create generative models that are highly
120-
accurate.
119+
In cricket, some of the decision making is based on questions such as "how
120+
often does a batsman play a certain kind of shot if the ball delivery is of a
121+
particular type", or "how does a bowler change his line and length if the
122+
batsman responds to his delivery in a certain way".
123+
This kind of predictive analytics query requires highly granular dataset
124+
availability and the capability to synthesize data and create generative
125+
models that are highly accurate.
121126

122127
## NumPy’s Role in Cricket Analytics
123128

124-
Sports Analytics is a thriving field. It utilizes several Python based software
125-
including NumPy, Matplotlib, Sci-kit learn, PyCharm, Jupyter and
126-
[others](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx) in
127-
addition to latest machine learning and AI techniques.
128-
129-
NumPy, the standard numerical analysis package for Python, has been utilized
129+
Sports Analytics is a thriving field. Many researchers and companies
130+
[use NumPy](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx)
131+
and other PyData packages like Scikit-learn, SciPy, Matplotlib, and Jupyter.
132+
in addition to latest machine learning and AI techniques. NumPy has been used
130133
for various kinds of cricket related sporting analytics such as:
131134

132-
* **Data Correlation:** Data graphing and [visualization](https://towardsdatascience.com/advanced-sports-visualization-with-pandas-matplotlib-and-seaborn-9c16df80a81b) provides useful insights into
133-
relationship between various datasets,
134-
[causation](https://amplitude.com/blog/2017/01/19/causation-correlation),
135-
[stratification](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996805/) for
136-
tactical analysis. NumPy offers the core foundation for several such analyses.
137-
138-
* **Statistical Analysis:** NumPy Compute processing helps estimate the
139-
statistical significance of observational data or match events in the context
140-
of various player and game tactics, estimating the game outcome by comparison
141-
with a generative or static model.
142-
143-
* **Data Visualization:** NumPy is used as the primary [numerical compute
144-
processing](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
145-
engine for the cricket datasets. Pandas used as the data processing and I/O.
146-
Matplotlib used as the basic visualization for players. Seaborn package used
147-
as the modern visualization for Toss related analysis as well as for team and
148-
player insights.
149-
150-
{{< figure src="/images/content_images/cs/cricket-stats.png" class="fig-center"
151-
alt="role of numpy" caption="**IPL Data Analytics**" attr="(Source: mc.ai)"
152-
attrlink="https://mc.ai/predicting-the-outcome-of-cricket-matches-using-ai/" >}}
135+
* **Data Correlation:** data graphing and [visualization](https://towardsdatascience.com/advanced-sports-visualization-with-pandas-matplotlib-and-seaborn-9c16df80a81b)
136+
provides useful insights into relationship between various datasets;
137+
[causal analysis](https://amplitude.com/blog/2017/01/19/causation-correlation),
138+
and [big data approaches](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996805/)
139+
are used for tactical analysis. NumPy offers the core foundation for several
140+
such analyses.
141+
142+
* **Statistical Analysis:** NumPy numerical capabilities help estimate the
143+
statistical significance of observational data or match events in the context
144+
of various player and game tactics, estimating the game outcome by comparison
145+
with a generative or static model.
146+
147+
* **Data Visualization:** NumPy is used as the primary
148+
[numerical computing](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
149+
engine for cricket datasets. Pandas is used for statistics and I/O.
150+
Matplotlib is used as the basic visualization for players. with Seaborn used
151+
as the modern visualization package for Toss related analysis as well as for
152+
team and player insights.
153+
154+
{{< figure src="/images/content_images/cs/cricket-stats.png"
155+
class="fig-center"
156+
alt="The role of NumPy in cricket analytics - toss and match winners"
157+
caption="**IPL toss data analysis**"
158+
attr="*(Image credits: mc.ai)*"
159+
attrlink="https://mc.ai/predicting-the-outcome-of-cricket-matches-using-ai/" >}}
153160

154161
## Summary
155162

156163
Sports Analytics have changed the way professional games are played, especially
157-
regarding decision making which was not so long ago, primarily done based on
158-
“gut feeling or adherence to past traditions. It is no secret that the Indian
164+
regarding decision making which was until recently primarily done based on
165+
“gut feeling" or adherence to past traditions. It is no secret that the Indian
159166
cricket teams rely heavily on data analytics to decide their strategy for an
160-
upcoming match or fine tune their tactics during the game. NumPy forms the
161-
solid foundation used by several Python based softwares to provide higher level
167+
upcoming match or fine tune their tactics during the game. NumPy forms a
168+
solid foundation for a large set of Python packages which provide higher level
162169
functions related to data analytics, machine learning and AI algorithms. These
163-
software are widely deployed to gain real-time insights that help in decision
170+
packages are widely deployed to gain real-time insights that help in decision
164171
making for game-changing outcomes, both on field as well as to draw inferences
165172
and drive business around the game of cricket. Finding out the hidden
166173
parameters, patterns and attributes that lead to the outcome of a cricket match
167174
helps the stakeholders to take notice of game insights that are otherwise hidden
168175
in numbers and statistics.
169176

170-
{{< figure src="/images/content_images/cs/numpy_ca_benefits.png" class="fig-center"
171-
alt="numpy benefits" caption="**Key NumPy Capabilities utilized**" >}}
177+
{{< figure src="/images/content_images/cs/numpy_ca_benefits.png"
178+
class="fig-center"
179+
alt="Diagram showing benefits of using NumPy for cricket analytics"
180+
caption="**Key NumPy Capabilities utilized**" >}}
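
The case study text in this diff lists "batting performance moving average" among the key metrics but does not show how it might be computed. The following is a minimal sketch with NumPy, not code from the case study: the function name `batting_moving_average`, the window size, and the sample scores are all hypothetical and chosen only for illustration.

```python
import numpy as np


def batting_moving_average(runs, window=3):
    """Return the simple moving average of runs over `window` innings.

    Hypothetical helper for illustration; not part of the case study.
    """
    runs = np.asarray(runs, dtype=float)
    if window < 1 or window > runs.size:
        raise ValueError("window must be between 1 and the number of innings")
    # A uniform kernel turns the convolution into an unweighted mean.
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps the data.
    return np.convolve(runs, kernel, mode="valid")


# Made-up scores for five consecutive innings (not real match data).
innings_runs = [10, 20, 30, 40, 50]
rolling = batting_moving_average(innings_runs, window=3)
print(rolling)
```

With `mode="valid"` the result has `len(runs) - window + 1` entries, one per complete window, so the first few innings do not produce partial averages.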
