Commit 35e718b

shaloo authored and rgommers committed
Fixes #gh-164 addresses Ralf's comments in PR
1 parent 895c3e6 commit 35e718b

3 files changed

+12
-39
lines changed

content/en/case-studies/cricket-analytics.md

+12-39
@@ -87,36 +87,25 @@ metrics for improving match winning chances:
 
 * **Data Cleaning and preprocessing**
 
-There are several public and proprietary sources of cricket data. The
-latter are mostly owned by broadcasting corporations that hold rights to
-various seasons and matches played. IPL has expanded cricket beyond the
-classic test match format to a much larger scale. The number of matches
-played every season across various formats has increased and so has the data,
-the algorithms, newer technologies and simulation models. Real time video
-analysis requires field mapping, player tracking, ball tracking, player shot
-analysis and several other aspects involved in how the ball is delivered, its
-angle, spin, velocity and trajectory.
-
-* **Data Representation**
-
-One of the most tricky challenges with sports analytics is getting the
-right data representation. What this means is getting the raw data in a form
-such that it can be laid out for comparison and building models. If the
-initial representation itself is incorrect, everything that follows is akin
-to fitting noise to detect a signal. In cricket, just like any other sport,
+IPL has expanded cricket beyond the classic test match format to a much larger scale. The number of matches played every season across various formats has increased and so has the data, the algorithms, newer sports data analysis
+technologies and simulation models. Cricket data analysis requires field mapping, player tracking, ball tracking, player shot analysis and several other aspects involved in how the ball is delivered, its angle, spin, velocity and trajectory. All these factors together have increased the complexity of data cleaning and preprocessing.
+
+* **Dynamic Modeling**
+
+In cricket, just like any other sport,
 there can be a large number of variables related to tracking various numbers
 of players on the field, their attributes, the ball and several possibilities
-of potential actions. The complexity of data analytics and representation is
+of potential actions. The complexity of data analytics and modeling is
 directly proportional to the kind of predictive questions that are put forth
 during analysis and are highly dependent on data representation and the
 model. Things get even more challenging in terms of computation, data
 comparisons when dynamic cricket play predictions are sought such as what
 would have happened if the batsman had hit the ball at a different angle or
-velocity?
+velocity.
 
 * **Predictive Analytics Complexity**
 
-In cricket, some of the decision making is based on questions such as "how
+Much of the decision making in Cricket is based on questions such as "how
 often does a batsman play a certain kind of shot if the ball delivery is of a
 particular type", or "how does a bowler change his line and length if the
 batsman responds to his delivery in a certain way".
@@ -132,39 +121,23 @@ and other PyData packages like Scikit-learn, SciPy, Matplotlib, and Jupyter.
 in addition to latest machine learning and AI techniques. NumPy has been used
 for various kinds of cricket related sporting analytics such as:
 
-* **Data Correlation:** data graphing and [visualization](https://towardsdatascience.com/advanced-sports-visualization-with-pandas-matplotlib-and-seaborn-9c16df80a81b)
+* **Data Visualization:** Data graphing and [visualization](https://towardsdatascience.com/advanced-sports-visualization-with-pandas-matplotlib-and-seaborn-9c16df80a81b)
 provides useful insights into relationship between various datasets;
 [causal analysis](https://amplitude.com/blog/2017/01/19/causation-correlation),
 and [big data approaches](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996805/)
-are used for tactical analysis. NumPy offers the core foundation for several
+are used for tactical analysis. NumPy offers the core numeric computing foundation for several
 such analyses.
 
 * **Statistical Analysis:** NumPy numerical capabilities help estimate the
 statistical significance of observational data or match events in the context
 of various player and game tactics, estimating the game outcome by comparison
 with a generative or static model.
 
-* **Data Visualization:** NumPy is used as the primary
-[numerical computing](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
-engine for cricket datasets. Pandas is used for statistics and I/O.
-Matplotlib is used as the basic visualization for players. with Seaborn used
-as the modern visualization package for Toss related analysis as well as for
-team and player insights.
-
-{{< figure src="/images/content_images/cs/cricket-stats.png"
-class="fig-center"
-alt="The role of NumPy in cricket analytics - toss and match winners"
-caption="**IPL toss data analysis**"
-attr="*(Image credits: mc.ai)*"
-attrlink="https://mc.ai/predicting-the-outcome-of-cricket-matches-using-ai/" >}}
-
 ## Summary
 
 Sports Analytics have changed the way professional games are played, especially
 regarding decision making which was until recently primarily done based on
-“gut feeling" or adherence to past traditions. It is no secret that the Indian
-cricket teams rely heavily on data analytics to decide their strategy for an
-upcoming match or fine tune their tactics during the game. NumPy forms a
+“gut feeling" or adherence to past traditions. NumPy forms a
 solid foundation for a large set of Python packages which provide higher level
 functions related to data analytics, machine learning and AI algorithms. These
 packages are widely deployed to gain real-time insights that help in decision
Binary file not shown.