Ref #gh_164 First draft of Cricket analytics using NumPy #174

Merged
merged 9 commits into from
Apr 26, 2020
5 changes: 3 additions & 2 deletions config.yaml
@@ -45,10 +45,11 @@ params:
alttext: Two orbs orbiting each other. They are displacing gravity around them.
url: /case-studies/gw-discov
- title: Sports Analytics
text: TODO!
text: Cricket Analytics is changing the game by improving player and team performance through statistical modelling and predictive analytics. NumPy enables many of these analyses.
img: images/content_images/case_studies/sports.jpg
alttext: Cricket ball on green field.
url: /
url: /case-studies/cricket-analytics

section2:
title: KEY FEATURES
features:
2 changes: 1 addition & 1 deletion content/en/case-studies/blackhole-image.md
@@ -3,7 +3,7 @@ title: "Case Study: The First Image of a Black Hole"
sidebar: false
---

{{< figure src="/images/content_images/cs/blackhole.jpg" caption="**Black Hole M87**" alt="black hole image" attr="(**Image Credits:** Event Horizon Telescope Collaboration)" attrlink="https://www.jpl.nasa.gov/images/universe/20190410/blackhole20190410.jpg" >}}
{{< figure src="/images/content_images/cs/blackhole.jpg" caption="**Black Hole M87**" alt="black hole image" attr="*(Image Credits: Event Horizon Telescope Collaboration)*" attrlink="https://www.jpl.nasa.gov/images/universe/20190410/blackhole20190410.jpg" >}}

<blockquote cite="https://www.youtube.com/watch?v=BIvezCVcsYs">
<p>Imaging the M87 Black Hole is like trying to see something that is by definition impossible to see.</p>
158 changes: 158 additions & 0 deletions content/en/case-studies/cricket-analytics.md
@@ -0,0 +1,158 @@
---
title: "Case Study: Cricket Analytics, the game changer!"
sidebar: false
---

{{< figure src="/images/content_images/cs/ipl-stadium.png"
caption="**IPLT20, the biggest Cricket Festival in India**"
alt="Indian Premier League Cricket cup and stadium"
attr="*(Image credits: IPLT20 (cup and logo) & Akash Yadav (stadium))*"
attrlink="https://unsplash.com/@aksh1802" >}}

<blockquote cite="https://www.scoopwhoop.com/sports/ms-dhoni/">
<p>You don't play for the crowd, you play for the country.</p>
<footer align="right">—M S Dhoni, <cite>International Cricket Player, ex-captain, Indian Team, plays for Chennai Super Kings in IPL</cite></footer>
</blockquote>
Member

It could be nice to replace this with a quote related to NumPy. I think of all the links in this case study, the Stats LLC one looks most substantial. @shaloo have you tried contacting http://patricklucey.com/index.html ?

Contributor Author

Initiated email conversation regarding the same.

Contributor Author

Waiting on an email response. Once I get more insights from Patrick, we can update this later in a fresh PR. Maybe we can close this one and open a new issue once I hear from him.

Contributor Author

Ralf, can we merge the cricket case study updates now? I haven't heard back from most of those I contacted regarding this, Patrick included. Once it shows up on the site, there will be more takers to provide feedback or better content, I think.


## About Cricket

It would be an understatement to state that Indians love cricket. The game is
played in just about every nook and cranny of India, rural or urban, and is popular
with the young and the old alike, connecting over a billion people in India unlike any other sport.
Cricket enjoys lots of media attention. There is a significant amount of
[money](https://www.statista.com/topics/4543/indian-premier-league-ipl/) and
fame at stake. Over the last several years, technology has literally been a game
changer. Audiences are spoilt for choice with streaming media, tournaments,
affordable mobile-based access to live cricket, and more.

The Indian Premier League (IPL) is a professional Twenty20 cricket
league, founded in 2008. It is one of the most attended cricketing events in
the world, valued at [$6.7 billion](https://en.wikipedia.org/wiki/Indian_Premier_League)
in 2019.

Cricket is a game of numbers: the runs scored by a batsman, the wickets taken
by a bowler, the matches won by a cricket team, the number of times a batsman
responds in a certain way to a kind of bowling attack, etc. The capability to
dig into these numbers with powerful analytics tools, built on numerical
computing software such as NumPy, both to improve performance and to study
the business opportunities, overall market and economics of cricket, is a big
deal. Cricket analytics provides interesting insights into the game and
predictive intelligence regarding game outcomes.

Today, there are rich troves of cricket game records and
statistics available, e.g., [ESPN
cricinfo](https://stats.espncricinfo.com/ci/engine/stats/index.html) and
[cricsheet](https://cricsheet.org). These and several other cricket databases
have been used for [cricket
analysis](https://www.researchgate.net/publication/336886516_Data_visualization_and_toss_related_analysis_of_IPL_teams_and_batsmen_performances)
using the latest machine learning and predictive modelling algorithms.
Media and entertainment platforms along with professional sports bodies
associated with the game use technology and analytics for determining key
metrics for improving match winning chances:

* batting performance moving average,
* score forecasting,
* gaining insights into fitness and performance of a player against different opposition,
* player contribution to wins and losses for making strategic decisions on team composition
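
As a minimal sketch of the first metric in the list above, a batting-performance moving average can be computed with `np.convolve`. The innings scores and the five-innings window are made-up illustration values, and `moving_average` is a hypothetical helper name rather than part of any existing cricket analytics package.

```python
import numpy as np

# Hypothetical runs scored by one batsman over ten consecutive innings.
runs = np.array([34, 12, 78, 45, 0, 56, 91, 23, 40, 67])


def moving_average(values, window):
    """Return the simple moving average of `values` over a sliding window."""
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps the data,
    # so the result has len(values) - window + 1 entries.
    return np.convolve(values, kernel, mode="valid")


# Average over the five most recent innings at each point in the series.
form = moving_average(runs, window=5)
print(form)
```

A shorter window reacts faster to a change in form, while a longer one smooths out single unusually high or low scores; which is preferable depends on the question being asked.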

{{< figure src="/images/content_images/cs/cricket-pitch.png"
class="csfigcaption"
caption="**Cricket Pitch, the focal point in the field**"
alt="A cricket pitch with bowler and batsmen"
align="middle"
attr="*(Image Credits: Debarghya Das)*"
attrlink="http://debarghyadas.com/files/IPLpaper.pdf" >}}

### Key Data Analytics Objectives

* Sports data analytics is used not only in cricket but in many [other
sports](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx) to
improve overall team performance and maximize winning chances.
* Real-time data analytics can yield insights even during the game,
letting teams change tactics and associated businesses pursue economic
benefits and growth.
* Besides historical analysis, predictive models are
harnessed to estimate possible match outcomes. This requires significant
number crunching, data science know-how, visualization tools and the capability
to include newer observations in the analysis.

{{< figure src="/images/content_images/cs/player-pose-estimator.png"
class="fig-center"
alt="pose estimator"
caption="**Cricket Pose Estimator**"
attr="*(Image Credits: connect.vin)*"
attrlink="https://connect.vin/2019/05/ai-for-cricket-batsman-pose-analysis/" >}}

### The Challenges

* **Data Cleaning and preprocessing**

IPL has expanded cricket beyond the classic test match format to a much
larger scale. The number of matches played every season across various
formats has increased, and so have the data, the algorithms, the newer sports data
analysis technologies and the simulation models. Cricket data analysis requires
field mapping, player tracking, ball tracking, player shot analysis and
several other aspects of how the ball is delivered: its angle, spin,
velocity and trajectory. All these factors together have increased the
complexity of data cleaning and preprocessing.

* **Dynamic Modeling**

In cricket, just like any other sport,
there can be a large number of variables related to tracking the
players on the field, their attributes, the ball and the many possible
actions. The complexity of data analytics and modeling
grows with the kind of predictive questions that are put forth
during analysis and depends heavily on data representation and the
model. Computation and data comparison become even more challenging
when dynamic cricket play predictions are sought, such as what
would have happened if the batsman had hit the ball at a different angle or
velocity.

* **Predictive Analytics Complexity**

Much of the decision making in Cricket is based on questions such as "how
often does a batsman play a certain kind of shot if the ball delivery is of a
particular type", or "how does a bowler change his line and length if the
batsman responds to his delivery in a certain way".
Answering this kind of predictive analytics query requires highly granular
datasets and the capability to synthesize data and create generative
models that are highly accurate.
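
A question such as "how often does a batsman play a certain kind of shot if the ball delivery is of a particular type" reduces to a conditional frequency table. The sketch below assumes ball-by-ball records already encoded as integer category codes; the category names and the sample data are invented for illustration and do not come from any real match.

```python
import numpy as np

# Hypothetical category labels; the integer codes below index into these arrays.
DELIVERIES = np.array(["yorker", "bouncer", "full", "good_length"])
SHOTS = np.array(["defend", "drive", "pull", "leave"])

# Made-up ball-by-ball records: one delivery code and one shot code per ball.
delivery = np.array([0, 1, 1, 2, 3, 3, 2, 1, 0, 3, 2, 2])
shot = np.array([0, 2, 3, 1, 0, 1, 1, 2, 0, 0, 1, 0])

# Count how often each (delivery, shot) pair occurs.
# np.add.at accumulates correctly even when index pairs repeat.
counts = np.zeros((len(DELIVERIES), len(SHOTS)), dtype=int)
np.add.at(counts, (delivery, shot), 1)

# Normalize each row to get the frequency of each shot given the delivery type.
# Every delivery type occurs at least once in this sample; real data would
# need a guard against rows that sum to zero.
row_totals = counts.sum(axis=1, keepdims=True)
cond_freq = counts / row_totals

for name, row in zip(DELIVERIES, cond_freq):
    print(name, dict(zip(SHOTS.tolist(), row.round(2).tolist())))
```

With only a handful of balls per delivery type these frequencies are very noisy, which is one reason the text stresses highly granular datasets.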

## NumPy’s Role in Cricket Analytics

Sports Analytics is a thriving field. Many researchers and companies
[use NumPy](https://adtmag.com/blogs/dev-watch/2017/07/sports-analytics.aspx)
and other PyData packages like Scikit-learn, SciPy, Matplotlib, and Jupyter.
in addition to latest machine learning and AI techniques. NumPy has been used
Comment on lines +126 to +127
Member

nit: use a comma after Jupyter instead of a period.

for various kinds of cricket related sporting analytics such as:

* **Statistical Analysis:** NumPy's numerical capabilities help estimate the
statistical significance of observational data or match events in the context
of various player and game tactics, estimating the game outcome by comparison
with a generative or static model.
[Causal analysis](https://amplitude.com/blog/2017/01/19/causation-correlation)
and [big data approaches](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996805/)
are used for tactical analysis.

* **Data Visualization:** Data graphing and [visualization](https://towardsdatascience.com/advanced-sports-visualization-with-pandas-matplotlib-and-seaborn-9c16df80a81b)
provide useful insights into the relationships between various datasets.
Member

Although these two points make sense in general for how NumPy will be used in cricket analytics, and I am sure NumPy is used in cricket-related sporting analysis, the linked articles aren't specific to cricket. Maybe it's the case that we don't have concrete examples today?

Contributor Author

That's correct. I have been chasing a few folks from the cricketing analytics world but haven't been able to get these inputs yet. Would you have any leads / insights into the same?


## Summary

Sports Analytics have changed the way professional games are played, especially
Member

should it read : "Sport Analytics has changed the way professional games are played" instead ?

regarding decision making, which until recently was based primarily on
"gut feeling" or adherence to past traditions. NumPy forms a
solid foundation for a large set of Python packages which provide higher level
functions related to data analytics, machine learning and AI algorithms. These
packages are widely deployed to gain real-time insights that help in decision
making for game-changing outcomes, both on the field and in drawing inferences
and driving business around the game of cricket. Finding the hidden
parameters, patterns and attributes that lead to the outcome of a cricket match
helps stakeholders notice game insights that are otherwise buried
in numbers and statistics.

{{< figure src="/images/content_images/cs/numpy_ca_benefits.png"
class="fig-center"
alt="Diagram showing benefits of using NumPy for cricket analytics"
caption="**Key NumPy Capabilities utilized**" >}}
16 changes: 9 additions & 7 deletions content/en/case-studies/gw-discov.md
@@ -3,7 +3,7 @@ title: "Case Study: Discovery of Gravitational Waves"
sidebar: false
---

{{< figure src="/images/content_images/cs/gw_sxs_image.png" class="fig-center" caption="**Gravitational Waves**" alt="binary coalesce black hole generating gravitational waves" attr="(**Image Credits: The Simulating eXtreme Spacetimes (SXS) Project at LIGO** )" attrlink="https://youtu.be/Zt8Z_uzG71o" >}}
{{< figure src="/images/content_images/cs/gw_sxs_image.png" class="fig-center" caption="**Gravitational Waves**" alt="binary coalesce black hole generating gravitational waves" attr="*(Image Credits: The Simulating eXtreme Spacetimes (SXS) Project at LIGO)*" attrlink="https://youtu.be/Zt8Z_uzG71o" >}}

<blockquote cite="https://www.youtube.com/watch?v=BIvezCVcsYs">
<p>The scientific Python ecosystem is critical infrastructure for the research done at LIGO.</p>
@@ -45,12 +45,13 @@ made from warped spacetime.
astrophysics, cosmology, particle physics, and nuclear physics.
* Crunch observed data via numerical relativity computations that involves
complex maths in order to discern signal from noise, filter out relevant
signal and statistically estimate significance of observed data
signal and statistically estimate significance of observed data
* Data visualization so that the binary / numerical results can be
comprehended.


### The Challenges


### The Challenges

* **Computation**

@@ -61,7 +62,7 @@ made from warped spacetime.
complex relativity equations and huge amounts of data which present a
computational challenge:
[O(10^7) CPU hrs needed for binary merger analyses](https://youtu.be/7mcHknWWzNI)
spread on 6 dedicated LIGO clusters
spread on 6 dedicated LIGO clusters

* **Data Deluge**

@@ -89,7 +90,7 @@ made from warped spacetime.
{{< figure src="/images/content_images/cs/gw_strain_amplitude.png" class="fig-center" alt="gravitational waves strain amplitude" caption="**Estimated gravitational-wave strain amplitude from GW150914**" attr="(**Graph Credits:** Observation of Gravitational Waves from a Binary Black Hole Merger, ResearchGate Publication)" attrlink="https://www.researchgate.net/publication/293886905_Observation_of_Gravitational_Waves_from_a_Binary_Black_Hole_Merger" >}}

## NumPy’s Role in the detection of Gravitational Waves

Gravitational waves emitted from the merger cannot be computed using any
technique except brute force numerical relativity using supercomputers.
The amount of data LIGO collects is as incomprehensibly large as gravitational
@@ -111,13 +112,14 @@ speed. Here are some examples:
* Visualization of data
- Time series
- Spectrograms
* Compute Correlations
* Compute Correlations
* Key [Software](https://github.com/lscsoft) developed in GW data analysis
such as [GwPy](https://gwpy.github.io/docs/stable/overview.html) and
[PyCBC](https://pycbc.org) uses NumPy and AstroPy under the hood for
providing object based interfaces to utilities, tools and methods for
studying data from gravitational-wave detectors.


## Summary

GW detection has enabled researchers to discover entirely unexpected phenomena
Binary file added static/images/content_images/cs/ipl-stadium.png