Commit b265e5d

Merge pull request #101 from plotly/misc-signal
Misc signal
2 parents 33a32c1 + 328bf39 commit b265e5d

5 files changed

+355
-409
lines changed

python/peak-finding.md

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
---
jupyter:
  jupytext:
    notebook_metadata_filter: all
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.1'
      jupytext_version: 1.1.1
  kernelspec:
    display_name: Python 3
    language: python
    name: python3
  language_info:
    codemirror_mode:
      name: ipython
      version: 3
    file_extension: .py
    mimetype: text/x-python
    name: python
    nbconvert_exporter: python
    pygments_lexer: ipython3
    version: 3.6.7
  plotly:
    description: Learn how to find peaks and valleys on datasets in Python
    display_as: peak-analysis
    has_thumbnail: false
    ipynb: ~notebook_demo/120
    language: python
    layout: user-guide
    name: Peak Finding
    order: 3
    page_type: example_index
    permalink: python/peak-finding/
    thumbnail: /images/static-image
    title: Peak Finding in Python | plotly
---

#### Imports
The tutorial below imports [Pandas](https://plot.ly/pandas/intro-to-pandas-tutorial/) and [SciPy](https://www.scipy.org/).

```python
import pandas as pd
from scipy.signal import find_peaks
```

#### Import Data
To start detecting peaks, we will import some data on milk production by month:

```python
import plotly.graph_objects as go
import pandas as pd

milk_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/monthly-milk-production-pounds.csv')
time_series = milk_data['Monthly milk production (pounds per cow)']

fig = go.Figure(data=go.Scatter(
    y = time_series,
    mode = 'lines'
))

fig.show()
```

#### Peak Detection

To mark the peaks on the plot, we need the x-axis indices at which they occur. The first element of the tuple returned by `find_peaks` holds the indices of the local maxima.

```python
import plotly.graph_objects as go
import pandas as pd
from scipy.signal import find_peaks

milk_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/monthly-milk-production-pounds.csv')
time_series = milk_data['Monthly milk production (pounds per cow)']

indices = find_peaks(time_series)[0]

fig = go.Figure()
fig.add_trace(go.Scatter(
    y=time_series,
    mode='lines+markers',
    name='Original Plot'
))

fig.add_trace(go.Scatter(
    x=indices,
    y=[time_series[j] for j in indices],
    mode='markers',
    marker=dict(
        size=8,
        color='red',
        symbol='cross'
    ),
    name='Detected Peaks'
))

fig.show()
```
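
The page description also mentions valleys, which can be located with the same function. The following is a minimal sketch on a small hand-made array (the `signal` values are hypothetical and are not taken from the milk dataset): it shows the `(indices, properties)` tuple that `find_peaks` returns, and finds valleys as the peaks of the negated signal.

```python
import numpy as np
from scipy.signal import find_peaks

# Small synthetic signal (hypothetical data, chosen by hand for illustration)
signal = np.array([0, 2, 1, 3, 0, 1, 4, 2, 5, 1])

# find_peaks returns a tuple: (indices of the peaks, dict of properties).
# With no conditions given, the properties dict is empty.
peak_indices, properties = find_peaks(signal)

# Valleys are the local maxima of the negated signal
valley_indices, _ = find_peaks(-signal)

print(peak_indices)    # [1 3 6 8]
print(valley_indices)  # [2 4 7]
```

Note that the first and last samples are never reported, since a peak needs a neighbor on both sides.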

#### Only Highest Peaks
We can set a threshold to try to identify as many of the _highest peaks_ as we can. In `find_peaks`, `threshold` is the required vertical distance between a peak and its neighboring samples, so `threshold=20` keeps only the peaks that stand at least 20 above both of their neighbors.

```python
import plotly.graph_objects as go
import numpy as np
import pandas as pd
from scipy.signal import find_peaks

milk_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/monthly-milk-production-pounds.csv')
time_series = milk_data['Monthly milk production (pounds per cow)']

indices = find_peaks(time_series, threshold=20)[0]

fig = go.Figure()
fig.add_trace(go.Scatter(
    y=time_series,
    mode='lines+markers',
    name='Original Plot'
))

fig.add_trace(go.Scatter(
    x=indices,
    y=[time_series[j] for j in indices],
    mode='markers',
    marker=dict(
        size=8,
        color='red',
        symbol='cross'
    ),
    name='Detected Peaks'
))

fig.show()
```
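
Because `threshold` compares a peak with its immediate neighbors rather than with an absolute level, it does not always select the tallest peaks. The sketch below contrasts it with the `height` argument of `find_peaks` on a hand-made array; the `signal` values and the cutoffs `5` and `25` are assumptions picked only to make the difference visible.

```python
import numpy as np
from scipy.signal import find_peaks

# Hypothetical signal: a low sharp peak (10), a tall but flat-topped peak (52),
# and a medium sharp peak (30)
signal = np.array([0, 10, 0, 50, 52, 50, 0, 30, 0])

# threshold: the peak must exceed BOTH neighboring samples by at least 5.
# The tallest peak (52) only exceeds its neighbors (50) by 2, so it is dropped.
sharp_peaks, _ = find_peaks(signal, threshold=5)

# height: the peak value itself must be at least 25.
tall_peaks, _ = find_peaks(signal, height=25)

print(sharp_peaks)  # [1 7]
print(tall_peaks)   # [4 7]
```

Which argument is appropriate depends on whether "highest" is meant relative to the surrounding samples or in absolute terms.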

python/random-walk.md

Lines changed: 168 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,168 @@
---
jupyter:
  jupytext:
    notebook_metadata_filter: all
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.1'
      jupytext_version: 1.1.1
  kernelspec:
    display_name: Python 3
    language: python
    name: python3
  language_info:
    codemirror_mode:
      name: ipython
      version: 3
    file_extension: .py
    mimetype: text/x-python
    name: python
    nbconvert_exporter: python
    pygments_lexer: ipython3
    version: 3.6.7
  plotly:
    description: Learn how to use Python to make a Random Walk
    display_as: statistics
    has_thumbnail: false
    ipynb: ~notebook_demo/114
    language: python
    layout: user-guide
    name: Random Walk
    order: 10
    page_type: example_index
    permalink: python/random-walk/
    thumbnail: /images/static-image
    title: Random Walk in Python. | plotly
---

A [random walk](https://en.wikipedia.org/wiki/Random_walk) can be thought of as a random process in which a token or a marker is randomly moved around some space, that is, a space with a metric used to compute distance. It is most commonly conceptualized in one dimension ($\mathbb{Z}$), two dimensions ($\mathbb{Z}^2$) or three dimensions ($\mathbb{Z}^3$) in Cartesian space, where $\mathbb{Z}$ represents the set of integers. In the visualizations below, we will use [scatter plots](https://plot.ly/python/line-and-scatter/) together with a colorscale to denote the time sequence of the walk.


#### Random Walk in 1D


The small random jitter added to the data points along the x and y axes is meant to show where the points are being drawn and what the tendency of the random walk is.

```python
import plotly.graph_objects as go
import numpy as np

l = 100
steps = np.random.choice([-1, 1], size=l) + 0.05 * np.random.randn(l) # l steps
position = np.cumsum(steps) # integrate the position by summing steps values
y = 0.05 * np.random.randn(l)

fig = go.Figure(data=go.Scatter(
    x=position,
    y=y,
    mode='markers',
    name='Random Walk in 1D',
    marker=dict(
        color=np.arange(l),
        size=7,
        colorscale='Reds',
        showscale=True,
    )
))

fig.update_layout(yaxis_range=[-1, 1])
fig.show()
```
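
To make the role of `np.cumsum` concrete, here is a minimal sketch with a short, hand-picked sequence of steps and no jitter (the `steps` values are hypothetical, not random draws). Each position is the running sum of all steps taken so far.

```python
import numpy as np

# Hand-picked steps for illustration: +1 is a move right, -1 a move left
steps = np.array([1, -1, 1, 1, -1])

# The position after each step is the cumulative sum of the steps
position = np.cumsum(steps)

print(position)  # [1 0 1 2 1]
```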

#### Random Walk in 2D

```python
import plotly.graph_objects as go
import numpy as np

l = 1000
x_steps = np.random.choice([-1, 1], size=l) + 0.2 * np.random.randn(l) # l steps
y_steps = np.random.choice([-1, 1], size=l) + 0.2 * np.random.randn(l) # l steps
x_position = np.cumsum(x_steps) # integrate the position by summing steps values
y_position = np.cumsum(y_steps) # integrate the position by summing steps values

fig = go.Figure(data=go.Scatter(
    x=x_position,
    y=y_position,
    mode='markers',
    name='Random Walk',
    marker=dict(
        color=np.arange(l),
        size=8,
        colorscale='Greens',
        showscale=True
    )
))

fig.show()
```

#### Random Walk and Diffusion

In the two following charts we show the link between random walks and diffusion. We compute a large number `N` of random walks representing, for example, molecules in a small drop of chemical. While all trajectories start at 0, after some time the spatial distribution of points is approximately Gaussian. Also, the typical distance to the origin, measured here by the standard deviation of the positions, grows as $\sqrt{t}$.

```python
import plotly.graph_objects as go
import numpy as np

l = 1000
N = 10000
steps = np.random.choice([-1, 1], size=(N, l)) + 0.05 * np.random.standard_normal((N, l)) # l steps
position = np.cumsum(steps, axis=1) # integrate all positions by summing steps values along time axis

fig = go.Figure(data=go.Histogram(x=position[:, -1])) # positions at final time step
fig.show()
```

```python
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import numpy as np

l = 1000
N = 10000
t = np.arange(l)
steps = np.random.choice([-1, 1], size=(N, l)) + 0.05 * np.random.standard_normal((N, l)) # l steps
position = np.cumsum(steps, axis=1) # integrate the position by summing steps values
average_distance = np.std(position, axis=0) # spread of the N walks at each time step (standard deviation)

fig = make_subplots(1, 2)
fig.add_trace(go.Scatter(x=t, y=average_distance, name='mean distance'), 1, 1)
fig.add_trace(go.Scatter(x=t, y=average_distance**2, name='mean squared distance'), 1, 2)
fig.update_xaxes(title_text='$t$')
fig.update_yaxes(title_text='$l$', col=1)
fig.update_yaxes(title_text='$l^2$', col=2)
fig.update_layout(showlegend=False)
fig.show()
```
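
The $\sqrt{t}$ growth can also be checked numerically without a plot. The sketch below is an illustration under stated assumptions: it uses pure $\pm 1$ steps with no jitter, a seeded `np.random.default_rng` generator (the seed `0` is arbitrary), and illustrative names `n_steps` and `n_walks` that do not appear in the code above. If the scaling holds, the root-mean-square distance divided by $\sqrt{t}$ should stay close to 1.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_steps = 1000                  # number of time steps per walk
n_walks = 2000                  # number of independent walks

# Pure +/-1 steps (no jitter), one row per walk
steps = rng.choice([-1, 1], size=(n_walks, n_steps))
position = np.cumsum(steps, axis=1)

# Root-mean-square distance to the origin after t = 1, 2, ..., n_steps steps
t = np.arange(1, n_steps + 1)
rms_distance = np.sqrt(np.mean(position**2, axis=0))

# If rms_distance grows like sqrt(t), this ratio stays near 1
ratio = rms_distance / np.sqrt(t)
print(ratio[[9, 99, 999]])  # ratios after 10, 100 and 1000 steps
```

The ratio fluctuates from run to run because it is estimated from a finite number of walks; increasing `n_walks` tightens it around 1.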

#### Advanced Tip
We can formally think of a 1D random walk as a point jumping along the integer number line. Let $Z_i$ be a random variable that takes on the values $+1$ and $-1$. Let this random variable represent the steps we take in the random walk in 1D (where $+1$ means right and $-1$ means left). Also, as with the above visualizations, let us assume that the probability of moving left and the probability of moving right are both $\frac{1}{2}$. Then, consider the sum

$$
\begin{align*}
S_n = \sum_{i=1}^{n}{Z_i}
\end{align*}
$$

where $S_n$ represents the point that the random walk ends up on after $n$ steps have been taken.

To find the *expected value* of $S_n$, we can compute it directly. By linearity of expectation, we have

$$
\begin{align*}
\mathbb{E}(S_n) = \sum_{i=1}^{n}{\mathbb{E}(Z_i)}
\end{align*}
$$

but since $Z_i$ takes on the values $+1$ and $-1$, each with probability $\frac{1}{2}$, then

$$
\begin{align*}
\mathbb{E}(Z_i) = 1 \cdot P(Z_i=1) + (-1) \cdot P(Z_i=-1) = \frac{1}{2} - \frac{1}{2} = 0
\end{align*}
$$

Therefore, the expected position of our random walk is $0$ regardless of how many steps we take, even though individual walks drift away from the origin over time.

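
The same setup also explains the $\sqrt{t}$ growth of the typical distance seen in the diffusion charts. As a short sketch, write the position after $n$ steps as a sum of $n$ steps $Z_1, \dots, Z_n$, and assume in addition that the steps are independent (this is needed for the variance, though not for the expected value). Since $Z_i^2 = 1$ always,

$$
\begin{align*}
\mathrm{Var}(Z_i) &= \mathbb{E}(Z_i^2) - \mathbb{E}(Z_i)^2 = 1 - 0 = 1 \\
\mathrm{Var}(S_n) &= \sum_{i=1}^{n}{\mathrm{Var}(Z_i)} = n \\
\sqrt{\mathbb{E}(S_n^2)} &= \sqrt{\mathrm{Var}(S_n)} = \sqrt{n}
\end{align*}
$$

so the root-mean-square distance from the origin after $n$ steps is $\sqrt{n}$, while the mean position stays at $0$.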
