Commit d4c9583

fix variable name and minor formatting update
1 parent a9bc29e commit d4c9583

1 file changed

+33
-31
lines changed

lectures/prob_dist.md

+33-31
Original file line number | Diff line number | Diff line change
@@ -46,18 +46,22 @@ Let's start with discrete distributions.
4646

4747
A discrete distribution is defined by a set of numbers $S = \{x_1, \ldots, x_n\}$ and a **probability mass function** (PMF) on $S$, which is a function $p$ from $S$ to $[0,1]$ with the property
4848

49-
$$ \sum_{i=1}^n p(x_i) = 1 $$
49+
$$
50+
\sum_{i=1}^n p(x_i) = 1
51+
$$
5052

5153
We say that a random variable $X$ **has distribution** $p$ if $X$ takes value $x_i$ with probability $p(x_i)$.
5254

5355
That is,
5456

55-
$$ \mathbb P\{X = x_i\} = p(x_i) \quad \text{for } i= 1, \ldots, n $$
57+
$$
58+
\mathbb P\{X = x_i\} = p(x_i) \quad \text{for } i= 1, \ldots, n
59+
$$
5660

5761
The **mean** or **expected value** of a random variable $X$ with distribution $p$ is
5862

5963
$$
60-
\mathbb{E}[X] = \sum_{i=1}^n x_i p(x_i)
64+
\mathbb{E}[X] = \sum_{i=1}^n x_i p(x_i)
6165
$$
6266

6367
Expectation is also called the *first moment* of the distribution.
@@ -67,16 +71,16 @@ We also refer to this number as the mean of the distribution (represented by) $p
6771
The **variance** of $X$ is defined as
6872

6973
$$
70-
\mathbb{V}[X] = \sum_{i=1}^n (x_i - \mathbb{E}[X])^2 p(x_i)
74+
\mathbb{V}[X] = \sum_{i=1}^n (x_i - \mathbb{E}[X])^2 p(x_i)
7175
$$
7276

7377
Variance is also called the *second central moment* of the distribution.
7478

7579
The **cumulative distribution function** (CDF) of $X$ is defined by
7680

7781
$$
78-
F(x) = \mathbb{P}\{X \leq x\}
79-
= \sum_{i=1}^n \mathbb 1\{x_i \leq x\} p(x_i)
82+
F(x) = \mathbb{P}\{X \leq x\}
83+
= \sum_{i=1}^n \mathbb 1\{x_i \leq x\} p(x_i)
8084
$$
8185

8286
Here $\mathbb 1\{ \textrm{statement} \} = 1$ if "statement" is true and zero otherwise.
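
The three definitions above (mean, variance and CDF) can be checked numerically. The following is a minimal sketch; the support `xs` and the PMF values `ps` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical support and PMF values (illustrative only)
xs = np.array([1.0, 2.0, 3.0, 4.0])
ps = np.array([0.1, 0.2, 0.3, 0.4])   # nonnegative and summing to one

mean = np.sum(xs * ps)                # E[X] = sum_i x_i p(x_i)
var = np.sum((xs - mean)**2 * ps)     # V[X] = sum_i (x_i - E[X])^2 p(x_i)

def cdf(x):
    # F(x) = sum_i 1{x_i <= x} p(x_i); the comparison plays the role of the indicator
    return np.sum((xs <= x) * ps)

print(mean, var, cdf(2.5))
```

The boolean array `xs <= x` is treated as zeros and ones when multiplied by `ps`, which is exactly the indicator in the CDF formula.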
@@ -115,7 +119,6 @@ u.pmf(1)
115119
u.pmf(2)
116120
```
117121

118-
119122
Here's a plot of the probability mass function:
120123

121124
```{code-cell} ipython3
@@ -129,7 +132,6 @@ ax.set_ylabel('PMF')
129132
plt.show()
130133
```
131134

132-
133135
Here's a plot of the CDF:
134136

135137
```{code-cell} ipython3
@@ -143,10 +145,8 @@ ax.set_ylabel('CDF')
143145
plt.show()
144146
```
145147

146-
147148
The CDF jumps up by $p(x_i)$ at $x_i$.
148149

149-
150150
```{exercise}
151151
:label: prob_ex1
152152
@@ -179,7 +179,7 @@ We can import the Bernoulli distribution on $S = \{0,1\}$ from SciPy like so:
179179

180180
```{code-cell} ipython3
181181
θ = 0.4
182-
u = scipy.stats.bernoulli(p)
182+
u = scipy.stats.bernoulli(θ)
183183
```
184184

185185
Here's the mean and variance at $\theta=0.4$
@@ -201,7 +201,7 @@ u.pmf(1)
201201
Another useful (and more interesting) distribution is the **binomial distribution** on $S=\{0, \ldots, n\}$, which has PMF:
202202

203203
$$
204-
p(i) = \binom{n}{i} \theta^i (1-\theta)^{n-i}
204+
p(i) = \binom{n}{i} \theta^i (1-\theta)^{n-i}
205205
$$
206206

207207
Again, $\theta \in [0,1]$ is a parameter.
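
As a quick sanity check, the binomial PMF formula can be compared with `scipy.stats.binom`. This is a sketch with illustrative values of `n` and `θ`, not values taken from the lecture.

```python
from math import comb
import numpy as np
import scipy.stats

n, θ = 10, 0.5                      # illustrative parameter values
u = scipy.stats.binom(n, θ)

# p(i) = C(n, i) θ^i (1 - θ)^(n - i) for i = 0, ..., n
by_hand = np.array([comb(n, i) * θ**i * (1 - θ)**(n - i) for i in range(n + 1)])
from_scipy = u.pmf(np.arange(n + 1))

print(np.allclose(by_hand, from_scipy), by_hand.sum())
```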
@@ -299,7 +299,7 @@ We can see that the output graph is the same as the one above.
299299
The geometric distribution has infinite support $S = \{0, 1, 2, \ldots\}$ and its PMF is given by
300300

301301
$$
302-
p(i) = (1 - \theta)^i \theta
302+
p(i) = (1 - \theta)^i \theta
303303
$$
304304

305305
where $\theta \in [0,1]$ is a parameter.
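
One point worth checking in code: `scipy.stats.geom` is supported on $\{1, 2, \ldots\}$, whereas the PMF above is written for $\{0, 1, 2, \ldots\}$. A sketch of one way to reconcile the two is to shift the SciPy distribution with `loc=-1`; the parameter value is illustrative.

```python
import numpy as np
import scipy.stats

θ = 0.3                                  # illustrative parameter value
u = scipy.stats.geom(θ, loc=-1)          # shift SciPy's support from {1, 2, ...} to {0, 1, ...}

i_vals = np.arange(6)
by_hand = (1 - θ)**i_vals * θ            # p(i) = (1 - θ)^i θ
from_scipy = u.pmf(i_vals)

print(np.allclose(by_hand, from_scipy))
```

Without the shift, `scipy.stats.geom(θ).pmf(i)` returns $(1-\theta)^{i-1}\theta$, which counts the number of trials up to and including the first success rather than the number of failures before it.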
@@ -338,7 +338,7 @@ plt.show()
338338
The Poisson distribution on $S = \{0, 1, \ldots\}$ with parameter $\lambda > 0$ has PMF
339339

340340
$$
341-
p(i) = \frac{\lambda^i}{i!} e^{-\lambda}
341+
p(i) = \frac{\lambda^i}{i!} e^{-\lambda}
342342
$$
343343

344344
The interpretation of $p(i)$ is: the probability of $i$ events in a fixed time interval, where the events occur independently at a constant rate $\lambda$.
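
The Poisson PMF can be verified against `scipy.stats.poisson` in the same way. The sketch below uses an illustrative rate and also confirms that the mean and variance both equal $\lambda$.

```python
from math import exp, factorial
import numpy as np
import scipy.stats

λ = 2.0                                   # illustrative rate
u = scipy.stats.poisson(λ)

# p(i) = λ^i / i! * exp(-λ)
by_hand = np.array([λ**i / factorial(i) * exp(-λ) for i in range(8)])
from_scipy = u.pmf(np.arange(8))

print(np.allclose(by_hand, from_scipy), u.mean(), u.var())
```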
@@ -376,12 +376,14 @@ plt.show()
376376

377377
A continuous distribution is represented by a **probability density function**, which is a function $p$ over $\mathbb R$ (the set of all real numbers) such that $p(x) \geq 0$ for all $x$ and
378378

379-
$$ \int_{-\infty}^\infty p(x) dx = 1 $$
379+
$$
380+
\int_{-\infty}^\infty p(x) dx = 1
381+
$$
380382

381383
We say that random variable $X$ has distribution $p$ if
382384

383385
$$
384-
\mathbb P\{a < X < b\} = \int_a^b p(x) dx
386+
\mathbb P\{a < X < b\} = \int_a^b p(x) dx
385387
$$
386388

387389
for all $a \leq b$.
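
To make the link between the density and interval probabilities concrete, here is a minimal sketch that integrates a density numerically with `scipy.integrate.quad`. The standard normal density and the endpoints `a`, `b` are illustrative choices only.

```python
from scipy.integrate import quad
import scipy.stats

u = scipy.stats.norm()          # standard normal, used here only as an example density
a, b = -1.0, 1.0                # illustrative endpoints

total, _ = quad(u.pdf, -float("inf"), float("inf"))   # the density should integrate to one
prob, _ = quad(u.pdf, a, b)                           # P{a < X < b} as an integral of the density
via_cdf = u.cdf(b) - u.cdf(a)                         # the same probability from the CDF

print(total, prob, via_cdf)
```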
@@ -391,14 +393,14 @@ The definition of the mean and variance of a random variable $X$ with distributi
391393
For example, the mean of $X$ is
392394

393395
$$
394-
\mathbb{E}[X] = \int_{-\infty}^\infty x p(x) dx
396+
\mathbb{E}[X] = \int_{-\infty}^\infty x p(x) dx
395397
$$
396398

397399
The **cumulative distribution function** (CDF) of $X$ is defined by
398400

399401
$$
400-
F(x) = \mathbb P\{X \leq x\}
401-
= \int_{-\infty}^x p(x) dx
402+
F(x) = \mathbb P\{X \leq x\}
403+
= \int_{-\infty}^x p(x) dx
402404
$$
403405

404406

@@ -407,8 +409,8 @@ $$
407409
Perhaps the most famous distribution is the **normal distribution**, which has density
408410

409411
$$
410-
p(x) = \frac{1}{\sqrt{2\pi}\sigma}
411-
\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
412+
p(x) = \frac{1}{\sqrt{2\pi}\sigma}
413+
\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
412414
$$
413415

414416
This distribution has two parameters, $\mu \in \mathbb R$ and $\sigma \in (0, \infty)$.
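
When working with `scipy.stats.norm`, note that the second argument (`scale`) is the standard deviation $\sigma$, not the variance $\sigma^2$. A short sketch with illustrative parameter values:

```python
import numpy as np
import scipy.stats

μ, σ = 1.0, 2.0                           # illustrative parameter values
u = scipy.stats.norm(μ, σ)                # loc = μ, scale = σ (standard deviation)

x_grid = np.linspace(-5, 7, 7)
by_hand = 1 / (np.sqrt(2 * np.pi) * σ) * np.exp(-(x_grid - μ)**2 / (2 * σ**2))

print(np.allclose(by_hand, u.pdf(x_grid)), u.mean(), u.var())
```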
@@ -468,8 +470,8 @@ plt.show()
468470
The **lognormal distribution** is a distribution on $\left(0, \infty\right)$ with density
469471

470472
$$
471-
p(x) = \frac{1}{\sigma x \sqrt{2\pi}}
472-
\exp \left(- \frac{\left(\log x - \mu\right)^2}{2 \sigma^2} \right)
473+
p(x) = \frac{1}{\sigma x \sqrt{2\pi}}
474+
\exp \left(- \frac{\left(\log x - \mu\right)^2}{2 \sigma^2} \right)
473475
$$
474476

475477
This distribution has two parameters, $\mu$ and $\sigma$.
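
SciPy parameterizes the lognormal distribution differently from the formula above: `scipy.stats.lognorm` takes a shape parameter `s` equal to $\sigma$ and a `scale` equal to $e^\mu$. The sketch below, with illustrative parameter values, checks that this mapping reproduces the density.

```python
import numpy as np
import scipy.stats

μ, σ = 0.5, 0.75                                  # illustrative parameter values
u = scipy.stats.lognorm(s=σ, scale=np.exp(μ))     # s = σ, scale = exp(μ)

x_grid = np.linspace(0.1, 5, 6)
by_hand = 1 / (σ * x_grid * np.sqrt(2 * np.pi)) * np.exp(-(np.log(x_grid) - μ)**2 / (2 * σ**2))

print(np.allclose(by_hand, u.pdf(x_grid)))
```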
@@ -530,8 +532,8 @@ plt.show()
530532
The **exponential distribution** is a distribution supported on $\left(0, \infty\right)$ with density
531533

532534
$$
533-
p(x) = \lambda \exp \left( - \lambda x \right)
534-
\qquad (x > 0)
535+
p(x) = \lambda \exp \left( - \lambda x \right)
536+
\qquad (x > 0)
535537
$$
536538

537539
This distribution has one parameter $\lambda$.
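
In SciPy, the exponential distribution is parameterized by `scale`, which corresponds to $1/\lambda$ rather than $\lambda$. A minimal sketch with an illustrative rate:

```python
import numpy as np
import scipy.stats

λ = 2.0                                   # illustrative rate
u = scipy.stats.expon(scale=1/λ)          # scale = 1/λ

x_grid = np.linspace(0.1, 3, 6)
by_hand = λ * np.exp(-λ * x_grid)         # p(x) = λ exp(-λ x) for x > 0

print(np.allclose(by_hand, u.pdf(x_grid)), u.mean())
```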
@@ -586,8 +588,8 @@ plt.show()
586588
The **beta distribution** is a distribution on $(0, 1)$ with density
587589

588590
$$
589-
p(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}
590-
x^{\alpha - 1} (1 - x)^{\beta - 1}
591+
p(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}
592+
x^{\alpha - 1} (1 - x)^{\beta - 1}
591593
$$
592594

593595
where $\Gamma$ is the [gamma function](https://en.wikipedia.org/wiki/Gamma_function).
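
The beta density can be evaluated directly from the formula using `scipy.special.gamma` and compared with `scipy.stats.beta`. The parameter values in this sketch are illustrative, and `gamma_fn` is just a local alias for the gamma function.

```python
import numpy as np
import scipy.stats
from scipy.special import gamma as gamma_fn   # the gamma function Γ

α, β = 2.0, 5.0                               # illustrative parameter values
u = scipy.stats.beta(α, β)

x_grid = np.linspace(0.05, 0.95, 7)
by_hand = (gamma_fn(α + β) / (gamma_fn(α) * gamma_fn(β))
           * x_grid**(α - 1) * (1 - x_grid)**(β - 1))

print(np.allclose(by_hand, u.pdf(x_grid)), u.mean())
```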
@@ -648,8 +650,8 @@ plt.show()
648650
The **gamma distribution** is a distribution on $\left(0, \infty\right)$ with density
649651

650652
$$
651-
p(x) = \frac{\beta^\alpha}{\Gamma(\alpha)}
652-
x^{\alpha - 1} \exp(-\beta x)
653+
p(x) = \frac{\beta^\alpha}{\Gamma(\alpha)}
654+
x^{\alpha - 1} \exp(-\beta x)
653655
$$
654656

655657
This distribution has two parameters, $\alpha > 0$ and $\beta > 0$.
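
With the rate parameterization used above, the matching SciPy object is `scipy.stats.gamma(α, scale=1/β)`, since SciPy's `scale` is the reciprocal of the rate $\beta$. A sketch with illustrative parameter values (`gamma_fn` is a local alias for the gamma function, to avoid clashing with the distribution's name):

```python
import numpy as np
import scipy.stats
from scipy.special import gamma as gamma_fn   # the gamma function Γ

α, β = 3.0, 2.0                               # illustrative shape and rate
u = scipy.stats.gamma(α, scale=1/β)           # scale = 1/β

x_grid = np.linspace(0.5, 4.0, 6)
by_hand = β**α / gamma_fn(α) * x_grid**(α - 1) * np.exp(-β * x_grid)

print(np.allclose(by_hand, u.pdf(x_grid)), u.mean(), u.var())
```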
@@ -745,13 +747,13 @@ Suppose we have an observed distribution with values $\{x_1, \ldots, x_n\}$
745747
The **sample mean** of this distribution is defined as
746748

747749
$$
748-
\bar x = \frac{1}{n} \sum_{i=1}^n x_i
750+
\bar x = \frac{1}{n} \sum_{i=1}^n x_i
749751
$$
750752

751753
The **sample variance** is defined as
752754

753755
$$
754-
\frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2
756+
\frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2
755757
$$
756758

757759
For the income distribution given above, we can calculate these numbers via
