[prob_dist] Bernoulli distribution section - editorial suggestions #403

Closed
longye-tian opened this issue Mar 18, 2024 · 2 comments

@longye-tian
Collaborator

I've noticed that the content in the Bernoulli distribution section closely mirrors that of the uniform distribution section.

With this in mind, I'd like to propose the following amendments. Would that be alright with you, John (@jstac)?


Another useful distribution is the Bernoulli distribution on $S = \{0, 1\}$, which has PMF:

$$p(x_i) = \begin{cases} p & \text{if $x_i=1$}\\ 1-p & \text{if $x_i = 0$}\\ \end{cases}$$

Here $x_i\in S$ is the outcome of the random variable.

The interpretation of $p(x_i)$ is the probability of 'true' for any single experiment that asks 'True-False' question.

We can import the Bernoulli distribution on $S = \{0, 1\}$ from SciPy like so:

import scipy.stats

p = 0.4  # The probability of True (outcome 1)

u = scipy.stats.bernoulli(p)

Here's the mean and variance:

u.mean(), u.var()

The formula for the mean is $p$, and the formula for the variance is $p(1-p)$.
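As a rough sanity check, both formulas can be recovered by summing over the two points of the support. This is only a sketch: it assumes the same `p = 0.4` as above, and the names `support`, `mean_by_hand`, and `var_by_hand` are illustrative rather than part of the lecture.

```python
import numpy as np
import scipy.stats

p = 0.4
u = scipy.stats.bernoulli(p)

support = np.array([0, 1])
pmf_vals = u.pmf(support)  # probabilities of outcomes 0 and 1

# Mean and variance computed directly from the definition
mean_by_hand = np.sum(support * pmf_vals)
var_by_hand = np.sum((support - mean_by_hand) ** 2 * pmf_vals)

# Compare with the closed-form expressions and with SciPy's values
print(np.isclose(mean_by_hand, p), np.isclose(mean_by_hand, u.mean()))
print(np.isclose(var_by_hand, p * (1 - p)), np.isclose(var_by_hand, u.var()))
```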

Now let's evaluate the PMF:

u.pmf(0)

u.pmf(1)
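One way to connect these PMF values to data is to draw samples and compare the empirical frequencies with `u.pmf`. The sketch below assumes `p = 0.4`; the sample size and the seed passed to `random_state` are arbitrary choices made for illustration.

```python
import numpy as np
import scipy.stats

p = 0.4
u = scipy.stats.bernoulli(p)

# Draw independent samples; the seed is fixed only for reproducibility
draws = u.rvs(size=10_000, random_state=0)

freq_one = np.mean(draws == 1)   # empirical frequency of outcome 1
freq_zero = np.mean(draws == 0)  # empirical frequency of outcome 0

print(freq_zero, u.pmf(0))
print(freq_one, u.pmf(1))
```

With a sample this large, the two frequencies should be close to, but not exactly equal to, the PMF values.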

Here's a plot of the probability mass function:

fig, ax = plt.subplots()
S = np.arange(-4, 6)
ax.plot(S, u.pmf(S), linestyle='', marker='o', alpha=0.8, ms=4)
ax.vlines(S, 0, u.pmf(S), lw=0.2)
ax.set_xticks(S)
plt.show()

Here's a plot of the CDF:

fig, ax = plt.subplots()
S = np.arange(-4, 6)
ax.step(S, u.cdf(S), where='post')  # 'post' keeps the CDF right-continuous
ax.vlines(S, 0, u.cdf(S), lw=0.2)
ax.set_xticks(S)
plt.show()

The CDF jumps up by $p(x_i)$ at each $x_i \in S$ and is flat everywhere else.
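The jump sizes can be checked numerically by evaluating the CDF at each point and just to its left. This is a minimal sketch, assuming `p = 0.4`; the offset `eps` and the dictionary `jumps` are illustrative names, not part of the lecture.

```python
import numpy as np
import scipy.stats

p = 0.4
u = scipy.stats.bernoulli(p)

eps = 1e-9  # small offset used to evaluate the CDF just left of each point

# Size of the CDF jump at each point of the support
jumps = {x: u.cdf(x) - u.cdf(x - eps) for x in (0, 1)}

for x, jump in jumps.items():
    print(x, jump, u.pmf(x))  # the jump should match the PMF value
```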


Best,
Longye

@jstac
Contributor

jstac commented Mar 18, 2024

Many thanks for picking this up @longye-tian.

Let's simplify this section, since the random variable is so simple. I suggest we drop the sentence "The interpretation of $p(x_i)$ is the probability of 'true' for any single experiment that asks 'True-False' question." and also the two plots.

@longye-tian
Collaborator Author

Thank you, John. I agree that given the simplicity of the random variable, removing the specified sentence and the two plots will streamline our content effectively. I'll proceed with making these changes.

Best,
Longye
