Commit d9912c2

Merge pull request #175 from eli-b/master
Remove word
2 parents 5adeb88 + 242730d commit d9912c2

1 file changed: +2 −2 lines changed

Chapter4_TheGreatestTheoremNeverTold/LawOfLargeNumbers.ipynb

+2-2
@@ -285,7 +285,7 @@
     "\n",
     "Often data comes in aggregated form. For instance, data may be grouped by state, county, or city level. Of course, the population numbers vary per geographic area. If the data is an average of some characteristic of each the geographic areas, we must be conscious of the Law of Large Numbers and how it can *fail* for areas with small populations.\n",
     "\n",
-    "We will observe this on a toy dataset. Suppose there are five thousand counties in our dataset. Furthermore, population number in each state are uniformly distributed between 100 and 1500. The way the population numbers are generated is irrelevant to the discussion, so we do not justify this. We are interested in measuring the average height of individuals per county. Unbeknownst to the us, height does **not** vary across county, and each individual, regardless of the county he or she is currently living in, has the same distribution of what their height may be:\n",
+    "We will observe this on a toy dataset. Suppose there are five thousand counties in our dataset. Furthermore, population number in each state are uniformly distributed between 100 and 1500. The way the population numbers are generated is irrelevant to the discussion, so we do not justify this. We are interested in measuring the average height of individuals per county. Unbeknownst to us, height does **not** vary across county, and each individual, regardless of the county he or she is currently living in, has the same distribution of what their height may be:\n",
     "\n",
     "$$ \\text{height} \\sim \\text{Normal}(150, 15 ) $$\n",
    "\n",
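The paragraph changed in this hunk describes the notebook's toy setup: five thousand counties with uniform populations between 100 and 1500, where every individual's height is drawn from the same Normal(150, 15) distribution. A minimal sketch of that setup (the variable names and the seed are my own, not from the notebook):

```python
import numpy as np

rng = np.random.default_rng(0)

n_counties = 5000
# County populations are uniformly distributed between 100 and 1500.
populations = rng.integers(100, 1501, size=n_counties)

# Height does NOT vary across county: every individual, regardless of
# county, has height ~ Normal(150, 15). The observed per-county averages
# differ only because of sample size, which is the point of the example.
county_means = np.array(
    [rng.normal(150, 15, size=pop).mean() for pop in populations]
)
```

Smaller counties produce noisier averages, so the extreme per-county means tend to come from the least-populated counties.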
@@ -979,7 +979,7 @@
     "\n",
     "Basically what we are doing is using a Beta prior (with parameters $a=1, b=1$, which is a uniform distribution), and using a Binomial likelihood with observations $u, N = u+d$. This means our posterior is a Beta distribution with parameters $a' = 1 + u, b' = 1 + (N - u) = 1+d$. We then need to find the value, $x$, such that 0.05 probability is less than $x$. This is usually done by inverting the CDF ([Cumulative Distribution Function](http://en.wikipedia.org/wiki/Cumulative_Distribution_Function)), but the CDF of the beta, for integer parameters, is known but is a large sum [3]. \n",
     "\n",
-    "We instead using a Normal approximation. The mean of the Beta is $\\mu = a'/(a'+b')$ and the variance is \n",
+    "We instead use a Normal approximation. The mean of the Beta is $\\mu = a'/(a'+b')$ and the variance is \n",
     "\n",
     "$$\\sigma^2 = \\frac{a'b'}{ (a' + b')^2(a'+b'+1) }$$\n",
    "\n",
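The paragraph touched by this hunk approximates the Beta posterior's 5% lower bound with a Normal whose mean and variance are the Beta's. A short sketch of that approximation, checked against the exact Beta quantile (the counts `u` and `d` are hypothetical, and SciPy is assumed available):

```python
import numpy as np
from scipy.stats import beta, norm

u, d = 100, 20               # hypothetical upvote/downvote counts
a, b_ = 1 + u, 1 + d         # Beta posterior parameters a' = 1+u, b' = 1+d

# Mean and variance of the Beta posterior, as in the notebook's formulas.
mu = a / (a + b_)
sigma2 = a * b_ / ((a + b_) ** 2 * (a + b_ + 1))

# Normal approximation to the 5% lower bound of the posterior...
lb_approx = norm.ppf(0.05, loc=mu, scale=np.sqrt(sigma2))

# ...versus the exact value from inverting the Beta CDF.
lb_exact = beta.ppf(0.05, a, b_)
```

For moderately large counts the two bounds are close, which is why the notebook prefers the cheap Normal quantile over the large sum required by the integer-parameter Beta CDF.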

0 commit comments