
Commit 03d1f18

merging gusta's edits
2 parents 8c7774c + 7e624d0 commit 03d1f18


6 files changed: +153 -227 lines changed


.gitignore

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
 .DS_Store
 *.pyc
-*.png
 *~

Chapter1_Introduction/Chapter1_Introduction.ipynb

Lines changed: 10 additions & 10 deletions
@@ -90,7 +90,7 @@
 "\n",
 "It's clear that in each example we did not completely discard the prior belief after seeing new evidence $X$, but we *re-weighted the prior* to incorporate the new evidence (i.e. we put more weight, or confidence, on some beliefs versus others). \n",
 "\n",
-"By introducing prior uncertainty about events, we are already admitting that any guess we make is potentially very wrong. After observing data, evidence, or other information, and we update our beliefs, our guess becomes *less wrong*. This is the alternative side of the prediction coin, where typically we try to be *more right*.\n"
+"By introducing prior uncertainty about events, we are already admitting that any guess we make is potentially very wrong. After observing data, evidence, or other information, we update our beliefs, and our guess becomes *less wrong*. This is the alternative side of the prediction coin, where typically we try to be *more right*.\n"
 ]
 },
 {
@@ -233,7 +233,7 @@
 "\n",
 "That being said, it does assign a positive probability to $p$ really being 0.5. As more data accumulates, we would see more and more probability being assigned at $p=0.5$.\n",
 "\n",
-"The next example demonstrates a simple demonstration of the mathematics of Bayesian updating. "
+"The next example is a simple demonstration of the mathematics of Bayesian updating. "
 ]
 },
 {
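The paragraph in this hunk refers back to the chapter's coin-flip posteriors, where more data piles more probability onto $p=0.5$. A minimal sketch of that idea (not part of this commit), assuming a uniform Beta(1, 1) prior and a fair coin, so the posterior after h heads in n flips is Beta(1 + h, 1 + n - h):

    # Sketch only: posterior of a coin's bias p tightens around 0.5 as fair-coin data accumulates.
    import scipy.stats as stats

    for n in [5, 50, 500]:
        heads = n // 2                               # a fair coin, for illustration
        posterior = stats.beta(1 + heads, 1 + n - heads)
        print(n, posterior.mean(), posterior.std())  # the standard deviation shrinks as n grows

The shrinking spread is the "more and more probability being assigned at $p=0.5$" that the text describes.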
@@ -305,7 +305,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We can see the biggest gains if we observe the $X$ tests passed are when the prior probability, $p$, is low. Let's settle on a specific value for the prior. I'm a strong programmer (I think), so I'm going to give myself a realistic prior of 0.20, that is, there is a 20% chance that I write code bug-free. To be more realistic, this prior should be a function of how complicated and large the code is, but let's pin it at 0.20. Then my updated belief that my code is bug-free is 0.33. \n",
+"We can see the biggest gains if we observe the $X$ tests passed when the prior probability, $p$, is low. Let's settle on a specific value for the prior. I'm a strong programmer (I think), so I'm going to give myself a realistic prior of 0.20, that is, there is a 20% chance that I write code bug-free. To be more realistic, this prior should be a function of how complicated and large the code is, but let's pin it at 0.20. Then my updated belief that my code is bug-free is 0.33. \n",
 "\n",
 "Recall that the prior is a probability: $p$ is the prior probability that there *are no bugs*, so $1-p$ is the prior probability that there *are bugs*.\n",
 "\n",
@@ -387,7 +387,7 @@
 "\n",
 "$$E\\large[ \\;Z\\; | \\; \\lambda \\;\\large] = \\lambda $$\n",
 "\n",
-"We will use this property often, so it's something useful to remember. Below we plot the probability mass distribution for different $\\lambda$ values. The first thing to notice is that by increasing $\\lambda$ we add more probability to larger values occurring. Secondly, notice that although the graph ends at 15, the distributions do not. They assign positive probability to every non-negative integer.."
+"We will use this property often, so it's something useful to remember. Below we plot the probability mass distribution for different $\\lambda$ values. The first thing to notice is that by increasing $\\lambda$ we add more probability to larger values occurring. Secondly, notice that although the graph ends at 15, the distributions do not. They assign positive probability to every non-negative integer."
 ]
 },
 {
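The plot described above is straightforward to reproduce; a sketch using scipy and matplotlib, with two illustrative lambda values rather than any prescribed ones:

    # Sketch: Poisson probability mass functions for two lambda values,
    # evaluated only on 0..15 even though each distribution puts positive
    # mass on every non-negative integer.
    import numpy as np
    import scipy.stats as stats
    import matplotlib.pyplot as plt

    k = np.arange(16)
    for lam in (1.5, 4.25):             # illustrative rates
        plt.bar(k, stats.poisson.pmf(k, lam), alpha=0.6, label=r"$\lambda = %.2f$" % lam)
    plt.xlabel("$k$")
    plt.ylabel("probability of $k$")
    plt.legend()
    plt.show()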
@@ -571,13 +571,13 @@
 "& \\Rightarrow P( \\tau = k ) = \\frac{1}{70}\n",
 "\\end{align}\n",
 "\n",
-"So after all this, what does our overall prior for the unknown variables look like? Frankly, *it doesn't matter*. What we should understand is that it would be an ugly, complicated, mess involving symbols only a mathematician would love. And things would only get uglier the more complicated our models become. Regardless, all we really care about is the posterior distribution. We next turn to PyMC, a Python library for performing Bayesian analysis, and is agnostic to the mathematical monster we have created. \n",
+"So after all this, what does our overall prior for the unknown variables look like? Frankly, *it doesn't matter*. What we should understand is that it would be an ugly, complicated mess involving symbols only a mathematician would love. And things would only get uglier the more complicated our models become. Regardless, all we really care about is the posterior distribution. We next turn to PyMC, a Python library for performing Bayesian analysis that is agnostic to the mathematical monster we have created. \n",
 "\n",
 "\n",
 "Introducing our first hammer: PyMC\n",
 "-----\n",
 "\n",
-"PyMC is a Python library for programming Bayesian analysis [3]. It is a fast, well-maintained library. The only unfortunate part is that documentation can be lacking in areas, especially the bridge between beginner to hacker. One this book's main goals is to solve that problem, and also to demonstrate why PyMC is so cool.\n",
+"PyMC is a Python library for programming Bayesian analysis [3]. It is a fast, well-maintained library. The only unfortunate part is that documentation can be lacking in areas, especially the bridge from beginner to hacker. One of this book's main goals is to solve that problem, and also to demonstrate why PyMC is so cool.\n",
 "\n",
 "We will model the above problem using the PyMC library. This type of programming is called *probabilistic programming*, an unfortunate misnomer that invokes ideas of randomly-generated code and has likely confused and frightened users away from this field. The code is not random. The title is given because we create probability models using programming variables as the model's components, that is, model components are first-class primitives in this framework. \n",
 "\n",
@@ -615,7 +615,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In the above code, we create the PyMC variables corresponding to $\\lambda_1, \\; \\lambda_2$ in lines `8,9`. We assign them to PyMC's *stochastic variables*, called stochastic variables because they are treated by the backend as random number generators. We can test this by calling their built-in `random()` method."
+"In the above code, we create the PyMC variables corresponding to $\\lambda_1, \\; \\lambda_2$. We assign them to PyMC's *stochastic variables*, so named because they are treated by the backend as random number generators. We can test this by calling their built-in `random()` method."
 ]
 },
 {
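The cell this hunk refers to defines the two exponential stochastic variables; a sketch under the chapter's conventions (PyMC 2.x, rate `alpha` set from the sample mean; the data path is an assumption based on the repository layout):

    # Sketch: the two exponential stochastic variables and their random() method.
    import numpy as np
    import pymc as pm

    count_data = np.loadtxt("data/txtdata.csv")    # daily text-message counts (path assumed)
    alpha = 1.0 / count_data.mean()

    lambda_1 = pm.Exponential("lambda_1", alpha)
    lambda_2 = pm.Exponential("lambda_2", alpha)

    print(lambda_1.random(), lambda_2.random())    # each call yields a fresh random draw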
@@ -677,7 +677,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The variable `observations` combines our data, `count_data`, with our proposed data-generation scheme, given by the variable `lambda_`, through the `value` keyword. We also set `observed = True` to tell PyMC that this should stay fixed in our analysis. Finally, PyMC wants us to collect all the variables of interest and create a `Model` instance out of them. This makes our life easier when we try to retrieve the results.\n",
+"The variable `observation` combines our data, `count_data`, with our proposed data-generation scheme, given by the variable `lambda_`, through the `value` keyword. We also set `observed = True` to tell PyMC that this should stay fixed in our analysis. Finally, PyMC wants us to collect all the variables of interest and create a `Model` instance out of them. This makes our life easier when we try to retrieve the results.\n",
 "\n",
 "The code below will be explained in Chapter 3, but this is where our results come from. One can think of it as a *learning* step. The machinery being employed is called *Markov Chain Monte Carlo* (which I delay explaining until Chapter 3). It returns thousands of random variables from the posterior distributions of $\\lambda_1, \\lambda_2$ and $\\tau$. We can plot a histogram of the random variables to see what the posterior distribution looks like. Below, we collect the samples (called *traces* in MCMC literature) in histograms."
 ]
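A hedged sketch of the observed variable and the sampling step described above, continuing from the previous sketch (PyMC 2.x; the switchpoint construction and the iteration counts are illustrative, not necessarily the notebook's exact values):

    # Continuing from the sketch above: count_data, alpha, lambda_1, lambda_2.
    import numpy as np
    import pymc as pm

    n_count_data = count_data.shape[0]
    tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data)

    @pm.deterministic
    def lambda_(tau=tau, lambda_1=lambda_1, lambda_2=lambda_2):
        out = np.zeros(n_count_data)   # a rate for every day
        out[:tau] = lambda_1           # lambda_1 before the switchpoint
        out[tau:] = lambda_2           # lambda_2 after it
        return out

    observation = pm.Poisson("obs", lambda_, value=count_data, observed=True)
    model = pm.Model([observation, lambda_1, lambda_2, tau])

    mcmc = pm.MCMC(model)
    mcmc.sample(40000, 10000)                        # draw samples, discarding a burn-in period

    lambda_1_samples = mcmc.trace("lambda_1")[:]     # the "traces": posterior samples
    lambda_2_samples = mcmc.trace("lambda_2")[:]
    tau_samples = mcmc.trace("tau")[:]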
@@ -784,7 +784,7 @@
 "\n",
 "Recall that the Bayesian methodology returns a *distribution*, hence we now have distributions to describe the unknown $\\lambda$'s and $\\tau$. What have we gained? Immediately we can see the uncertainty in our estimates: the more variance in the distribution, the less certain our posterior belief should be. We can also say what a plausible value for the parameters might be: $\\lambda_1$ is around 18 and $\\lambda_2$ is around 23. What other observations can you make? Look at the data again: do these seem reasonable? The distributions of the two $\\lambda$'s are positioned very differently, indicating that it's likely there was a change in the user's text-message behaviour.\n",
 "\n",
-"Also notice that the posteriors' distributions do not look like any Poisson distributions, though we originally started modelling with Poisson random variables. They are really not anything we recognize. But this is OK. This is one of the benefits of taking a computational point-of-view. If we had instead done this mathematically, we would have been stuck with a very analytically intractable (and messy) distribution. Via computations, we are agnostic to the tractability.\n",
+"Also notice that the posterior distributions for the $\\lambda$'s do not look like any exponential distributions, though we originally started modelling with exponential random variables. They are really not anything we recognize. But this is OK. This is one of the benefits of taking a computational point-of-view. If we had instead done this mathematically, we would have been stuck with an analytically intractable (and messy) distribution. Via computations, we are agnostic to the tractability.\n",
 "\n",
 "Our analysis also returned a distribution for what $\\tau$ might be. Its posterior distribution looks a little different from the other two because it is a discrete random variable, hence it doesn't assign probabilities to intervals. We can see that near day 45, there was a 50% chance the user's behaviour changed. Had no change occurred, or the change been gradual over time, the posterior distribution of $\\tau$ would have been more spread out, reflecting that many values are likely candidates for $\\tau$. On the contrary, it is very peaked. "
 ]
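Because $\tau$ is discrete, its posterior assigns mass to individual days rather than intervals, and that mass can be read off the trace directly; a small follow-on sketch reusing `tau_samples` from the sampling sketch above:

    # Sketch: posterior mass of the switchpoint per day, from the MCMC trace.
    import numpy as np

    days, counts = np.unique(tau_samples, return_counts=True)
    for day, mass in zip(days, counts / float(len(tau_samples))):
        print("P(tau = day %d) ~ %.2f" % (day, mass))   # most of the mass piles onto one or two days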
@@ -887,7 +887,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"2\\. What is the expected percent increase text-message rates? `hint:` compute the mean of `lambda_1_samples/lambda_2_samples`. Note that quantity is very different from `lambda_1_samples.mean()/lambda_2_samples.mean()`."
+"2\\. What is the expected percentage increase in text-message rates? `hint:` compute the mean of `lambda_1_samples/lambda_2_samples`. Note that this quantity is very different from `lambda_1_samples.mean()/lambda_2_samples.mean()`."
 ]
 },
 {
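A two-line sketch of the exercise's hint, reusing the sample arrays from the sampling sketch above: the mean of the per-sample ratio generally differs from the ratio of the means.

    # Mean of the ratio vs. ratio of the means -- in general these are not equal.
    print((lambda_1_samples / lambda_2_samples).mean())
    print(lambda_1_samples.mean() / lambda_2_samples.mean())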
