pymc-devs
diff --git a/examples/case_studies/BEST.ipynb b/examples/case_studies/BEST.ipynb
+7 -7
diff --git a/examples/case_studies/LKJ.ipynb b/examples/case_studies/LKJ.ipynb
+10 -10
diff --git a/examples/case_studies/bayesian_ab_testing.ipynb b/examples/case_studies/bayesian_ab_testing.ipynb
+16 -16
@@ -16,7 +16,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Running on PyMC3 v3.11.0\n"
+      "Running on PyMC v3.11.0\n"
      ]
     }
    ],
@@ -25,9 +25,9 @@
     "import matplotlib.pyplot as plt\n",
     "import numpy as np\n",
     "import pandas as pd\n",
-    "import pymc3 as pm\n",
+    "import pymc as pm\n",
     "\n",
-    "print(f\"Running on PyMC3 v{pm.__version__}\")"
+    "print(f\"Running on PyMC v{pm.__version__}\")"
    ]
   },
   {
@@ -222,7 +222,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Since PyMC3 parameterizes the Student-T in terms of precision, rather than standard deviation, we must transform the standard deviations before specifying our likelihoods."
+    "Since PyMC parameterizes the Student-T in terms of precision, rather than standard deviation, we must transform the standard deviations before specifying our likelihoods."
    ]
   },
   {
@@ -323,7 +323,7 @@
    ],
    "source": [
     "with model:\n",
-    "    trace = pm.sample(2000, return_inferencedata=True)"
+    "    trace = pm.sample(2000)"
    ]
   },
   {
@@ -574,7 +574,7 @@
    "source": [
     "The original pymc2 implementation was written by Andrew Straw and can be found here: https://github.com/strawlab/best\n",
     "\n",
-    "Ported to PyMC3 by [Thomas Wiecki](https://twitter.com/twiecki) (c) 2015, updated by Chris Fonnesbeck."
+    "Ported to PyMC by [Thomas Wiecki](https://twitter.com/twiecki) (c) 2015, updated by Chris Fonnesbeck."
    ]
   },
   {
@@ -596,7 +596,7 @@
       "numpy     : 1.19.2\n",
       "matplotlib: 3.3.2\n",
       "arviz     : 0.11.2\n",
-      "pymc3     : 3.11.0\n",
+      "pymc     : 3.11.0\n",
       "\n",
       "Watermark: 2.2.0\n",
       "\n"
 
@@ -11,7 +11,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "While the [inverse-Wishart distribution](https://en.wikipedia.org/wiki/Inverse-Wishart_distribution) is the conjugate prior for the covariance matrix of a multivariate normal distribution, it is [not very well-suited](https://github.com/pymc-devs/pymc3/issues/538#issuecomment-94153586) to modern Bayesian computational methods.  For this reason, the [LKJ prior](http://www.sciencedirect.com/science/article/pii/S0047259X09000876) is recommended when modeling the covariance matrix of a multivariate normal distribution.\n",
+    "While the [inverse-Wishart distribution](https://en.wikipedia.org/wiki/Inverse-Wishart_distribution) is the conjugate prior for the covariance matrix of a multivariate normal distribution, it is [not very well-suited](https://github.com/pymc-devs/pymc/issues/538#issuecomment-94153586) to modern Bayesian computational methods.  For this reason, the [LKJ prior](http://www.sciencedirect.com/science/article/pii/S0047259X09000876) is recommended when modeling the covariance matrix of a multivariate normal distribution.\n",
     "\n",
     "To illustrate modelling covariance with the LKJ distribution, we first generate a two-dimensional normally-distributed sample data set."
    ]
@@ -25,21 +25,21 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Running on PyMC3 v3.11.2\n"
+      "Running on PyMC v3.11.2\n"
      ]
     }
    ],
    "source": [
     "import arviz as az\n",
     "import numpy as np\n",
-    "import pymc3 as pm\n",
+    "import pymc as pm\n",
     "import seaborn as sns\n",
     "\n",
     "from matplotlib import pyplot as plt\n",
     "from matplotlib.lines import Line2D\n",
     "from matplotlib.patches import Ellipse\n",
     "\n",
-    "print(f\"Running on PyMC3 v{pm.__version__}\")"
+    "print(f\"Running on PyMC v{pm.__version__}\")"
    ]
   },
   {
@@ -135,7 +135,7 @@
     "\n",
     "The LKJ distribution provides a prior on the correlation matrix, $\\mathbf{C} = \\textrm{Corr}(x_i, x_j)$, which, combined with priors on the standard deviations of each component, [induces](http://www3.stat.sinica.edu.tw/statistica/oldpdf/A10n416.pdf) a prior on the covariance matrix, $\\Sigma$. Since inverting $\\Sigma$ is numerically unstable and inefficient, it is computationally advantageous to use the [Cholesky decompositon](https://en.wikipedia.org/wiki/Cholesky_decomposition) of $\\Sigma$, $\\Sigma = \\mathbf{L} \\mathbf{L}^{\\top}$, where $\\mathbf{L}$ is a lower-triangular matrix. This decompositon allows computation of the term $(\\mathbf{x} - \\mu)^{\\top} \\Sigma^{-1} (\\mathbf{x} - \\mu)$ using back-substitution, which is more numerically stable and efficient than direct matrix inversion.\n",
     "\n",
-    "PyMC3 supports LKJ priors for the Cholesky decomposition of the covariance matrix via the [LKJCholeskyCov](../api/distributions/multivariate.rst) distribution. This distribution has parameters `n` and `sd_dist`, which are the dimension of the observations, $\\mathbf{x}$, and the PyMC3 distribution of the component standard deviations, respectively. It also has a hyperparamter `eta`, which controls the amount of correlation between components of $\\mathbf{x}$. The LKJ distribution has the density $f(\\mathbf{C}\\ |\\ \\eta) \\propto |\\mathbf{C}|^{\\eta - 1}$, so $\\eta = 1$ leads to a uniform distribution on correlation matrices, while the magnitude of correlations between components decreases as $\\eta \\to \\infty$.\n",
+    "PyMC supports LKJ priors for the Cholesky decomposition of the covariance matrix via the [LKJCholeskyCov](../api/distributions/multivariate.rst) distribution. This distribution has parameters `n` and `sd_dist`, which are the dimension of the observations, $\\mathbf{x}$, and the PyMC distribution of the component standard deviations, respectively. It also has a hyperparamter `eta`, which controls the amount of correlation between components of $\\mathbf{x}$. The LKJ distribution has the density $f(\\mathbf{C}\\ |\\ \\eta) \\propto |\\mathbf{C}|^{\\eta - 1}$, so $\\eta = 1$ leads to a uniform distribution on correlation matrices, while the magnitude of correlations between components decreases as $\\eta \\to \\infty$.\n",
     "\n",
     "In this example, we model the standard deviations with $\\textrm{Exponential}(1.0)$ priors, and the correlation matrix as $\\mathbf{C} \\sim \\textrm{LKJ}(\\eta = 2)$."
    ]
@@ -267,7 +267,7 @@
      "text": [
       "Auto-assigning NUTS sampler...\n",
       "Initializing NUTS using adapt_diag...\n",
-      "WARNING (theano.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.\n",
+      "WARNING (aesara.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.\n",
       "Multiprocess sampling (4 chains in 4 jobs)\n",
       "NUTS: [μ, chol]\n"
      ]
@@ -544,7 +544,7 @@
     "    trace = pm.sample(\n",
     "        random_seed=RANDOM_SEED,\n",
     "        init=\"adapt_diag\",\n",
-    "        return_inferencedata=True,\n",
+    "        ,\n",
     "        idata_kwargs={\"dims\": {\"chol_stds\": [\"axis\"], \"chol_corr\": [\"axis\", \"axis_bis\"]}},\n",
     "    )\n",
     "az.summary(trace, var_names=\"~chol\", round_to=2)"
@@ -761,14 +761,14 @@
       "Python version       : 3.8.10\n",
       "IPython version      : 7.25.0\n",
       "\n",
-      "theano: 1.1.2\n",
+      "aesara: 1.1.2\n",
       "xarray: 0.17.0\n",
       "\n",
       "matplotlib: 3.3.4\n",
       "arviz     : 0.11.2\n",
       "seaborn   : 0.11.1\n",
       "numpy     : 1.21.0\n",
-      "pymc3     : 3.11.2\n",
+      "pymc     : 3.11.2\n",
       "\n",
       "Watermark: 2.2.0\n",
       "\n"
@@ -777,7 +777,7 @@
    ],
    "source": [
     "%load_ext watermark\n",
-    "%watermark -n -u -v -iv -w -p theano,xarray"
+    "%watermark -n -u -v -iv -w -p aesara,xarray"
    ]
   }
  ],
 
@@ -10,7 +10,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Running on PyMC3 v3.11.2\n"
+      "Running on PyMC v3.11.2\n"
      ]
     }
    ],
@@ -22,12 +22,12 @@
     "import matplotlib.pyplot as plt\n",
     "import numpy as np\n",
     "import pandas as pd\n",
-    "import pymc3 as pm\n",
-    "import pymc3.math as pmm\n",
+    "import pymc as pm\n",
+    "import pymc.math as pmm\n",
     "\n",
     "from scipy.stats import bernoulli, expon\n",
     "\n",
-    "print(f\"Running on PyMC3 v{pm.__version__}\")"
+    "print(f\"Running on PyMC v{pm.__version__}\")"
    ]
   },
   {
@@ -98,13 +98,13 @@
     "\n",
     "With this, we can sample from the joint posterior of $\\theta_A, \\theta_B$. \n",
     "\n",
-    "You may have noticed that the Beta distribution is the conjugate prior for the Binomial, so we don't need MCMC sampling to estimate the posterior (the exact solution can be found in the VWO paper). We'll still demonstrate how sampling can be done with PyMC3 though, and doing this makes it easier to extend the model with different priors, dependency assumptions, etc.\n",
+    "You may have noticed that the Beta distribution is the conjugate prior for the Binomial, so we don't need MCMC sampling to estimate the posterior (the exact solution can be found in the VWO paper). We'll still demonstrate how sampling can be done with PyMC though, and doing this makes it easier to extend the model with different priors, dependency assumptions, etc.\n",
     "\n",
     "Finally, remember that our outcome of interest is whether B is better than A. A common measure in practice for whether B is better than is the _relative uplift in conversion rates_, i.e. the percentage difference of $\\theta_B$ over $\\theta_A$:\n",
     "\n",
     "$$\\mathrm{reluplift}_B = \\theta_B / \\theta_A - 1$$\n",
     "\n",
-    "We'll implement this model setup in PyMC3 below."
+    "We'll implement this model setup in PyMC below."
    ]
   },
   {
@@ -181,7 +181,7 @@
    "id": "8e1f6ca4",
    "metadata": {},
    "source": [
-    "Note that we can pass in arbitrary values for the observed data in these prior predictive checks. PyMC3 will not use that data when sampling from the prior predictive distribution."
+    "Note that we can pass in arbitrary values for the observed data in these prior predictive checks. PyMC will not use that data when sampling from the prior predictive distribution."
    ]
   },
   {
@@ -404,9 +404,9 @@
     "    generated = generate_binomial_data(variants, true_rates, samples_per_variant)\n",
     "    data = [BinomialData(**generated[v].to_dict()) for v in variants]\n",
     "    with ConversionModelTwoVariant(priors=weak_prior).create_model(data):\n",
-    "        trace_weak = pm.sample(draws=5000, return_inferencedata=True, cores=1, chains=2)\n",
+    "        trace_weak = pm.sample(draws=5000, cores=1, chains=2)\n",
     "    with ConversionModelTwoVariant(priors=strong_prior).create_model(data):\n",
-    "        trace_strong = pm.sample(draws=5000, return_inferencedata=True, cores=1, chains=2)\n",
+    "        trace_strong = pm.sample(draws=5000, cores=1, chains=2)\n",
     "\n",
     "    true_rel_uplift = true_rates[1] / true_rates[0] - 1\n",
     "\n",
@@ -884,7 +884,7 @@
     "    generated = generate_binomial_data(variants, true_rates, samples_per_variant)\n",
     "    data = [BinomialData(**generated[v].to_dict()) for v in variants]\n",
     "    with ConversionModel(priors).create_model(data=data, comparison_method=comparison_method):\n",
-    "        trace = pm.sample(draws=5000, return_inferencedata=True, chains=2, cores=1)\n",
+    "        trace = pm.sample(draws=5000, chains=2, cores=1)\n",
     "\n",
     "    n_plots = len(variants)\n",
     "    fig, axs = plt.subplots(nrows=n_plots, ncols=1, figsize=(3 * n_plots, 7), sharex=True)\n",
@@ -1439,7 +1439,7 @@
     "    with RevenueModel(conversion_rate_prior, mean_purchase_prior).create_model(\n",
     "        data, comparison_method\n",
     "    ):\n",
-    "        trace = pm.sample(draws=5000, return_inferencedata=True, chains=2, cores=1)\n",
+    "        trace = pm.sample(draws=5000, chains=2, cores=1)\n",
     "\n",
     "    n_plots = len(variants)\n",
     "    fig, axs = plt.subplots(nrows=n_plots, ncols=1, figsize=(3 * n_plots, 7), sharex=True)\n",
@@ -1895,9 +1895,9 @@
     "* How do we plan the length and size of A/B tests using power analysis, if we're using Bayesian models to analyse the results?\n",
     "* Outside of the conversion rates (bernoulli random variables for each visitor), many value distributions in online software cannot be fit with nice densities like Normal, Gamma, etc. How do we model these?\n",
     "\n",
-    "Various textbooks and online resources dive into these areas in more detail. [Doing Bayesian Data Analysis](http://doingbayesiandataanalysis.blogspot.com/) by John Kruschke is a great resource, and has been translated to PyMC3 here: https://github.com/JWarmenhoven/DBDA-python.\n",
+    "Various textbooks and online resources dive into these areas in more detail. [Doing Bayesian Data Analysis](http://doingbayesiandataanalysis.blogspot.com/) by John Kruschke is a great resource, and has been translated to PyMC here: https://github.com/JWarmenhoven/DBDA-python.\n",
     "\n",
-    "We also plan to create more PyMC3 tutorials on these topics, so stay tuned!\n",
+    "We also plan to create more PyMC tutorials on these topics, so stay tuned!\n",
     "\n",
     "---\n",
     "\n",
@@ -1924,10 +1924,10 @@
       "Python version       : 3.8.6\n",
       "IPython version      : 7.23.1\n",
       "\n",
-      "theano: 1.1.2\n",
+      "aesara: 1.1.2\n",
       "xarray: 0.18.0\n",
       "\n",
-      "pymc3     : 3.11.2\n",
+      "pymc     : 3.11.2\n",
       "arviz     : 0.11.2\n",
       "matplotlib: 3.4.2\n",
       "pandas    : 1.2.4\n",
@@ -1940,7 +1940,7 @@
    ],
    "source": [
     "%load_ext watermark\n",
-    "%watermark -n -u -v -iv -w -p theano,xarray"
+    "%watermark -n -u -v -iv -w -p aesara,xarray"
    ]
   }
  ],
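
Review note: the hunks above apply three textual renames (`pymc3` to `pymc`, `PyMC3` to `PyMC`, `theano` to `aesara`) and drop the `return_inferencedata=True` argument from `pm.sample`, presumably because newer PyMC releases return `InferenceData` by default. A line-wise replacement of that argument can leave a lone `,` behind in a multi-line call, as in the LKJ.ipynb hunk at line 544. The following is a minimal sketch of a rewrite that removes the argument together with one adjacent comma. It is not the tool used for this change; `migrate_source` and the regex names are hypothetical, and it assumes each cell's source has already been joined into a single string.

```python
import re

# Plain-text renames visible in the diff (old name -> new name).
RENAMES = [
    (re.compile(r"\bpymc3\b"), "pymc"),
    (re.compile(r"\bPyMC3\b"), "PyMC"),
    (re.compile(r"\btheano\b"), "aesara"),
]

_KWARG = r"return_inferencedata\s*=\s*True"
# Argument preceded by another argument: drop it with the comma before it,
# so the comma after it (if any) keeps separating the remaining arguments.
_DROP_AFTER_COMMA = re.compile(r",\s*" + _KWARG)
# Argument in first position: drop it with the comma after it, if present.
_DROP_LEADING = re.compile(_KWARG + r"\s*,?\s*")


def migrate_source(source: str) -> str:
    """Apply the renames and remove `return_inferencedata=True` (hypothetical helper)."""
    for pattern, replacement in RENAMES:
        source = pattern.sub(replacement, source)
    source = _DROP_AFTER_COMMA.sub("", source)
    source = _DROP_LEADING.sub("", source)
    return source


if __name__ == "__main__":
    cell = (
        "import pymc3 as pm\n"
        "with model:\n"
        "    trace = pm.sample(\n"
        "        init=\"adapt_diag\",\n"
        "        return_inferencedata=True,\n"
        "        idata_kwargs={},\n"
        "    )\n"
    )
    print(migrate_source(cell))
```

Removing the comma that precedes the argument, rather than the one on its own line, is what keeps the multi-line call syntactically valid. The regexes only cover the call shapes that appear in this diff and are not a general-purpose Python rewriter.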