Skip to content

Commit a734a05

Browse files
author
Yoav Ram
committed
fix equation, added docs
1 parent 2c8d18c commit a734a05

File tree

1 file changed

+63
-15
lines changed

1 file changed

+63
-15
lines changed

likelihood ratio test.ipynb

+63-15
Original file line numberDiff line numberDiff line change
@@ -2,24 +2,15 @@
22
"cells": [
33
{
44
"cell_type": "code",
5-
"execution_count": 1,
5+
"execution_count": null,
66
"metadata": {
77
"collapsed": false
88
},
9-
"outputs": [
10-
{
11-
"name": "stderr",
12-
"output_type": "stream",
13-
"text": [
14-
":0: FutureWarning: IPython widgets are experimental and may change in the future.\n"
15-
]
16-
}
17-
],
9+
"outputs": [],
1810
"source": [
1911
"%matplotlib inline\n",
2012
"from numpy import linspace, log, exp\n",
2113
"from numpy.random import normal\n",
22-
"from scipy import stats\n",
2314
"from lmfit import Model\n",
2415
"import seaborn as sns\n",
2516
"sns.set_style('ticks')\n",
@@ -30,14 +21,31 @@
3021
"cell_type": "markdown",
3122
"metadata": {},
3223
"source": [
33-
"The likelihood ratio test works like this (based on [this](http://www.stat.sc.edu/~habing/courses/703/GLRTExample.pdf)):\n",
24+
"# The likelihood ratio test: Python implementation"
25+
]
26+
},
27+
{
28+
"cell_type": "markdown",
29+
"metadata": {},
30+
"source": [
31+
"For two models, one nested in the other (meaning that the nested model estimated parameters are a subset of the nesting model), the test statistic $D$ is (based on [this](http://www.stat.sc.edu/~habing/courses/703/GLRTExample.pdf)):\n",
3432
"\n",
3533
"$$\n",
36-
"D = -2 log \\Lambda = -2 log ( \\Big(\\frac{\\sum{(X_i - \\hat{X_i}(\\theta_1))^2}}{\\sum{(X_i - \\hat{X_i}(\\theta_0))^2}}\\Big)^{n/2} ) \\\\\n",
37-
"lim_{n \\to \\infty} D = \\chi^2_{df=\\Delta}\n",
34+
"\\Lambda = \\Big( \\Big(\\frac{\\sum{(X_i - \\hat{X_i}(\\theta_1))^2}}{\\sum{(X_i - \\hat{X_i}(\\theta_0))^2}}\\Big)^{n/2} \\Big) \\\\\n",
35+
"D = -2 log \\Lambda \\\\\n",
36+
"lim_{n \\to \\infty} D \\sim \\chi^2_{df=\\Delta}\n",
3837
"$$\n",
3938
"\n",
40-
"where $\\Lambda$ is the likelihood ratio, $D$ is the statistic, $X_{i}$ are the data points, $X_i(\\theta)$ is the model prediction with parameters $\\theta$, $\\theta_i$ is the parameters estimation for model $i$, $n$ is the number of data points and $\\Delta$ is the difference in number of parameters between the models."
39+
"where $\\Lambda$ is the likelihood ratio, $D$ is the statistic, $X_{i}$ are the data points, $\\hat{X_i}(\\theta)$ is the model prediction with parameters $\\theta$, $\\theta_i$ is the parameters estimation for model $i$, $n$ is the number of data points and $\\Delta$ is the difference in number of parameters between the models."
40+
]
41+
},
42+
{
43+
"cell_type": "markdown",
44+
"metadata": {},
45+
"source": [
46+
"The python implementation below compares between two `lmfit.ModelFit` objects. These are the results of fitting models to the same data set using the [`lmfit` package](lmfit.github.io/lmfit-py/). \n",
47+
"\n",
48+
"The function compares between model fit `m0` and `m1` and assumes that `m0` is nested in `m1`, meaning that the set of varying parameters of `m0` is a subset of the varying parameters of `m1`. The property `chisqr` of the `ModelFit` objects is the sum of the square of the residuals of the fit. `ndata` is the number of data points. `nvarys` is the number of varying parameters."
4149
]
4250
},
4351
{
@@ -66,6 +74,25 @@
6674
" return stats.chisqprob(D, ddf), D, ddf"
6775
]
6876
},
77+
{
78+
"cell_type": "markdown",
79+
"metadata": {},
80+
"source": [
81+
"## Test on a simple model\n",
82+
"\n",
83+
"We test the function on data generated from the equation:\n",
84+
"\n",
85+
"$$\n",
86+
"y = b + e^{-a t} + N(0, \\sigma^2)\n",
87+
"$$\n",
88+
"\n",
89+
"where $a$ and $b$ are the parameters, $t$ is the variable, and $N$ is the normal distribution.\n",
90+
"\n",
91+
"We fit a nesting model `model_fit1` (the alternative hypothesis of the test). This model estimates both $a$ and $b$.\n",
92+
"We also fit a nested model `model_fit0` (the nul hypothesis of the test) in which either $a$ or $b$ is fixed at an initial value.\n",
93+
"We than plot both model fits, print the estimated parameters, the test statistic and the p-value of the test."
94+
]
95+
},
6996
{
7097
"cell_type": "code",
7198
"execution_count": 4,
@@ -5134,6 +5161,16 @@
51345161
"source": [
51355162
"## Compare with R\n",
51365163
"\n",
5164+
"Here we compare the Python implementation with the R implementation - `lmtest::lrtest`. The data is generated from a quadratic polynomial:\n",
5165+
"\n",
5166+
"$$\n",
5167+
"y = at^2 + bt + c + N(0,\\sigma^2)\n",
5168+
"$$\n",
5169+
"\n",
5170+
"and we fit two models - the nested model is a linear function and the nesting model is a quadratic polynomial.\n",
5171+
"\n",
5172+
"### Note\n",
5173+
"\n",
51375174
"In order to get this to work you will need to make sure the `lmtest` package is installed (don't install via RStudio - use regular command line R) and that the %R_HOME% and %R_USER% are set to where R is running from the command line and where the packages are installed to when running from the command line."
51385175
]
51395176
},
@@ -5232,6 +5269,17 @@
52325269
" %Rpull pvalue\n",
52335270
" print pvalue[0]"
52345271
]
5272+
},
5273+
{
5274+
"cell_type": "markdown",
5275+
"metadata": {},
5276+
"source": [
5277+
"## License\n",
5278+
"\n",
5279+
"This notebook was written by [Yoav Ram](http://www.yoavram.com) and [Uri Obolski](https://sites.google.com/site/hadanylab/people/uri-obolski). The latest version can be found at [ipython.yoavram.com](http://nbviewer.ipython.org/github/yoavram/ipython-notebooks/blob/master/likelihood%20ratio%20test.ipynb). \n",
5280+
"\n",
5281+
"The code is released with a CC-BY-SA 3.0 license."
5282+
]
52355283
}
52365284
],
52375285
"metadata": {

0 commit comments

Comments
 (0)