first commit to chapter 7; just skeleton

CamDavidsonPilon · CamDavidsonPilon · commit 54580fb003e1 · 2013-06-03T22:15:02.000-04:00
diff --git a/Chapter7_BayesianMachineLearning/DontOverfit.ipynb b/Chapter7_BayesianMachineLearning/DontOverfit.ipynb
@@ -0,0 +1,44 @@
+{
+ "metadata": {
+  "name": "DontOverfit"
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": [
+  {
+   "cells": [
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## Implementation of Salisman's Don't Overfit submission"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "From [Kaggle](http://www.kaggle.com/c/overfitting):\n",
+      ">In order to achieve this we have created a simulated data set with 200 variables and 20,000 cases. An \u2018equation\u2019 based on this data was created in order to generate a Target to be predicted. Given the all 20,000 cases, the problem is very easy to solve \u2013 but you only get given the Target value of 250 cases \u2013 the task is to build a model that gives the best predictions on the remaining 19,750 cases."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
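+      "import gzip\n",
+      "import requests\n",
+      "url = \"\"  # placeholder: the dataset URL has not been filled in yet\n",
+      "data = requests.get(url)  # response body is not yet written to disk\n",
+      "f = gzip.open('file.txt.gz', 'rb')  # assumes file.txt.gz already exists locally\n",
+      "file_content = f.read()\n"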
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    }
+   ],
+   "metadata": {}
+  }
+ ]
+}
diff --git a/Chapter7_BayesianMachineLearning/MachineLearning.ipynb b/Chapter7_BayesianMachineLearning/MachineLearning.ipynb
@@ -0,0 +1,159 @@
+{
+ "metadata": {
+  "name": "MachineLearning"
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": [
+  {
+   "cells": [
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "List of topics to cover:\n",
+      "\n",
+      "- Bayesian solution to overfitting\n",
+      "  - Salisman's solution to the Don't Overfit competition\n",
+      "- Predictive distributions; \"how do I evaluate testing data?\"\n",
+      "- Model fitting, BIC, and visualization tools\n",
+      "- Gaussian Processes\n",
+      "\n",
+      "\n",
+      "Would be nice/cool to cover:\n",
+      "\n",
+      "- Classification models (using the book's text)\n",
+      "- Bayesian networks?"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "from IPython.core.display import HTML\n",
+      "def css_styling():\n",
+      "    styles = open(\"../styles/custom.css\", \"r\").read()\n",
+      "    return HTML(styles)\n",
+      "css_styling()"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "html": [
+        "<style>\n",
+        "    @font-face {\n",
+        "        font-family: \"Computer Modern\";\n",
+        "        src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');\n",
+        "    }\n",
+        "    div.cell{\n",
+        "        width:800px;\n",
+        "        margin-left:auto;\n",
+        "        margin-right:auto;\n",
+        "    }\n",
+        "    h1 {\n",
+        "        font-family: Helvetica, serif;\n",
+        "    }\n",
+        "    h4{\n",
+        "        margin-top:12px;\n",
+        "        margin-bottom: 3px;\n",
+        "       }\n",
+        "    div.text_cell_render{\n",
+        "        font-family: Computer Modern, \"Helvetica Neue\", Arial, Helvetica, Geneva, sans-serif;\n",
+        "        line-height: 145%;\n",
+        "        font-size: 130%;\n",
+        "        width:800px;\n",
+        "        margin-left:auto;\n",
+        "        margin-right:auto;\n",
+        "    }\n",
+        "    .CodeMirror{\n",
+        "            font-family: \"Source Code Pro\", source-code-pro,Consolas, monospace;\n",
+        "    }\n",
+        "    .prompt{\n",
+        "        display: None;\n",
+        "    }\n",
+        "    .text_cell_render h5 {\n",
+        "        font-weight: 300;\n",
+        "        font-size: 16pt;\n",
+        "        color: #4057A1;\n",
+        "        font-style: italic;\n",
+        "        margin-bottom: .5em;\n",
+        "        margin-top: 0.5em;\n",
+        "        display: block;\n",
+        "    }\n",
+        "    \n",
+        "    .warning{\n",
+        "        color: rgb( 240, 20, 20 )\n",
+        "        }\n",
+        "       \n",
+        "</style>\n",
+        "<script>\n",
+        "    MathJax.Hub.Config({\n",
+        "                        TeX: {\n",
+        "                           extensions: [\"AMSmath.js\"]\n",
+        "                           },\n",
+        "                tex2jax: {\n",
+        "                    inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
+        "                    displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
+        "                },\n",
+        "                displayAlign: 'center', // Change this to 'center' to center equations.\n",
+        "                \"HTML-CSS\": {\n",
+        "                    styles: {'.MathJax_Display': {\"margin\": 4}}\n",
+        "                }\n",
+        "        });\n",
+        "</script>"
+       ],
+       "output_type": "pyout",
+       "prompt_number": 1,
+       "text": [
+        "<IPython.core.display.HTML at 0x5beaeb8>"
+       ]
+      }
+     ],
+     "prompt_number": 1
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    }
+   ],
+   "metadata": {}
+  }
+ ]
+}
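
Note on the download cell in DontOverfit.ipynb: the skeleton fetches `data` with `requests` but then reads a separate local `file.txt.gz`, so the two steps are not yet connected. One possible way to join them is to decompress the downloaded bytes in memory. The sketch below is an assumption about where the cell is heading, not the author's method; the helper name `read_gzip_bytes` and the sample payload are hypothetical, and the real dataset URL is still unknown.

```python
import gzip
import io


def read_gzip_bytes(raw_bytes):
    """Decompress gzip data held in memory and return the raw payload."""
    # Wrap the bytes in a file-like object so GzipFile can read from it.
    with gzip.GzipFile(fileobj=io.BytesIO(raw_bytes)) as handle:
        return handle.read()


# Offline stand-in for a downloaded response body (hypothetical sample data).
sample_body = gzip.compress(b"var_1,var_2\n0.1,0.2\n")
file_content = read_gzip_bytes(sample_body)
```

Once the URL is filled in, the notebook cell could call `read_gzip_bytes(requests.get(url).content)` and skip the intermediate file entirely.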