{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "81c27040",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pymc as pm\n",
    "import numpy as np\n",
    "import arviz as az\n",
    "from pymc.math import switch, ge, exp\n",
    "\n",
    "%load_ext lab_black"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37138bca",
   "metadata": {},
   "source": [
    "# Revisiting UK Coal Mining Disasters*\n",
    "\n",
    "Adapted from [Unit 10: disasters.odc](https://raw.githubusercontent.com/areding/6420-pymc/main/original_examples/Codes4Unit10/disasters.odc).\n",
    "\n",
    "Data can be found [here](https://raw.githubusercontent.com/areding/6420-pymc/main/data/r.txt)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e79c8243",
   "metadata": {},
   "source": [
    "Change Point Analysis, discussed previously in [this Unit 5](../unit5/Unit5-Gibbs3.ipynb) example about Gibbs sampling.\n",
    "\n",
    "The 112 data points represent the numbers of coal-mining disasters involving 10 or more men killed per year between  1851 and 1962. \n",
    " \n",
    "Based on the observation that the there was a significant decrease around 1900, it is suitable to apply a change-point model to divide the whole dataset into two periods; each period with its own distribution of number of disasters.\n",
    " \n",
    "The data set was compiled by Maguire, Pearson and Wynn in 1952 and \n",
    "updated by Jarrett (1978). This data have been used by a number of authors to illustrate various techniques that can be applied to point processes\n",
    "\n",
    "\n",
    " Maguire, B. A., Pearson, E. S. and Wynn, A. H. A. (1952). The time intervals between industrial accidents.   Biometrika, 39, 168\u0013†180.\n",
    "\n",
    " Jarrett, R.G. (1979). A note on the intervals between coal-mining disasters. Biometrika, 66, 191-193. \n",
    "\n",
    " Carlin, Gelfand, and Smith (1992) Heirarchical Bayesian Analysis of Changepoint Problems. Applied Statistics, 41, 389-405.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "45e6f391",
   "metadata": {},
   "outputs": [],
   "source": [
    "# X is the number of coal mine disasters per year\n",
    "# fmt: off\n",
    "X = np.array([4, 5, 4, 1, 0, 4, 3, 4, 0, 6, 3, 3, 4, 0, 2, 6, 3, 3, 5, 4, 5, 3, 1,\n",
    "     4, 4, 1, 5, 5, 3, 4, 2, 5, 2, 2, 3, 4, 2, 1, 3, 2, 2, 1, 1, 1, 1, 3,\n",
    "     0, 0, 1, 0, 1, 1, 0, 0, 3, 1, 0, 3, 2, 2, 0, 1, 1, 1, 0, 1, 0, 1, 0,\n",
    "     0, 0, 2, 1, 0, 0, 0, 1, 1, 0, 2, 3, 3, 1, 1, 2, 1, 1, 1, 1, 2, 4, 2,\n",
    "     0, 0, 0, 1, 4, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1])\n",
    "# fmt: on\n",
    "\n",
    "y = np.array([y for y in range(1851, 1963)])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "527ddd88",
   "metadata": {},
   "source": [
    "## Model 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "482d95b6",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Auto-assigning NUTS sampler...\n",
      "Initializing NUTS using jitter+adapt_diag...\n",
      "Multiprocess sampling (4 chains in 4 jobs)\n",
      "NUTS: [year, λ, μ]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<style>\n",
       "    /* Turns off some styling */\n",
       "    progress {\n",
       "        /* gets rid of default border in Firefox and Opera. */\n",
       "        border: none;\n",
       "        /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
       "        background-size: auto;\n",
       "    }\n",
       "    progress:not([value]), progress:not([value])::-webkit-progress-bar {\n",
       "        background: repeating-linear-gradient(45deg, #7e7e7e, #7e7e7e 10px, #5c5c5c 10px, #5c5c5c 20px);\n",
       "    }\n",
       "    .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
       "        background: #F44336;\n",
       "    }\n",
       "</style>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "      <progress value='12000' class='' max='12000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      100.00% [12000/12000 03:26&lt;00:00 Sampling 4 chains, 0 divergences]\n",
       "    </div>\n",
       "    "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Sampling 4 chains for 1_000 tune and 2_000 draw iterations (4_000 + 8_000 draws total) took 206 seconds.\n",
      "Chain 0 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n",
      "Chain 1 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n",
      "Chain 2 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n",
      "Chain 3 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n"
     ]
    }
   ],
   "source": [
    "α = 4\n",
    "β = 1\n",
    "γ = 0.5\n",
    "δ = 1\n",
    "\n",
    "with pm.Model() as m:\n",
    "    year = pm.Uniform(\"year\", 1851, 1963)\n",
    "    λ = pm.Gamma(\"λ\", α, β)\n",
    "    μ = pm.Gamma(\"μ\", γ, δ)\n",
    "\n",
    "    diff = pm.Deterministic(\"diff\", μ - λ)\n",
    "\n",
    "    rate = λ + switch(ge(y - year, 0), 1, 0) * diff\n",
    "    pm.Poisson(\"lik\", mu=rate, observed=X)\n",
    "\n",
    "    trace = pm.sample(2000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "9bd529a2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>sd</th>\n",
       "      <th>hdi_3%</th>\n",
       "      <th>hdi_97%</th>\n",
       "      <th>mcse_mean</th>\n",
       "      <th>mcse_sd</th>\n",
       "      <th>ess_bulk</th>\n",
       "      <th>ess_tail</th>\n",
       "      <th>r_hat</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>year</th>\n",
       "      <td>1890.405</td>\n",
       "      <td>2.405</td>\n",
       "      <td>1886.000</td>\n",
       "      <td>1894.461</td>\n",
       "      <td>0.108</td>\n",
       "      <td>0.076</td>\n",
       "      <td>537.0</td>\n",
       "      <td>500.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>λ</th>\n",
       "      <td>3.153</td>\n",
       "      <td>0.284</td>\n",
       "      <td>2.655</td>\n",
       "      <td>3.698</td>\n",
       "      <td>0.011</td>\n",
       "      <td>0.008</td>\n",
       "      <td>658.0</td>\n",
       "      <td>1025.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>μ</th>\n",
       "      <td>0.916</td>\n",
       "      <td>0.115</td>\n",
       "      <td>0.698</td>\n",
       "      <td>1.137</td>\n",
       "      <td>0.004</td>\n",
       "      <td>0.003</td>\n",
       "      <td>645.0</td>\n",
       "      <td>826.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>diff</th>\n",
       "      <td>-2.237</td>\n",
       "      <td>0.301</td>\n",
       "      <td>-2.805</td>\n",
       "      <td>-1.681</td>\n",
       "      <td>0.012</td>\n",
       "      <td>0.008</td>\n",
       "      <td>670.0</td>\n",
       "      <td>1185.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          mean     sd    hdi_3%   hdi_97%  mcse_mean  mcse_sd  ess_bulk  \\\n",
       "year  1890.405  2.405  1886.000  1894.461      0.108    0.076     537.0   \n",
       "λ        3.153  0.284     2.655     3.698      0.011    0.008     658.0   \n",
       "μ        0.916  0.115     0.698     1.137      0.004    0.003     645.0   \n",
       "diff    -2.237  0.301    -2.805    -1.681      0.012    0.008     670.0   \n",
       "\n",
       "      ess_tail  r_hat  \n",
       "year     500.0    1.0  \n",
       "λ       1025.0    1.0  \n",
       "μ        826.0    1.0  \n",
       "diff    1185.0    1.0  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "az.summary(trace)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "168fe0b2",
   "metadata": {},
   "source": [
    "## Model 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "00853354",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Auto-assigning NUTS sampler...\n",
      "Initializing NUTS using jitter+adapt_diag...\n",
      "Multiprocess sampling (4 chains in 4 jobs)\n",
      "NUTS: [year, z0, z1]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<style>\n",
       "    /* Turns off some styling */\n",
       "    progress {\n",
       "        /* gets rid of default border in Firefox and Opera. */\n",
       "        border: none;\n",
       "        /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
       "        background-size: auto;\n",
       "    }\n",
       "    progress:not([value]), progress:not([value])::-webkit-progress-bar {\n",
       "        background: repeating-linear-gradient(45deg, #7e7e7e, #7e7e7e 10px, #5c5c5c 10px, #5c5c5c 20px);\n",
       "    }\n",
       "    .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
       "        background: #F44336;\n",
       "    }\n",
       "</style>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "      <progress value='12000' class='' max='12000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      100.00% [12000/12000 05:01&lt;00:00 Sampling 4 chains, 0 divergences]\n",
       "    </div>\n",
       "    "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Sampling 4 chains for 1_000 tune and 2_000 draw iterations (4_000 + 8_000 draws total) took 302 seconds.\n",
      "The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details\n",
      "The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details\n",
      "Chain 0 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n",
      "Chain 1 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n",
      "Chain 2 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n",
      "Chain 3 reached the maximum tree depth. Increase `max_treedepth`, increase `target_accept` or reparameterize.\n"
     ]
    }
   ],
   "source": [
    "with pm.Model() as m:\n",
    "    year = pm.Uniform(\"year\", 1851, 1963)\n",
    "    z0 = pm.Normal(\"z0\", 0, tau=0.00001)\n",
    "    z1 = pm.Normal(\"z1\", 0, tau=0.00001)\n",
    "\n",
    "    λ = pm.Deterministic(\"λ\", exp(z0))\n",
    "    μ = pm.Deterministic(\"μ\", exp(z0 + z1))\n",
    "\n",
    "    diff = pm.Deterministic(\"diff\", μ - λ)\n",
    "\n",
    "    rate = pm.math.exp(z0 + switch(ge(y - year, 0), 1, 0) * z1)\n",
    "    pm.Poisson(\"lik\", mu=rate, observed=X)\n",
    "\n",
    "    trace = pm.sample(2000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "c81e6b10",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>sd</th>\n",
       "      <th>hdi_3%</th>\n",
       "      <th>hdi_97%</th>\n",
       "      <th>mcse_mean</th>\n",
       "      <th>mcse_sd</th>\n",
       "      <th>ess_bulk</th>\n",
       "      <th>ess_tail</th>\n",
       "      <th>r_hat</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>z0</th>\n",
       "      <td>1.143</td>\n",
       "      <td>0.089</td>\n",
       "      <td>0.971</td>\n",
       "      <td>1.306</td>\n",
       "      <td>0.005</td>\n",
       "      <td>0.003</td>\n",
       "      <td>378.0</td>\n",
       "      <td>748.0</td>\n",
       "      <td>1.01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>z1</th>\n",
       "      <td>-1.233</td>\n",
       "      <td>0.145</td>\n",
       "      <td>-1.519</td>\n",
       "      <td>-0.972</td>\n",
       "      <td>0.006</td>\n",
       "      <td>0.004</td>\n",
       "      <td>529.0</td>\n",
       "      <td>1039.0</td>\n",
       "      <td>1.01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>year</th>\n",
       "      <td>1890.258</td>\n",
       "      <td>2.311</td>\n",
       "      <td>1886.003</td>\n",
       "      <td>1893.954</td>\n",
       "      <td>0.117</td>\n",
       "      <td>0.083</td>\n",
       "      <td>409.0</td>\n",
       "      <td>478.0</td>\n",
       "      <td>1.01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>λ</th>\n",
       "      <td>3.147</td>\n",
       "      <td>0.279</td>\n",
       "      <td>2.620</td>\n",
       "      <td>3.662</td>\n",
       "      <td>0.014</td>\n",
       "      <td>0.010</td>\n",
       "      <td>378.0</td>\n",
       "      <td>748.0</td>\n",
       "      <td>1.01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>μ</th>\n",
       "      <td>0.921</td>\n",
       "      <td>0.116</td>\n",
       "      <td>0.703</td>\n",
       "      <td>1.141</td>\n",
       "      <td>0.004</td>\n",
       "      <td>0.003</td>\n",
       "      <td>946.0</td>\n",
       "      <td>1579.0</td>\n",
       "      <td>1.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>diff</th>\n",
       "      <td>-2.227</td>\n",
       "      <td>0.289</td>\n",
       "      <td>-2.766</td>\n",
       "      <td>-1.689</td>\n",
       "      <td>0.015</td>\n",
       "      <td>0.011</td>\n",
       "      <td>368.0</td>\n",
       "      <td>683.0</td>\n",
       "      <td>1.01</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          mean     sd    hdi_3%   hdi_97%  mcse_mean  mcse_sd  ess_bulk  \\\n",
       "z0       1.143  0.089     0.971     1.306      0.005    0.003     378.0   \n",
       "z1      -1.233  0.145    -1.519    -0.972      0.006    0.004     529.0   \n",
       "year  1890.258  2.311  1886.003  1893.954      0.117    0.083     409.0   \n",
       "λ        3.147  0.279     2.620     3.662      0.014    0.010     378.0   \n",
       "μ        0.921  0.116     0.703     1.141      0.004    0.003     946.0   \n",
       "diff    -2.227  0.289    -2.766    -1.689      0.015    0.011     368.0   \n",
       "\n",
       "      ess_tail  r_hat  \n",
       "z0       748.0   1.01  \n",
       "z1      1039.0   1.01  \n",
       "year     478.0   1.01  \n",
       "λ        748.0   1.01  \n",
       "μ       1579.0   1.00  \n",
       "diff     683.0   1.01  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "az.summary(trace)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "7d36b9b1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Last updated: Wed Mar 22 2023\n",
      "\n",
      "Python implementation: CPython\n",
      "Python version       : 3.11.0\n",
      "IPython version      : 8.9.0\n",
      "\n",
      "pytensor: 2.10.1\n",
      "\n",
      "pymc : 5.1.2\n",
      "arviz: 0.15.1\n",
      "numpy: 1.24.2\n",
      "\n"
     ]
    }
   ],
   "source": [
    "%load_ext watermark\n",
    "%watermark -n -u -v -iv -p pytensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb38c890",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}