{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "9883c1ad-7ebc-4e48-86ca-582dea0584ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "import arviz as az\n",
    "import numpy as np\n",
    "import pymc as pm\n",
    "\n",
    "%load_ext lab_black"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8819521-48a6-44a7-bb2e-56e5e87943f4",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Counts of Alpha Particles*\n",
    "\n",
    "From the Unit 4.3 [supplementary exercises](https://www2.isye.gatech.edu/isye6420/Bank/Exercises43.pdf). \n",
    "\n",
    "```{warning}\n",
    "This page has the answer to a supplementary exercise! If you want to try the problem without seeing the answer, don't read any further!\n",
    "```\n",
    "\n",
    "Rutherford and Geiger (Rutherford, Geiger and Bateman, Phil. Mag. 20, 698, 19) provided counts of α-particles in their experiment. The counts, given in the table below, can be well–modeled by the Poisson distribution:\n",
    "\n",
    "|Particle ct.|0|1|2|3|4|5|6|7|8|9|10|11|$\\ge$12|\n",
    "|-|-|-|-|-|-|-|-|-|-|-|-|-|-|\n",
    "|Interval ct.|57|203|383|525|532|408|273|139|45|27|10|4|2|\n",
    "|Theoretical|54|210|407|525|508|394|254|140|68|29|11|4|6|\n",
    "1. Find sample size n and sample mean $\\bar{X}$. In calculations for $\\bar{X}$ take ≥ 12 as 12.\n",
    "\n",
    "The sample size is the sum of the number of intervals here. Then the sample mean is the total particle count divided by sample size. I tracked down the original source of the data, [which you can find on page 701 here](https://archive.org/details/londonedinburg6201910lond/page/700/mode/2up). The X data here is the particle count, while y is the number of 7.5-second intervals where that count was observed. The professor's example cuts off at $ \\ge$ 12 particles, but the original data went to 14 particles.\n",
    "\n",
    "$$\\bar{X} = 10094/2608 = 3.8704$$\n",
    "\n",
    "2. Elicit a gamma prior for λ with rate parameter β = 10 and shape parameter α selected in such a way that the prior mean is 7.\n",
    "\n",
    "The Gamma distribution mean is $\\frac{\\alpha}{\\beta}$. So we need a $\\alpha$ of 70 for the mean to equal 7.\n",
    "\n",
    "3. Find the Bayes estimator of λ using the prior from (1). Is the problem conjugate? Use the fact that $\\sum_{i=1}^n X_i \\sim Poi(n\\lambda)$.\n",
    "\n",
    "\n",
    "The Bayes estimator we're looking for is the mean of the posterior distribution. If our prior is $Ga(70, 10)$ and our likelihood is $Pois(n\\lambda)$, the posterior is proportional to:\n",
    "\n",
    "$$\\lambda^y e^{-n\\lambda} \\lambda^{\\alpha -1}e^{-\\beta \\lambda}$$\n",
    "\n",
    "$$\\lambda^{y + \\alpha -1} e^{-(n + \\beta)\\lambda}$$\n",
    "\n",
    "\n",
    "which is $Gamma(y + \\alpha, n + \\beta)$. So the Bayes estimator is:\n",
    "\n",
    "$$\\frac{y + \\alpha}{n + \\beta} = \\frac{10094 + 70}{2608 + 10} = 3.8824$$\n",
    "\n",
    "4. Write a WinBUGS script that simulates the Bayes estimator for λ and compare its output with the analytic solution from (3).\n",
    "\n",
    "Pymc version below.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "0065e274-6ccf-453f-bca2-36b3e3bf7a1a",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Auto-assigning NUTS sampler...\n",
      "Initializing NUTS using jitter+adapt_diag...\n",
      "Multiprocess sampling (4 chains in 4 jobs)\n",
      "NUTS: [λ]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<style>\n",
       "    /* Turns off some styling */\n",
       "    progress {\n",
       "        /* gets rid of default border in Firefox and Opera. */\n",
       "        border: none;\n",
       "        /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
       "        background-size: auto;\n",
       "    }\n",
       "    progress:not([value]), progress:not([value])::-webkit-progress-bar {\n",
       "        background: repeating-linear-gradient(45deg, #7e7e7e, #7e7e7e 10px, #5c5c5c 10px, #5c5c5c 20px);\n",
       "    }\n",
       "    .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
       "        background: #F44336;\n",
       "    }\n",
       "</style>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "      <progress value='12000' class='' max='12000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      100.00% [12000/12000 00:00&lt;00:00 Sampling 4 chains, 0 divergences]\n",
       "    </div>\n",
       "    "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Sampling 4 chains for 1_000 tune and 2_000 draw iterations (4_000 + 8_000 draws total) took 1 seconds.\n"
     ]
    }
   ],
   "source": [
    "freq = np.array([57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 2])\n",
    "X = np.arange(13)\n",
    "n = np.sum(freq)\n",
    "sum_x = np.sum(X * freq)\n",
    "\n",
    "α = 70\n",
    "β = 10\n",
    "\n",
    "with pm.Model() as m:\n",
    "    λ = pm.Gamma(\"λ\", alpha=α, beta=β)\n",
    "    nlambda = n * λ\n",
    "    pm.Poisson(\"y\", nlambda, observed=sum_x)\n",
    "\n",
    "    trace = pm.sample(2000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "1674b69c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>sd</th>\n",
       "      <th>hdi_3%</th>\n",
       "      <th>hdi_97%</th>\n",
       "      <th>mcse_mean</th>\n",
       "      <th>mcse_sd</th>\n",
       "      <th>ess_bulk</th>\n",
       "      <th>ess_tail</th>\n",
       "      <th>r_hat</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>λ</th>\n",
       "      <td>3.881</td>\n",
       "      <td>0.038</td>\n",
       "      <td>3.812</td>\n",
       "      <td>3.957</td>\n",
       "      <td>0.001</td>\n",
       "      <td>0.0</td>\n",
       "      <td>3413.0</td>\n",
       "      <td>4795.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  \\\n",
       "λ  3.881  0.038   3.812    3.957      0.001      0.0    3413.0    4795.0   \n",
       "\n",
       "   r_hat  \n",
       "λ    1.0  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "az.summary(trace)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "26c98c6a-218f-4058-a090-184ca075d948",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Last updated: Wed Mar 22 2023\n",
      "\n",
      "Python implementation: CPython\n",
      "Python version       : 3.11.0\n",
      "IPython version      : 8.9.0\n",
      "\n",
      "pytensor: 2.10.1\n",
      "\n",
      "arviz: 0.15.1\n",
      "numpy: 1.24.2\n",
      "pymc : 5.1.2\n",
      "\n"
     ]
    }
   ],
   "source": [
    "%load_ext watermark\n",
    "%watermark -n -u -v -iv -p pytensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e49700f-e04e-4662-954f-b6aaa482c9fa",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}