{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "9883c1ad-7ebc-4e48-86ca-582dea0584ad", "metadata": {}, "outputs": [], "source": [ "import arviz as az\n", "import numpy as np\n", "import pymc as pm\n", "\n", "%load_ext lab_black" ] }, { "cell_type": "markdown", "id": "c8819521-48a6-44a7-bb2e-56e5e87943f4", "metadata": { "tags": [] }, "source": [ "# Counts of Alpha Particles*\n", "\n", "From the Unit 4.3 [supplementary exercises](https://www2.isye.gatech.edu/isye6420/Bank/Exercises43.pdf). \n", "\n", "```{warning}\n", "This page has the answer to a supplementary exercise! If you want to try the problem without seeing the answer, don't read any further!\n", "```\n", "\n", "Rutherford and Geiger (Rutherford, Geiger and Bateman, Phil. Mag. 20, 698, 19) provided counts of α-particles in their experiment. The counts, given in the table below, can be well–modeled by the Poisson distribution:\n", "\n", "|Particle ct.|0|1|2|3|4|5|6|7|8|9|10|11|$\\ge$12|\n", "|-|-|-|-|-|-|-|-|-|-|-|-|-|-|\n", "|Interval ct.|57|203|383|525|532|408|273|139|45|27|10|4|2|\n", "|Theoretical|54|210|407|525|508|394|254|140|68|29|11|4|6|\n", "1. Find sample size n and sample mean $\\bar{X}$. In calculations for $\\bar{X}$ take ≥ 12 as 12.\n", "\n", "The sample size is the sum of the number of intervals here. Then the sample mean is the total particle count divided by sample size. I tracked down the original source of the data, [which you can find on page 701 here](https://archive.org/details/londonedinburg6201910lond/page/700/mode/2up). The X data here is the particle count, while y is the number of 7.5-second intervals where that count was observed. The professor's example cuts off at $ \\ge$ 12 particles, but the original data went to 14 particles.\n", "\n", "$$\\bar{X} = 10094/2608 = 3.8704$$\n", "\n", "2. Elicit a gamma prior for λ with rate parameter β = 10 and shape parameter α selected in such a way that the prior mean is 7.\n", "\n", "The Gamma distribution mean is $\\frac{\\alpha}{\\beta}$. So we need a $\\alpha$ of 70 for the mean to equal 7.\n", "\n", "3. Find the Bayes estimator of λ using the prior from (1). Is the problem conjugate? Use the fact that $\\sum_{i=1}^n X_i \\sim Poi(n\\lambda)$.\n", "\n", "\n", "The Bayes estimator we're looking for is the mean of the posterior distribution. If our prior is $Ga(70, 10)$ and our likelihood is $Pois(n\\lambda)$, the posterior is proportional to:\n", "\n", "$$\\lambda^y e^{-n\\lambda} \\lambda^{\\alpha -1}e^{-\\beta \\lambda}$$\n", "\n", "$$\\lambda^{y + \\alpha -1} e^{-(n + \\beta)\\lambda}$$\n", "\n", "\n", "which is $Gamma(y + \\alpha, n + \\beta)$. So the Bayes estimator is:\n", "\n", "$$\\frac{y + \\alpha}{n + \\beta} = \\frac{10094 + 70}{2608 + 10} = 3.8824$$\n", "\n", "4. Write a WinBUGS script that simulates the Bayes estimator for λ and compare its output with the analytic solution from (3).\n", "\n", "Pymc version below.\n", "\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "0065e274-6ccf-453f-bca2-36b3e3bf7a1a", "metadata": { "tags": [ "hide-output" ] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Auto-assigning NUTS sampler...\n", "Initializing NUTS using jitter+adapt_diag...\n", "Multiprocess sampling (4 chains in 4 jobs)\n", "NUTS: [λ]\n" ] }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", " \n", " 100.00% [12000/12000 00:00<00:00 Sampling 4 chains, 0 divergences]\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "Sampling 4 chains for 1_000 tune and 2_000 draw iterations (4_000 + 8_000 draws total) took 1 seconds.\n" ] } ], "source": [ "freq = np.array([57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 2])\n", "X = np.arange(13)\n", "n = np.sum(freq)\n", "sum_x = np.sum(X * freq)\n", "\n", "α = 70\n", "β = 10\n", "\n", "with pm.Model() as m:\n", " λ = pm.Gamma(\"λ\", alpha=α, beta=β)\n", " nlambda = n * λ\n", " pm.Poisson(\"y\", nlambda, observed=sum_x)\n", "\n", " trace = pm.sample(2000)" ] }, { "cell_type": "code", "execution_count": 3, "id": "1674b69c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
meansdhdi_3%hdi_97%mcse_meanmcse_sdess_bulkess_tailr_hat
λ3.8810.0383.8123.9570.0010.03413.04795.01.0
\n", "
" ], "text/plain": [ " mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail \\\n", "λ 3.881 0.038 3.812 3.957 0.001 0.0 3413.0 4795.0 \n", "\n", " r_hat \n", "λ 1.0 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "az.summary(trace)" ] }, { "cell_type": "code", "execution_count": 4, "id": "26c98c6a-218f-4058-a090-184ca075d948", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Last updated: Wed Mar 22 2023\n", "\n", "Python implementation: CPython\n", "Python version : 3.11.0\n", "IPython version : 8.9.0\n", "\n", "pytensor: 2.10.1\n", "\n", "arviz: 0.15.1\n", "numpy: 1.24.2\n", "pymc : 5.1.2\n", "\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -n -u -v -iv -p pytensor" ] }, { "cell_type": "code", "execution_count": null, "id": "5e49700f-e04e-4662-954f-b6aaa482c9fa", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }