{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "89d60ca3",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "import arviz as az\n",
    "import numpy as np\n",
    "import pymc as pm\n",
    "from pymc.math import switch, ge"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1bb211f0",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Hypothesis Testing*\n",
    "\n",
    "## Psoriasis: Two-sample problem with paired data\n",
    "\n",
    "This is our first example of hypothesis testing.\n",
    "\n",
    "Adapted from [Unit 6: psoriasis.odc](https://raw.githubusercontent.com/areding/6420-pymc/main/original_examples/Codes4Unit6/psoriasis.odc)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "423b0d14",
   "metadata": {},
   "source": [
    "Woo and McKenna (2003) investigated the effect of broadband ultraviolet B (UVB) therapy and topical calcipotriol cream used together on areas of psoriasis. One of the outcome variables is the Psoriasis Area and Severity Index (PASI), where a lower score is better.\n",
    "\n",
    "The PASI scores for 20 subjects are measured at baseline and after 8 treatments.  Do these data provide sufficient evidence to indicate that the combination therapy reduces PASI scores?\n",
    "\n",
    "Classical Analysis:\n",
    "```\n",
    "   d = baseline - after;\n",
    "   n = length(d);\n",
    "   dbar = mean(d);  dbar = 6.3550\n",
    "   sdd = sqrt(var(d));  sdd = 4.9309\n",
    "   tstat = dbar/(sdd/sqrt(n));  tstat = 5.7637\n",
    "\n",
    "   Reject H_0 at the level alpha=0.05 since the p_value = 0.00000744 < 0.05\n",
    "\n",
    "   95% CI is [4.0472, 8.6628]\n",
    "   ```\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d25b6da3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# fmt: off\n",
    "baseline = np.array((5.9, 7.6, 12.8, 16.5, 6.1, 14.4, 6.6, 5.4, 9.6, 11.6, \n",
    "                     11.1, 15.6, 9.6, 15.2, 21.0, 5.9, 10.0, 12.2, 20.2, \n",
    "                     6.2))\n",
    "\n",
    "after = np.array((5.2, 12.2, 4.6, 4.0, 0.4, 3.8, 1.2, 3.1, 3.5, 4.9, 11.1,\n",
    "                  8.4, 5.8, 5, 6.4, 0.0, 2.7, 5.1, 4.8, 4.2))\n",
    "# fmt: on"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "d9255f51",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Auto-assigning NUTS sampler...\n",
      "Initializing NUTS using jitter+adapt_diag...\n",
      "Multiprocess sampling (4 chains in 4 jobs)\n",
      "NUTS: [mu, prec]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<style>\n",
       "    /* Turns off some styling */\n",
       "    progress {\n",
       "        /* gets rid of default border in Firefox and Opera. */\n",
       "        border: none;\n",
       "        /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
       "        background-size: auto;\n",
       "    }\n",
       "    progress:not([value]), progress:not([value])::-webkit-progress-bar {\n",
       "        background: repeating-linear-gradient(45deg, #7e7e7e, #7e7e7e 10px, #5c5c5c 10px, #5c5c5c 20px);\n",
       "    }\n",
       "    .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
       "        background: #F44336;\n",
       "    }\n",
       "</style>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "      <progress value='24000' class='' max='24000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      100.00% [24000/24000 00:01&lt;00:00 Sampling 4 chains, 0 divergences]\n",
       "    </div>\n",
       "    "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Sampling 4 chains for 1_000 tune and 5_000 draw iterations (4_000 + 20_000 draws total) took 1 seconds.\n"
     ]
    }
   ],
   "source": [
    "with pm.Model() as m:\n",
    "    # priors\n",
    "    mu = pm.Normal(\"mu\", mu=0, sigma=316)\n",
    "    prec = pm.Gamma(\"prec\", alpha=0.001, beta=0.001)\n",
    "    sigma = pm.Deterministic(\"sigma\", 1 / pm.math.sqrt(prec))\n",
    "\n",
    "    ph1 = pm.Deterministic(\"ph1\", switch(mu >= 0, 1, 0))\n",
    "\n",
    "    diff = pm.Normal(\"diff\", mu=mu, sigma=sigma, observed=baseline - after)\n",
    "\n",
    "    # start sampling\n",
    "    trace = pm.sample(5000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "f2fc5b7b",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/aaron/mambaforge/envs/pymc/lib/python3.11/site-packages/arviz/stats/diagnostics.py:592: RuntimeWarning: invalid value encountered in scalar divide\n",
      "  (between_chain_variance / within_chain_variance + num_samples - 1) / (num_samples)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>sd</th>\n",
       "      <th>hdi_2.5%</th>\n",
       "      <th>hdi_97.5%</th>\n",
       "      <th>mcse_mean</th>\n",
       "      <th>mcse_sd</th>\n",
       "      <th>ess_bulk</th>\n",
       "      <th>ess_tail</th>\n",
       "      <th>r_hat</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>mu</th>\n",
       "      <td>6.355</td>\n",
       "      <td>1.167</td>\n",
       "      <td>4.149</td>\n",
       "      <td>8.766</td>\n",
       "      <td>0.009</td>\n",
       "      <td>0.007</td>\n",
       "      <td>16065.0</td>\n",
       "      <td>12098.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>sigma</th>\n",
       "      <td>5.125</td>\n",
       "      <td>0.883</td>\n",
       "      <td>3.568</td>\n",
       "      <td>6.887</td>\n",
       "      <td>0.007</td>\n",
       "      <td>0.005</td>\n",
       "      <td>15790.0</td>\n",
       "      <td>13476.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ph1</th>\n",
       "      <td>1.000</td>\n",
       "      <td>0.000</td>\n",
       "      <td>1.000</td>\n",
       "      <td>1.000</td>\n",
       "      <td>0.000</td>\n",
       "      <td>0.000</td>\n",
       "      <td>20000.0</td>\n",
       "      <td>20000.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        mean     sd  hdi_2.5%  hdi_97.5%  mcse_mean  mcse_sd  ess_bulk  \\\n",
       "mu     6.355  1.167     4.149      8.766      0.009    0.007   16065.0   \n",
       "sigma  5.125  0.883     3.568      6.887      0.007    0.005   15790.0   \n",
       "ph1    1.000  0.000     1.000      1.000      0.000    0.000   20000.0   \n",
       "\n",
       "       ess_tail  r_hat  \n",
       "mu      12098.0    1.0  \n",
       "sigma   13476.0    1.0  \n",
       "ph1     20000.0    NaN  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "az.summary(trace, hdi_prob=0.95, var_names=[\"~prec\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "492a8e1d",
   "metadata": {},
   "source": [
    "Our model agrees with the BUGS results:\n",
    "\n",
    "|          | mean     | sd      | MC_error | val2.5pc | median | val97.5pc | start | sample |\n",
    "|----------|----------|---------|----------|----------|--------|-----------|-------|--------|\n",
    "| pH1      | 1        | 0       | 3.16E-13 | 1        | 1      | 1         | 1001  | 100000 |\n",
    "| mu       | 6.352    | 1.169   | 0.003657 | 4.043    | 6.351  | 8.666     | 1001  | 100000 |\n",
    "| sigma    | 5.142    | 0.8912  | 0.003126 | 3.74     | 5.026  | 7.203     | 1001  | 100000 |\n",
    "| pval     | 4.20E-04 | 0.02049 | 6.04E-05 | 0        | 0      | 0         | 1001  | 100000 |"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24d78eb5-222e-4f97-b329-c01dc77d2e11",
   "metadata": {},
   "source": [
    "## Equivalence of generic and brand-name drugs\n",
    "\n",
    "Adapted from [Unit 6: equivalence.odc](https://raw.githubusercontent.com/areding/6420-pymc/main/original_examples/Codes4Unit6/equivalence.odc)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54b32700-ca79-47f7-b7b7-afe6544a95df",
   "metadata": {},
   "source": [
    "The manufacturer wishes to demonstrate that their generic drug for a particular metabolic disorder is equivalent to a brand name drug. One of indication of the disorder is an abnormally low concentration of levocarnitine, an amino acid derivative, in the plasma. The treatment with the brand name drug substantially increases this concentration.\n",
    "\n",
    "A small clinical trial is conducted with 43 patients, 18 in the Brand Name Drug arm and 25 in the Generic Drug arm. The increases in the log-concentration of levocarnitine are in the data below.\n",
    "\n",
    "The FDA declares that bioequivalence among the two drugs can be established if the difference in response to the two drugs is within 2 units of log-concentration. Assuming that the log-concentration measurements follow normal distributions with equal population variance, can these two drugs be declared bioequivalent within a tolerance +/-2  units?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "163792d5-2ff5-48da-bf3e-2606f3f61f06",
   "metadata": {},
   "source": [
    "---\n",
    "The way the data is set up in the .odc file is strange. It seems simpler to just have a separate list for each increase type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "6e464e00-182a-4dc4-b831-8e6d257e2668",
   "metadata": {},
   "outputs": [],
   "source": [
    "# fmt: off\n",
    "increase_type1 = [7, 8, 4, 6, 10, 10, 5, 7, 9, 8, 6, 7, 8, 4, 6, 10, 8, 9]\n",
    "increase_type2 = [6, 7, 5, 9, 5, 5, 3, 7, 5, 10, 8, 5, 8, 4, 4, 8, 6, 11, \n",
    "                  7, 5, 5, 5, 7, 4, 6]\n",
    "# fmt: on"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "366d59bd-1b5d-45e3-93a0-39c47f02e2cd",
   "metadata": {},
   "source": [
    "We're using ```pm.math.switch``` and ```pm.math.eq``` to recreate the BUGS ```step()``` function for the ```probint``` variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "bf6e2deb-9710-494f-bd4d-83e3f45c369d",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Auto-assigning NUTS sampler...\n",
      "Initializing NUTS using jitter+adapt_diag...\n",
      "Multiprocess sampling (4 chains in 4 jobs)\n",
      "NUTS: [mu1, mu2, prec]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<style>\n",
       "    /* Turns off some styling */\n",
       "    progress {\n",
       "        /* gets rid of default border in Firefox and Opera. */\n",
       "        border: none;\n",
       "        /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
       "        background-size: auto;\n",
       "    }\n",
       "    progress:not([value]), progress:not([value])::-webkit-progress-bar {\n",
       "        background: repeating-linear-gradient(45deg, #7e7e7e, #7e7e7e 10px, #5c5c5c 10px, #5c5c5c 20px);\n",
       "    }\n",
       "    .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
       "        background: #F44336;\n",
       "    }\n",
       "</style>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "      <progress value='24000' class='' max='24000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      100.00% [24000/24000 00:01&lt;00:00 Sampling 4 chains, 0 divergences]\n",
       "    </div>\n",
       "    "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Sampling 4 chains for 1_000 tune and 5_000 draw iterations (4_000 + 20_000 draws total) took 1 seconds.\n"
     ]
    }
   ],
   "source": [
    "with pm.Model() as m:\n",
    "    # priors\n",
    "    mu1 = pm.Normal(\"mu1\", mu=10, sigma=316)\n",
    "    mu2 = pm.Normal(\"mu2\", mu=10, sigma=316)\n",
    "    mudiff = pm.Deterministic(\"mudiff\", mu1 - mu2)\n",
    "    prec = pm.Gamma(\"prec\", alpha=0.001, beta=0.001)\n",
    "    sigma = 1 / pm.math.sqrt(prec)\n",
    "\n",
    "    probint = pm.Deterministic(\n",
    "        \"probint\",\n",
    "        switch(ge(mudiff + 2, 0), 1, 0) * switch(ge(2 - mudiff, 0), 1, 0),\n",
    "    )\n",
    "\n",
    "    y_type1 = pm.Normal(\"y_type1\", mu=mu1, sigma=sigma, observed=increase_type1)\n",
    "    y_type2 = pm.Normal(\"y_type2\", mu=mu2, sigma=sigma, observed=increase_type2)\n",
    "\n",
    "    # start sampling\n",
    "    trace = pm.sample(5000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "7977090b-a37f-4b59-9ca0-5b2651028850",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>sd</th>\n",
       "      <th>hdi_2.5%</th>\n",
       "      <th>hdi_97.5%</th>\n",
       "      <th>mcse_mean</th>\n",
       "      <th>mcse_sd</th>\n",
       "      <th>ess_bulk</th>\n",
       "      <th>ess_tail</th>\n",
       "      <th>r_hat</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>mu1</th>\n",
       "      <td>7.334</td>\n",
       "      <td>0.470</td>\n",
       "      <td>6.414</td>\n",
       "      <td>8.269</td>\n",
       "      <td>0.003</td>\n",
       "      <td>0.002</td>\n",
       "      <td>27558.0</td>\n",
       "      <td>15811.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mu2</th>\n",
       "      <td>6.198</td>\n",
       "      <td>0.397</td>\n",
       "      <td>5.438</td>\n",
       "      <td>6.984</td>\n",
       "      <td>0.002</td>\n",
       "      <td>0.002</td>\n",
       "      <td>26341.0</td>\n",
       "      <td>14856.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>prec</th>\n",
       "      <td>0.264</td>\n",
       "      <td>0.058</td>\n",
       "      <td>0.159</td>\n",
       "      <td>0.381</td>\n",
       "      <td>0.000</td>\n",
       "      <td>0.000</td>\n",
       "      <td>24987.0</td>\n",
       "      <td>13816.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mudiff</th>\n",
       "      <td>1.136</td>\n",
       "      <td>0.613</td>\n",
       "      <td>-0.032</td>\n",
       "      <td>2.382</td>\n",
       "      <td>0.004</td>\n",
       "      <td>0.003</td>\n",
       "      <td>28981.0</td>\n",
       "      <td>15171.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>probint</th>\n",
       "      <td>0.921</td>\n",
       "      <td>0.270</td>\n",
       "      <td>0.000</td>\n",
       "      <td>1.000</td>\n",
       "      <td>0.002</td>\n",
       "      <td>0.002</td>\n",
       "      <td>16160.0</td>\n",
       "      <td>16160.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          mean     sd  hdi_2.5%  hdi_97.5%  mcse_mean  mcse_sd  ess_bulk  \\\n",
       "mu1      7.334  0.470     6.414      8.269      0.003    0.002   27558.0   \n",
       "mu2      6.198  0.397     5.438      6.984      0.002    0.002   26341.0   \n",
       "prec     0.264  0.058     0.159      0.381      0.000    0.000   24987.0   \n",
       "mudiff   1.136  0.613    -0.032      2.382      0.004    0.003   28981.0   \n",
       "probint  0.921  0.270     0.000      1.000      0.002    0.002   16160.0   \n",
       "\n",
       "         ess_tail  r_hat  \n",
       "mu1       15811.0    1.0  \n",
       "mu2       14856.0    1.0  \n",
       "prec      13816.0    1.0  \n",
       "mudiff    15171.0    1.0  \n",
       "probint   16160.0    1.0  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "az.summary(trace, hdi_prob=0.95)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8b42da1d-cb93-416c-9bb1-1adad6cd1a06",
   "metadata": {},
   "source": [
    "BUGS results:\n",
    "\n",
    "|          | mean   | sd      | MC_error | val2.5pc | median | val97.5pc | start | sample |\n",
    "|----------|--------|---------|----------|----------|--------|-----------|-------|--------|\n",
    "| mu[1]    | 7.332  | 0.473   | 0.001469 | 6.399    | 7.332  | 8.264     | 1001  | 100000 |\n",
    "| mu[2]    | 6.198  | 0.4006  | 0.001213 | 5.406    | 6.199  | 6.985     | 1001  | 100000 |\n",
    "| mudiff   | 1.133  | 0.618   | 0.00196  | -0.07884 | 1.134  | 2.354     | 1001  | 100000 |\n",
    "| prec     | 0.2626 | 0.05792 | 1.90E-04 | 0.1617   | 0.2584 | 0.3877    | 1001  | 100000 |\n",
    "| probint  | 0.9209 | 0.2699  | 9.07E-04 | 0        | 1      | 1         | 1001  | 100000 |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "8aa3cde4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Last updated: Mon Oct 30 2023\n",
      "\n",
      "Python implementation: CPython\n",
      "Python version       : 3.11.5\n",
      "IPython version      : 8.15.0\n",
      "\n",
      "pytensor: 2.17.1\n",
      "\n",
      "pymc : 5.9.0\n",
      "numpy: 1.25.2\n",
      "arviz: 0.16.1\n",
      "\n"
     ]
    }
   ],
   "source": [
    "%load_ext watermark\n",
    "%watermark -n -u -v -iv -p pytensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1ef6ea1f",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}