14. Change Point Problem#

In this lesson, we’re trying to infer the “change point” in the widely-used UK coal mining disaster dataset, originally published by Maguire et al. [1952] and updated by Jarrett [1979]. The dataset spans 1851 to 1962 and includes only incidents in which at least 10 people died. The idea is that at some point the number of disasters decreased significantly, so let’s figure out in which year that happened.

Model#

The model comes from Carlin et al. [1992]. This paper is really readable and useful for double-checking your understanding of the concepts we’ve seen so far. The model itself is a great example of how intuitive and flexible Bayesian hierarchical frameworks can be.

We assume two different Poisson processes governing the rates of coal mining disasters. Before the change point \(m\), we’ll call the Poisson likelihood’s rate \(\lambda\). After \(m\), the rate will be \(\mu\) for our second likelihood.

We’ll assume the prior on \(m\) is discrete uniform on \([1, n]\), where \(n=112\) is the number of years in the dataset. Priors for \(\lambda\) and \(\mu\) are gamma distributions with hyperparameters \( (\alpha, \beta) \) and \( (\gamma, \delta) \), respectively.

Summarized:

\[\begin{split} \begin{align*} x_i \mid \lambda &\sim \text{Pois}(\lambda) && \text{For the first $ m $ years.} \\ x_i \mid \mu &\sim \text{Pois}(\mu) && \text{For the remaining $ n-m $ years.} \\ m &\sim \text{Uniform}(1, n) && \\ \lambda &\sim \text{Gamma}(\alpha, \beta) && \\ \mu &\sim \text{Gamma}(\gamma, \delta) && \end{align*} \end{split}\]
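To make the generative story concrete, here’s a quick simulation from the model. Every numeric value below is a placeholder I made up for illustration, not anything estimated from the real data.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 112             # years 1851-1962
m = 40              # hypothetical change point (year index)
lam, mu = 3.0, 1.0  # hypothetical disaster rates before and after m

# Simulate n years of counts: rate lambda up to m, rate mu afterwards.
x = np.concatenate([
    rng.poisson(lam, size=m),
    rng.poisson(mu, size=n - m),
])
```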

Odds ratio#

This model will find some change point, since one is built into our assumptions. But how do we know whether that change point is actually meaningful? Carlin et al. [1992] say: “The question of whether or not a change has occurred is addressed through the posterior odds for no change.”

In our case, that’s \(P(m=n \mid X)/\left(1 - P(m=n \mid X)\right)\): the posterior probability that \(m = n = 112\), meaning the change point falls at the very end of the series and no change happened within the observed period, against the probability that it falls earlier.
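Once the Gibbs sampler is running, we can estimate this directly from the posterior draws of \(m\). A minimal sketch, where `m_samples` is a stand-in array rather than real sampler output:

```python
import numpy as np

n = 112

# Stand-in for posterior draws of m from the Gibbs sampler.
m_samples = np.random.default_rng(0).integers(35, 46, size=10_000)

# Posterior odds for "no change": P(m = n | X) against its complement.
p_no_change = np.mean(m_samples == n)
odds_no_change = p_no_change / (1 - p_no_change)
```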

Conditionals#

The posterior is proportional to the product of the likelihoods and the priors for \( \lambda \), \( \mu \), and \( m \).

\[\begin{split} \begin{align*} p(\lambda, \mu, m | X) &\propto \left[ \prod_{i=1}^{m} \text{Poisson}(x_i | \lambda) \right] \left[ \prod_{i=m+1}^{n} \text{Poisson}(x_i | \mu) \right]\cdot \text{Gamma}(\lambda | \alpha, \beta) \cdot \text{Gamma}(\mu | \gamma, \delta) \cdot \text{Uniform}(m | 1, n) \\ &\propto \left[ \lambda^{\sum_{i=1}^{m} x_i} e^{-m\lambda} \right] \left[ \mu^{\sum_{i=m+1}^{n} x_i} e^{-(n-m)\mu} \right] \lambda^{\alpha - 1} e^{-\beta \lambda} \mu^{\gamma - 1} e^{-\delta \mu} \\ &\propto \lambda^{\alpha + \sum_{i=1}^{m} x_i - 1} e^{-(\beta + m)\lambda} \mu^{\gamma + \sum_{i=m+1}^{n} x_i - 1} e^{-(\delta + n - m)\mu} \end{align*} \end{split}\]

So, the full conditional for \(\lambda\) is:

\[ \lambda | \mu, m, X \sim \text{Gamma} \left( \alpha + \sum_{i=1}^{m} x_i, \beta + m \right) \]

And for \(\mu\):

\[ \mu | \lambda, m, X \sim \text{Gamma} \left( \gamma + \sum_{i=m+1}^{n} x_i, \delta + (n - m) \right) \]
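Since both full conditionals are gammas, each Gibbs update for \(\lambda\) and \(\mu\) is a single draw. One detail worth flagging: our Gamma notation uses a rate parameter, while NumPy’s `gamma` takes a shape and a *scale*, so we pass the reciprocal of the rate. A sketch with placeholder data and hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholders: fake disaster counts and the current m in a Gibbs sweep.
x = rng.poisson(2.0, size=112)
n, m = len(x), 40
alpha, beta, gamma_, delta = 4.0, 1.0, 0.5, 1.0  # made-up hyperparameters

# lambda | mu, m, X ~ Gamma(alpha + sum_{i=1}^{m} x_i, beta + m)
lam = rng.gamma(alpha + x[:m].sum(), 1.0 / (beta + m))

# mu | lambda, m, X ~ Gamma(gamma + sum_{i=m+1}^{n} x_i, delta + n - m)
mu = rng.gamma(gamma_ + x[m:].sum(), 1.0 / (delta + n - m))
```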

A lot of the time I skip full derivations, but finding the full conditional for \(m\) is a little different from what we’ve seen in lectures so far, and I didn’t immediately understand it, so I worked out the steps below.

Let’s start with the joint distribution again, dropping the priors: the gamma priors don’t involve \(m\), and the uniform prior is constant in \(m\), so neither contributes.

\[\begin{split} \begin{align*} \pi(m|\mu, \lambda, X) &\propto \left[ \prod_{i=1}^{m} \text{Poisson}(x_i | \lambda) \right] \left[ \prod_{i=m+1}^{n} \text{Poisson}(x_i | \mu) \right] \\ &\propto \left( \prod_{i=1}^{m} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} \prod_{i=m+1}^{n} \frac{\mu^{x_i} e^{-\mu}}{x_i!} \right) \\ & \propto e^{-m\lambda} e^{-(n-m)\mu} \left( \prod_{i=1}^{m} \frac{\lambda^{x_i}}{x_i!} \right) \left( \prod_{i=m+1}^{n} \frac{\mu^{x_i}}{x_i!} \right) && \text{Group exponential terms.} \\ & \propto e^{m(\mu -\lambda)} e^{-\mu n} \left( \prod_{i=1}^{m} \frac{\lambda^{x_i}}{x_i!} \right) \left( \prod_{i=m+1}^{n} \frac{\mu^{x_i}}{x_i!} \right) && \text{Rearrange exponential terms.} \\ & \propto e^{m(\mu -\lambda)} e^{-\mu n} \left( \prod_{i=1}^{m} \frac{\left(\lambda \frac{\mu}{\mu} \right)^{x_i}}{x_i!} \right) \left( \prod_{i=m+1}^{n} \frac{\mu^{x_i}}{x_i!} \right) && \text{Introduce a factor of 1 as } \frac{\mu}{\mu} \text{.} \\ & \propto e^{m(\mu -\lambda)} e^{-\mu n} \left( \prod_{i=1}^{m} \frac{\left(\mu \frac{\lambda}{\mu} \right)^{x_i}}{x_i!} \right) \left( \prod_{i=m+1}^{n} \frac{\mu^{x_i}}{x_i!} \right) && \text{Rearrange terms.} \\ & \propto e^{m(\mu -\lambda)} e^{-\mu n} \left(\frac{\lambda}{\mu}\right)^{\sum_{i=1}^{m} x_i} \left( \prod_{i=1}^{m} \frac{\mu^{x_i}}{x_i!} \right) \left( \prod_{i=m+1}^{n} \frac{\mu^{x_i}}{x_i!} \right) && \text{Pull the } \frac{\lambda}{\mu} \text{ factors out of the product.} \\ & \propto e^{m(\mu -\lambda)} e^{-\mu n} \left(\frac{\lambda}{\mu}\right)^{\sum_{i=1}^{m} x_i} \left( \prod_{i=1}^{n} \frac{\mu^{x_i}}{x_i!} \right) && \text{Combine the products; together they run from $1$ to $n$.}\\ & \propto e^{m(\mu -\lambda)} \left(\frac{\lambda}{\mu}\right)^{\sum_{i=1}^{m} x_i} && \text{Drop factors that don't depend on $m$.} \end{align*} \end{split}\]

We know the full conditional PMF for \(m\) is proportional to \(\pi(m) = e^{m(\mu -\lambda)} \left(\frac{\lambda}{\mu}\right)^{\sum_{i=1}^{m} x_i}\). To normalize it, we divide by \(\pi\) summed over all possible values of \(m\).

\[P(m=k \mid \lambda, \mu, X) = \frac{\pi(k)}{\sum_{i=1}^{n} \pi(i)}, \quad k = 1, \dots, n\]
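Translated to code, we evaluate \(\log \pi(k)\) for every \(k\) and normalize with the usual max-subtraction trick, since \(e^{m(\mu-\lambda)}\) and the power of \(\lambda/\mu\) can overflow if computed directly. A sketch, again with placeholder values standing in for `x`, `lam`, and `mu`:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholders: fake counts and the current lambda, mu in a Gibbs sweep.
x = rng.poisson(2.0, size=112)
n = len(x)
lam, mu = 3.0, 1.0

# log pi(k) = k*(mu - lam) + log(lam/mu) * sum_{i=1}^{k} x_i, for k = 1..n
k = np.arange(1, n + 1)
log_pi = k * (mu - lam) + np.log(lam / mu) * np.cumsum(x)

# Normalize on the log scale, exponentiate, and draw m from the discrete PMF.
log_pi -= log_pi.max()
pmf = np.exp(log_pi)
pmf /= pmf.sum()
m = rng.choice(k, p=pmf)
```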

Now that we have all these full conditionals, we can code the Gibbs sampler in the next lesson.