import pymc as pm
import numpy as np
import arviz as az
import pandas as pd

6. Predicting Using Censored Data*#

Icelandic volcano eruptions#

Adapted from Unit 10: katla.odc.

Data can be found here.

In April 2010, the volcano Eyjafjallajökull in Iceland erupted. The resulting ash cloud was blown towards Western Europe and caused severe disruption to air travel for the following few weeks. A report into the eruption and its impact (Day et al. [2010]) reviewed how well the risk had been managed. One question was whether potentially more devastating eruptions from the larger neighbouring volcano Katla can be predicted from a recent eruption of Eyjafjallajökull. The report provides the dates of all 18 eruptions of Katla since the year 1177, with a corresponding indicator of whether Eyjafjallajökull had also erupted within the previous year.

Lunn et al. [2012] p. 254

Thanks to William Naramore, this example has been updated to work with PyMC v5!

# fmt: off
D = np.array(
    (1177, 1262, 1311, 1357, 1416, 1440, 1450, 1500, 
     1550, 1580, 1612, 1625, 1660, 1721, 1755, 1823, 
     1860, 1918, np.inf) # 
)
# fmt: on

# probabilities
ps = [1, 5, 10, 50]

# time between eruptions
t = np.diff(D)
t[t > 100] = 100
with pm.Model() as m:
    α = pm.TruncatedNormal("α", mu=0, sigma=5, lower=0)  # v in BUGS model

    σ = pm.Gamma("σ", 0.1, 0.1)
    λ = 1 / σ**α
    β = λ ** (-1 / α)

    _t = pm.Weibull.dist(α, β)
    pm.Censored("likelihood", _t, lower=None, upper=100, observed=t)

    median = pm.Deterministic("median tte", σ * np.log(2) ** (1 / α))

    for p in ps:
        pm.Deterministic(
            f"p_erupt_{p}", 1 - pm.math.exp((100 / σ) ** α - ((100 + p) / σ) ** α)
        )

    trace = pm.sample(3000, init="jitter+adapt_diag_grad", target_accept=.95)
Hide code cell output
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag_grad...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [α, σ]
100.00% [16000/16000 00:01<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 3_000 draw iterations (4_000 + 12_000 draws total) took 1 seconds.
az.summary(trace)
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
α 1.937 0.380 1.208 2.627 0.005 0.004 4801.0 5240.0 1.0
σ 50.277 6.448 38.132 62.464 0.086 0.061 5584.0 5754.0 1.0
median tte 41.406 6.064 29.797 52.494 0.082 0.058 5416.0 5553.0 1.0
p_erupt_1 0.075 0.031 0.026 0.133 0.000 0.000 5439.0 5645.0 1.0
p_erupt_5 0.322 0.111 0.130 0.527 0.001 0.001 5423.0 5693.0 1.0
p_erupt_10 0.535 0.147 0.268 0.807 0.002 0.001 5408.0 5657.0 1.0
p_erupt_50 0.966 0.053 0.868 1.000 0.001 0.001 5281.0 5581.0 1.0
%load_ext watermark
%watermark -n -u -v -iv -p pytensor
Last updated: Mon Nov 27 2023

Python implementation: CPython
Python version       : 3.11.5
IPython version      : 8.15.0

pytensor: 2.17.1

pandas: 2.1.0
arviz : 0.16.1
pymc  : 5.9.2
numpy : 1.25.2