import pymc as pm
import numpy as np
import arviz as az
import pandas as pd

6. Predicting Using Censored Data*#

Icelandic volcano eruptions#

Adapted from Unit 10: katla.odc.

Data can be found here.

In April 2010, the volcano Eyjafjallajökull in Iceland erupted. The resulting ash cloud was blown towards Western Europe and caused severe disruption to air travel for the following few weeks. A report into the eruption and its impact (Day et al. [2010]) reviewed how well the risk had been managed. One question was whether potentially more devastating eruptions from the larger neighbouring volcano Katla can be predicted from a recent eruption of Eyjafjallajökull. The report provides the dates of all 18 eruptions of Katla since the year 1177, with a corresponding indicator of whether Eyjafjallajökull had also erupted within the previous year.

Lunn et al. [2012] p. 254

# fmt: off
D = np.array(
    (1177, 1262, 1311, 1357, 1416, 1440, 1450, 1500, 
     1550, 1580, 1612, 1625, 1660, 1721, 1755, 1823, 
     1860, 1918, np.inf) # 
)
# fmt: on

# probabilities
ps = [1, 5, 10, 50]

# time between eruptions
t = np.diff(D)
t[t > 100] = 100
with pm.Model() as m:
    α = pm.TruncatedNormal("α", mu=0, sigma=5, lower=0)  # v in BUGS model

    σ = pm.Gamma("σ", 0.1, 0.1)
    λ = 1 / σ**α
    β = λ ** (-1 / α)

    _t = pm.Weibull.dist(α, β)
    pm.Censored("likelihood", _t, lower=None, upper=100, observed=t)

    median = pm.Deterministic("median tte", σ * np.log(2) ** (1 / α))

    for p in ps:
        pm.Deterministic(
            f"p_erupt_{p}", 1 - pm.math.exp((100 / σ) ** α - ((100 + p) / σ) ** α)
        )

    trace = pm.sample(3000, init="jitter+adapt_diag_grad", target_accept=0.95)
Hide code cell output
Initializing NUTS using jitter+adapt_diag_grad...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [α, σ]

Sampling 4 chains for 1_000 tune and 3_000 draw iterations (4_000 + 12_000 draws total) took 1 seconds.
az.summary(trace)
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
α 1.930 0.387 1.217 2.659 0.005 0.004 5405.0 5983.0 1.0
σ 50.213 6.729 36.975 62.083 0.095 0.067 4955.0 4686.0 1.0
median tte 41.318 6.316 29.582 53.020 0.091 0.064 4823.0 4579.0 1.0
p_erupt_1 0.075 0.031 0.024 0.133 0.000 0.000 6487.0 6132.0 1.0
p_erupt_5 0.320 0.111 0.135 0.539 0.001 0.001 6468.0 6115.0 1.0
p_erupt_10 0.533 0.148 0.253 0.799 0.002 0.001 6443.0 6136.0 1.0
p_erupt_50 0.965 0.056 0.861 1.000 0.001 0.001 6296.0 6208.0 1.0

Authors#

Aaron Reding and Jason Naramore

%load_ext watermark
%watermark -n -u -v -iv -p pytensor
Last updated: Mon Nov 27 2023

Python implementation: CPython
Python version       : 3.11.5
IPython version      : 8.15.0

pytensor: 2.17.1

pandas: 2.1.0
arviz : 0.16.1
pymc  : 5.9.2
numpy : 1.25.2