import pymc as pm
import numpy as np
import arviz as az
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

%load_ext lab_black

4. Hald*

Adapted from Unit 9: Hald.odc.

A dataset on Portland cement, originally due to Woods, Steinour and Starke (1932), which has since been widely analysed, is now referred to as the Hald data; cf., e.g., Hald (1952, pp. 635–652).

These data come from an experimental investigation of the heat evolved during the setting and hardening of Portland cements of varied composition and the dependence of this heat on the percentages of four compounds in the clinkers from which the cement was produced.

As observed by Woods, Steinour and Starke (1932, p. 1207):

This property is of interest in the construction of massive works such as dams, in which the great thicknesses severely hinder the outflow of the heat. The consequent rise in temperature while the cement is hardening may result in contractions and cracking when the eventual cooling to the surrounding temperature takes place.

The four compounds considered by Woods, Steinour and Starke (1932) are tricalcium aluminate: 3CaO-Al2O3, tricalcium silicate: 3CaO-SiO2, tetracalcium aluminoferrite: 4CaO-Al2O3-Fe2O3, and beta-dicalcium silicate: 2CaO-SiO2, which we will denote by x1, x2, x3, and x4, respectively. The heat evolved after 180 days of curing, which we will denote by y, is measured in calories per gram of cement.

REFS: Hald, Anders (1952). Statistical Theory with Engineering Applications. Wiley, New York.

Woods, H., Steinour, H. H., and Starke, H. R. (1932). Effect of composition of Portland cement on heat evolved during hardening. Industrial and Engineering Chemistry, 24, 1207–1214.

data = pd.read_csv("../data/hald_data.csv")
y = data["y"].to_numpy()  # heat evolved after 180 days (calories per gram), 13 observations
X = data.drop("y", axis=1).to_numpy()  # x1..x4: percentages of the four compounds
# replicate y into a (13 obs) x (7 models) matrix so each candidate model
# gets its own column of the same observed responses;
# equivalently: np.tile(y[:, None], (1, 7))
Y = y.repeat(7).reshape(13, 7)

This was a pain to type out. Would be nice to find a cleaner method.
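One cleaner option, sketched here but never actually run (model_cols, m_alt, and the beta variables are names invented for this sketch): list the X columns each candidate model uses, then build the mean functions in a loop instead of writing out the seven regression equations by hand as in the model cell further down.

# hypothetical alternative construction - not the cell that was actually sampled
model_cols = [
    (0, 1, 3),     # model 1: x1, x2, x4
    (0, 1, 2),     # model 2: x1, x2, x3
    (0, 2, 3),     # model 3: x1, x3, x4
    (0, 1, 2, 3),  # model 4: x1, x2, x3, x4
    (0, 1),        # model 5: x1, x2
    (0, 3),        # model 6: x1, x4
    (1, 2, 3),     # model 7: x2, x3, x4
]

with pm.Model() as m_alt:
    tau = pm.Gamma("tau", 12.5, 62.5, shape=7)
    _mu = []
    for j, cols in enumerate(model_cols):
        # one intercept plus one slope per included column
        beta = pm.Normal(f"beta{j}", 0, tau=0.00001, shape=len(cols) + 1)
        _mu.append(beta[0] + sum(beta[k + 1] * X[:, c] for k, c in enumerate(cols)))
    mu = pm.math.stack(_mu)
    pm.Normal("likelihood", mu=mu.T, tau=tau, observed=Y)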

Model 1 shows how to use the Laud–Ibrahim criterion (Laud and Ibrahim, 1995) to choose between models. The L criterion compares the original target variable with the posterior predictive distribution. In the BUGS example, Y.new contains the results of the posterior predictive check (PPC); as the variable name indicates, it's just a set of new Y values generated from the posterior. L measures the distance between the original y values and the PPC: the lower L is, the better.
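For reference, here is the criterion as I understand it from Laud and Ibrahim (1995); this form is my own transcription, not taken from the BUGS code:

$$L^2 = \sum_{i=1}^{n}\left(y_i - \mathrm{E}\left[y_i^{\mathrm{new}} \mid y\right]\right)^2 + \sum_{i=1}^{n}\mathrm{Var}\left(y_i^{\mathrm{new}} \mid y\right)$$

The first term rewards predictive accuracy; the second penalizes predictive uncertainty.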

Note

The BUGS example has 8 models, but I only have 7 here. That’s because models 6 and 8 were identical, so I just dropped model 8.

with pm.Model() as m:
    # vague Normal priors on the coefficients of the 7 candidate models
    a = pm.Normal("a", 0, tau=0.00001, shape=4)
    b = pm.Normal("b", 0, tau=0.00001, shape=4)
    c = pm.Normal("c", 0, tau=0.00001, shape=4)
    d = pm.Normal("d", 0, tau=0.00001, shape=5)
    e = pm.Normal("e", 0, tau=0.00001, shape=3)
    f = pm.Normal("f", 0, tau=0.00001, shape=3)
    g = pm.Normal("g", 0, tau=0.00001, shape=4)
    # one precision parameter per candidate model
    tau = pm.Gamma("tau", 12.5, 62.5, shape=7)
    # fmt: off
    _mu = [
        a[0] + a[1] * X[:, 0] + a[2] * X[:, 1] + a[3] * X[:, 3],  # i013
        b[0] + b[1] * X[:, 0] + b[2] * X[:, 1] + b[3] * X[:, 2],  # i012
        c[0] + c[1] * X[:, 0] + c[2] * X[:, 2] + c[3] * X[:, 3],  # i023
        d[0] + d[1] * X[:, 0] + d[2] * X[:, 1] + d[3] * X[:, 2] + d[4] * X[:, 3],  # i0123
        e[0] + e[1] * X[:, 0] + e[2] * X[:, 1],  # i01
        f[0] + f[1] * X[:, 0] + f[2] * X[:, 3],  # i03
        g[0] + g[1] * X[:, 1] + g[2] * X[:, 2] + g[3] * X[:, 3],  # i123
    ]
    # fmt: on
    # stack to (7, 13), then transpose to match Y's (13, 7) layout
    mu = pm.math.stack(_mu)
    pm.Normal("likelihood", mu=mu.T, tau=tau, observed=Y)

    trace = pm.sample(5000, target_accept=0.95)

    pm.sample_posterior_predictive(trace, extend_inferencedata=True)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [a, b, c, d, e, f, g, tau]
100.00% [24000/24000 22:13<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 5_000 draw iterations (4_000 + 20_000 draws total) took 1333 seconds.
Sampling: [likelihood]
100.00% [20000/20000 00:01<00:00]
# posterior predictive means, reshaped back to (13 observations) x (7 models)
Y_new = az.summary(trace.posterior_predictive)["mean"].values.reshape(13, 7)
D2 = (Y - Y_new) ** 2
# per-model criterion: summed squared errors plus a variance term
L = np.sqrt(np.sum(D2, axis=0) + np.std(Y_new, axis=0) ** 2)
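The variance term above uses the spread of the 13 posterior predictive means within each model. If the Laud–Ibrahim variance term is instead read as the sum of per-observation posterior predictive variances (consistent with the formula earlier, and possibly part of the discrepancy with BUGS noted below), it can be computed straight from the predictive draws. ppc, ppc_mean, ppc_var, and L_alt are names made up for this sketch:

# posterior predictive draws, chains/draws stacked into a "sample" dimension
ppc = az.extract(trace, group="posterior_predictive")["likelihood"]
ppc_mean = ppc.mean("sample").values  # (13, 7) predictive means
ppc_var = ppc.var("sample").values    # (13, 7) predictive variances
# squared-error term plus summed predictive variances, per model
L_alt = np.sqrt(((Y - ppc_mean) ** 2).sum(axis=0) + ppc_var.sum(axis=0))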
# grey out everything except the lowest (best) L; loop over "val" rather than
# "m" to avoid shadowing the model object defined above
colors = ["grey" if val > L.min() else "green" for val in L]
bp = sns.barplot(x=list(range(1, 8)), y=L, palette=colors)
bp.set(ylim=(15, 17), xlabel="model", ylabel="L")
bp.bar_label(bp.containers[0])
plt.show()
Output: bar chart of L for models 1–7 (lower is better), with the best model's bar highlighted in green.

The results are not identical to BUGS, so I may need to come back and double-check this. I also haven’t calculated the model-by-model comparisons.

%load_ext watermark
%watermark -n -u -v -iv -p pytensor
Last updated: Wed Mar 22 2023

Python implementation: CPython
Python version       : 3.11.0
IPython version      : 8.9.0

pytensor: 2.10.1

arviz     : 0.15.1
aesara    : 2.8.10
matplotlib: 3.6.3
seaborn   : 0.12.2
pandas    : 1.5.3
numpy     : 1.24.2
pymc      : 5.1.2