import pymc as pm
import numpy as np
import pandas as pd
import arviz as az

10. SSVS: Hald*

Adapted from Haldssvs.odc.

SSVS (Stochastic Search Variable Selection) can be viewed as attaching a binary indicator variable to each covariate, then randomly switching each one on and off as you sample.
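
A minimal NumPy sketch of the idea (illustrative only, separate from the model below): multiplying each coefficient by a Bernoulli inclusion indicator zeroes it out whenever the indicator is off, so every posterior draw corresponds to one subset of predictors.

rng = np.random.default_rng(0)

alpha = np.array([2.0, -1.5, 0.3, 4.0])   # candidate coefficients (the "slab")
delta = rng.binomial(n=1, p=0.5, size=4)  # inclusion indicators, one per covariate
beta = delta * alpha                      # excluded covariates get an exact zero
print(delta, beta)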

data = pd.read_csv("../data/hald_data.csv")
y = data["y"].to_numpy()
X = data.drop("y", axis=1).to_numpy()

# standardize the predictors (center and scale each column)
X_centered = (X - X.mean(axis=0)) / X.std(axis=0)
_, p = X.shape
draws = 10000
chains = 4

with pm.Model() as m_ssvs:
    # SSVS prior: delta switches each coefficient in (1) or out (0)
    delta = pm.Bernoulli("delta", p=0.5, shape=p)
    # "slab" prior on the coefficients when included
    alpha = pm.Normal("alpha", 0, tau=0.1, shape=p)
    # effective coefficient: exactly zero whenever delta is 0
    beta = pm.Deterministic("beta", delta * alpha)

    tau = pm.Gamma("tau", 0.1, 0.1)
    intercept = pm.Normal("intercept", 0, tau=0.001)

    mu = intercept + pm.math.dot(X_centered, beta)
    pm.Normal("likelihood", mu, tau=tau, observed=y)

    trace = pm.sample(draws=draws, chains=chains, target_accept=0.95)
Multiprocess sampling (4 chains in 4 jobs)
CompoundStep
>BinaryGibbsMetropolis: [delta]
>NUTS: [alpha, tau, intercept]
100.00% [44000/44000 00:18<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 10_000 draw iterations (4_000 + 40_000 draws total) took 19 seconds.
az.summary(trace, var_names="delta", kind="stats")
           mean     sd  hdi_3%  hdi_97%
delta[0]  1.000  0.021     1.0      1.0
delta[1]  0.931  0.254     0.0      1.0
delta[2]  0.430  0.495     0.0      1.0
delta[3]  0.941  0.235     1.0      1.0

The delta means, which give the proportion of samples in which each covariate was included, match the BUGS results. But what about how often each model, that is, each combination of covariates, was selected? I really don't want to code the quadruple for-loop from the BUGS example (the model variable) or try to vectorize it.

Let’s look in the trace. We can view the delta variable across all chains and samples.

trace.posterior.delta
<xarray.DataArray 'delta' (chain: 4, draw: 10000, delta_dim_0: 4)>
array([[[1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        ...,
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [1, 1, 1, 0]],

       [[1, 1, 0, 1],
        [1, 1, 0, 1],
        [1, 1, 0, 1],
        ...,
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 0, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        ...,
        [1, 1, 0, 1],
        [1, 1, 0, 1],
        [1, 1, 0, 1]],

       [[1, 1, 0, 1],
        [1, 1, 0, 1],
        [1, 1, 0, 1],
        ...,
        [1, 1, 0, 1],
        [1, 1, 0, 1],
        [1, 1, 0, 1]]])
Coordinates:
  * chain        (chain) int64 0 1 2 3
  * draw         (draw) int64 0 1 2 3 4 5 6 ... 9994 9995 9996 9997 9998 9999
  * delta_dim_0  (delta_dim_0) int64 0 1 2 3

We just need a way to count each unique combination of predictors. I didn't see an obvious way to do that in xarray, so I'm copying the delta values to a NumPy array, reshaping so that every draw from every chain becomes a row, then using np.unique to get the distinct indicator vectors of length p along with their counts. Dividing each count by the total number of draws gives the probability of each model being selected.

rows = chains * draws

deltas = trace.posterior.delta.to_numpy()
models, counts = np.unique(
    deltas.reshape((rows, p)), axis=0, return_counts=True
)

for model, count in zip(models, counts):
    print(f"{model}: prob={count/rows:.3}")
[0 0 1 1]: prob=0.000175
[0 1 1 1]: prob=0.00025
[1 0 0 1]: prob=0.0112
[1 0 1 1]: prob=0.0578
[1 1 0 0]: prob=0.0394
[1 1 0 1]: prob=0.519
[1 1 1 0]: prob=0.0192
[1 1 1 1]: prob=0.353

The most-selected models match the BUGS results: intercept + x0 + x1 + x3 is chosen over 50% of the time, intercept + x0 + x1 + x2 + x3 about a third of the time, and so on.
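
As an aside, pandas can produce the same model tally without the reshape-and-np.unique dance; a quick sketch using the deltas array from above (DataFrame.value_counts groups on all columns, so each index entry is one model):

model_probs = (
    pd.DataFrame(deltas.reshape((rows, p)))
    .value_counts(normalize=True)
    .sort_index()
)
print(model_probs)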

%load_ext watermark
%watermark -n -u -v -iv -p pytensor
Last updated: Wed Mar 22 2023

Python implementation: CPython
Python version       : 3.11.0
IPython version      : 8.9.0

pytensor: 2.10.1

pymc  : 5.1.2
numpy : 1.24.2
pandas: 1.5.3
arviz : 0.14.0