import arviz as az
import numpy as np
import pymc as pm
from pymc.math import exp

6. Time-to-event Models: Gastric Cancer*#

Adapted from code for Unit 8: gastric.odc.

Data can be found here.

Problem statement#

Stablein, Carter, and Novak [1981] provide data on 90 patients affected by locally advanced, nonresectable gastric carcinoma. The patients are randomized to two treatments: chemotherapy alone (coded as 0) and chemotherapy plus radiation (coded as 1). Survival time is reported in days. Recorded times are censored if the patient stopped participating in the study before it finished.

Data#

Columns are, from left to right:

  • type: Treatment type, chemotherapy (0) or chemotherapy + radiation (1)

  • censored: Censoring time in days. If the observation is censored (the patient was still event-free when observation ended), the time appears here rather than in the times column; 0 if not censored.

  • times: Recorded days without cancer recurrence. NaN if censored.

Model changes#

PyMC really did not like the noninformative exponential prior on ν (α in this model). For some reason, the equivalent Gamma distribution is more stable. I found that passing an initial value also helps avoid divergences here.

Method 1: pm.Censored#

The way PyMC censoring works is described in some detail in this notebook (Vincent [2023]). For right-censoring, use pm.Censored("name", dist, lower=None, upper=censored, observed=y). The upper bounds can be an array of the same shape as y.

If a y value equals its corresponding upper bound, pm.Censored evaluates the log of the complementary CDF (the survival function) at that bound rather than the log-density. If a y value is greater than its bound, the log-probability is -np.inf. Otherwise, the distribution passed to the dist parameter behaves as normal. So in the upper array, I set each entry to np.inf if the corresponding y value is not censored, and equal to the y value if it is censored.
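Here's a tiny sketch of that behavior (toy numbers, not part of the model below), using pm.logp to probe a right-censored Weibull:

dist = pm.Censored.dist(
    pm.Weibull.dist(alpha=1.5, beta=100.0), lower=None, upper=80.0
)
print(pm.logp(dist, 80.0).eval())  # log complementary CDF (survival) at the bound
print(pm.logp(dist, 50.0).eval())  # ordinary Weibull log-density at 50
print(pm.logp(dist, 90.0).eval())  # -inf: value exceeds the censoring bound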

Note

I’ve noticed that this method is unstable with some distributions. Try using the imputed censoring model (below) if this one isn’t working.

data = np.loadtxt("../data/gastric.txt")
data.shape
(90, 3)
x = data[:, 0].copy()
censored = data[:, 1].copy()
y = data[:, 2].copy()
# for pymc, right-censored values must be greater than or equal to the "upper" value
y[np.isnan(y)] = censored[np.isnan(y)]
censored[censored == 0] = np.inf
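
A quick sanity check on that bookkeeping: every uncensored entry of censored is now np.inf, and the remaining finite entries are the 16 censoring times.

# 74 uncensored (infinite bound), 16 right-censored (finite bound)
print(np.isinf(censored).sum(), np.isfinite(censored).sum())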

Warning

PyMC and BUGS do not parameterize the Weibull distribution the same way! The model below is meant to be a direct translation of the lecture model, so we convert the parameters:

\[ \alpha = \nu \]
\[ \beta = \lambda^{-1/\alpha} \]
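
Here's a quick numeric check of this conversion (illustrative values only), comparing the BUGS density against scipy's Weibull implementation:

from scipy import stats

v, lam, t = 1.2, 0.002, 300.0
bugs_pdf = v * lam * t ** (v - 1) * np.exp(-lam * t**v)  # BUGS Weibull(v, lam) density
pymc_pdf = stats.weibull_min.pdf(t, c=v, scale=lam ** (-1 / v))  # converted parameters
print(np.isclose(bugs_pdf, pymc_pdf))  # True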

Warning

We have noticed that defining the coefficient priors with the tau (precision) parameter is also unstable. To stay as close as possible to the lecture model I've kept tau here, but when using pm.Censored it's better to define the Normal priors on your coefficients and intercept with sigma (standard deviation).

This:

beta0 = pm.Normal("beta0", 0, sigma=10)
beta1 = pm.Normal("beta1", 0, sigma=5)

Not this:

beta0 = pm.Normal("beta0", 0, tau=0.01)
beta1 = pm.Normal("beta1", 0, tau=0.1)
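
For reference, tau is a precision, so the equivalent standard deviation is sigma = 1 / sqrt(tau); that's where the 10 and 3.2 come from:

print(1 / np.sqrt(0.01))  # 10.0
print(1 / np.sqrt(0.1))   # ~3.16, rounded to 3.2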
log2 = np.log(2)

with pm.Model() as m_censored:
    beta0 = pm.Normal("beta0", 0, tau=0.01)
    beta1 = pm.Normal("beta1", 0, tau=0.1)
    alpha = pm.Gamma("alpha", 1, 0.001)

    lamb = exp(beta0 + beta1 * x)
    beta = lamb ** (-1 / alpha)

    obs_latent = pm.Weibull.dist(alpha=alpha, beta=beta)
    likelihood = pm.Censored(
        "likelihood",
        obs_latent,
        lower=None,
        upper=censored,
        observed=y,
    )

    median0 = pm.Deterministic("median0", (log2 * exp(-beta0)) ** (1 / alpha))
    median1 = pm.Deterministic(
        "median1", (log2 * exp(-beta0 - beta1)) ** (1 / alpha)
    )

    S = pm.Deterministic("survival", exp(-lamb * (likelihood**alpha)))
    pdf = lamb * alpha * (likelihood ** (alpha - 1)) * S
    pm.Deterministic("hazard", pdf / S)

    trace = pm.sample(
        10000, tune=2000, initvals={"alpha": 0.25}, target_accept=0.9
    )
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta0, beta1, alpha]

Sampling 4 chains for 2_000 tune and 10_000 draw iterations (8_000 + 40_000 draws total) took 11 seconds.
az.summary(trace, var_names=["~S", "~f", "~h"], hdi_prob=0.9)
            mean      sd   hdi_5%  hdi_95%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
beta0     -6.764   0.669   -7.839   -5.638      0.006    0.004   12567.0   15165.0    1.0
beta1      0.261   0.232   -0.120    0.639      0.002    0.001   18180.0   17129.0    1.0
alpha      1.024   0.098    0.868    1.189      0.001    0.001   12742.0   15317.0    1.0
median0  523.222  88.206  381.286  664.081      0.587    0.485   22881.0   26115.0    1.0
median1  405.071  69.937  294.164  518.142      0.454    0.412   23896.0   25037.0    1.0
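
As a side note, the median0 and median1 expressions in the model come from solving for the time at which the Weibull survival function equals one half, with the regression entering through the BUGS-style rate:

\[ S(t) = \exp(-\lambda t^{\alpha}) = \frac{1}{2}, \quad \lambda = e^{\beta_0 + \beta_1 x} \quad \Rightarrow \quad t_{\text{median}} = \left(\frac{\log 2}{\lambda}\right)^{1/\alpha} = \left(\log 2 \cdot e^{-\beta_0 - \beta_1 x}\right)^{1/\alpha} \]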

This is the preferred way to implement the model in PyMC.

log2 = np.log(2)

with pm.Model() as m_censored:
    beta0 = pm.Normal("beta0", 0, sigma=10)
    beta1 = pm.Normal("beta1", 0, sigma=3.2)
    alpha = pm.Gamma("alpha", 1, 0.001)

    beta = 1 / exp(beta0 + beta1 * x)

    obs_latent = pm.Weibull.dist(alpha=alpha, beta=beta)
    likelihood = pm.Censored(
        "likelihood",
        obs_latent,
        lower=None,
        upper=censored,
        observed=y,
    )

    median0 = pm.Deterministic("median0", (log2 * exp(-beta0)) ** (1 / alpha))
    median1 = pm.Deterministic(
        "median1", (log2 * exp(-beta0 - beta1)) ** (1 / alpha)
    )

    S = pm.Deterministic("survival", exp(-((likelihood / beta) ** alpha)))
    pdf = (
        (alpha / beta)
        * ((likelihood / beta) ** (alpha - 1))
        * exp(-((likelihood / beta) ** alpha))
    )
    pm.Deterministic("hazard", pdf / S)

    trace = pm.sample(
        10000, tune=2000, initvals={"alpha": 0.25}, target_accept=0.9
    )

az.summary(trace, var_names=["~survival", "~hazard"], hdi_prob=0.9)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta0, beta1, alpha]

Sampling 4 chains for 2_000 tune and 10_000 draw iterations (8_000 + 40_000 draws total) took 4 seconds.
            mean       sd   hdi_5%   hdi_95%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
beta0     -6.604    0.164   -6.865    -6.327      0.001    0.001   22467.0   22594.0    1.0
beta1      0.255    0.233   -0.132     0.632      0.002    0.001   21877.0   22078.0    1.0
alpha      1.011    0.098    0.857     1.177      0.001    0.001   22416.0   23873.0    1.0
median0  632.289  559.645  121.467  1173.620      3.886   20.380   22038.0   24899.0    1.0
median1  475.234  383.121  106.540   862.676      2.585   18.081   23324.0   25162.0    1.0

Method 2: pm.Potential#

This method uses pm.Potential to achieve the same thing as above by handling the censored data points separately. It comes from this notebook (Junpeng Lao [2023]).
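
The mechanism is simple: pm.Potential adds an arbitrary term to the model's joint log-probability, which is how the censored points' log-survival contribution enters below. A minimal toy sketch (not the model below):

with pm.Model() as toy:
    mu = pm.Normal("mu", 0, 1)
    pm.Potential("extra", -0.5 * mu**2)  # this term is added to the joint logp

# at mu = 0 the potential contributes 0, leaving just the Normal(0, 1) logpdf
print(toy.compile_logp()({"mu": 0.0}))  # ≈ -0.919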

x = data[:, 0].copy()
censored_vals = data[:, 1].copy()
y = data[:, 2].copy()

# we need to separate the observed values and the censored values
observed_mask = censored_vals == 0

y_censored = censored_vals[~observed_mask]
y_uncensored = y[observed_mask]
x_censored = x[~observed_mask]
x_uncensored = x[observed_mask]

n_right_censored = int(x_censored.shape[0])
n_observed = int(x_uncensored.shape[0])
# see https://www.pymc.io/projects/examples/en/latest/survival_analysis/weibull_aft.html
def weibull_lccdf(x, alpha, beta):
    """Log complementary cdf of Weibull distribution."""
    return -((x / beta) ** alpha)
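
As a sanity check (with illustrative numbers), this agrees with scipy's log survival function for the same parameterization:

from scipy import stats

print(np.isclose(
    weibull_lccdf(300.0, 1.2, 500.0),
    stats.weibull_min.logsf(300.0, c=1.2, scale=500.0),
))  # True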
log2 = np.log(2)

with pm.Model() as m_potential:
    beta0 = pm.Normal("beta0", 0, tau=0.01)
    beta1 = pm.Normal("beta1", 0, tau=0.1)
    alpha = pm.Gamma("alpha", 1, 0.001, initval=0.25)

    lamb_censored = exp(beta0 + beta1 * x_censored)
    beta_censored = lamb_censored ** (-1 / alpha)

    lamb_uncensored = exp(beta0 + beta1 * x_uncensored)
    beta_uncensored = lamb_uncensored ** (-1 / alpha)

    pm.Weibull(
        "observed",
        alpha=alpha,
        beta=beta_uncensored,
        observed=y_uncensored,
        shape=n_observed,
    )
    pm.Potential("censored", weibull_lccdf(y_censored, alpha, beta_censored))

    median0 = pm.Deterministic("median0", (log2 * exp(-beta0)) ** (1 / alpha))
    median1 = pm.Deterministic(
        "median1", (log2 * exp(-beta0 - beta1)) ** (1 / alpha)
    )

    trace = pm.sample(10000, tune=2000, target_accept=0.9)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta0, beta1, alpha]

Sampling 4 chains for 2_000 tune and 10_000 draw iterations (8_000 + 40_000 draws total) took 10 seconds.
az.summary(trace, hdi_prob=0.9)
            mean      sd   hdi_5%  hdi_95%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
beta0     -6.764   0.662   -7.828   -5.652      0.006    0.005   12566.0   14049.0    1.0
beta1      0.259   0.234   -0.127    0.638      0.002    0.001   16451.0   16081.0    1.0
alpha      1.024   0.097    0.864    1.181      0.001    0.001   12864.0   13806.0    1.0
median0  522.005  88.886  373.861  656.463      0.625    0.534   20871.0   23343.0    1.0
median1  405.233  70.329  287.568  513.743      0.469    0.377   22645.0   25564.0    1.0

Old imputed censoring method#

This method is from an older version of this notebook by Luis Mario Domenzain, George Ho, and Dr. Ben Vincent. The newer version doesn’t work for our purposes at this time, so I’ll be on the lookout for another way to do imputed censoring with varying censoring cutoff values.

I’m just going to preserve it here for posterity.

Warning

pm.Bound has been removed from PyMC, so this method no longer works; the cell below now fails with an AttributeError.

data = np.loadtxt("../data/gastric.txt")
x = data[:, 0].copy()
censored_vals = data[:, 1].copy()
y = data[:, 2].copy()

# we need to separate the observed values and the censored values
observed_mask = censored_vals == 0

censored = censored_vals[~observed_mask]
y_uncensored = y[observed_mask]
x_censored = x[~observed_mask]
x_uncensored = x[observed_mask]
log2 = np.log(2)

with pm.Model() as m:
    beta0 = pm.Normal("beta0", 0, tau=0.0001)
    beta1 = pm.Normal("beta1", 0, tau=0.0001)
    alpha = pm.Exponential("alpha", 3)

    lamb_censored = exp(beta0 + beta1 * x_censored)
    beta_censored = lamb_censored ** (-1 / alpha)

    lamb_uncensored = exp(beta0 + beta1 * x_uncensored)
    beta_uncensored = lamb_uncensored ** (-1 / alpha)

    impute_censored = pm.Bound(
        "impute_censored",
        pm.Weibull.dist(alpha=alpha, beta=beta_censored),
        lower=censored,
        shape=censored.shape[0],
    )

    likelihood = pm.Weibull(
        "likelihood",
        alpha=alpha,
        beta=beta_uncensored,
        observed=y_uncensored,
        shape=y_uncensored.shape[0],
    )

    median0 = pm.Deterministic("median0", (log2 * exp(-beta0)) ** (1 / alpha))
    median1 = pm.Deterministic(
        "median1", (log2 * exp(-beta0 - beta1)) ** (1 / alpha)
    )

    trace = pm.sample(10000, tune=2000, target_accept=0.9)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[11], line 14
     11 lamb_uncensored = exp(beta0 + beta1 * x_uncensored)
     12 beta_uncensored = lamb_uncensored ** (-1 / alpha)
---> 14 impute_censored = pm.Bound(
     15     "impute_censored",
     16     pm.Weibull.dist(alpha=alpha, beta=beta_censored),
     17     lower=censored,
     18     shape=censored.shape[0],
     19 )
     21 likelihood = pm.Weibull(
     22     "likelihood",
     23     alpha=alpha,
   (...)
     26     shape=y_uncensored.shape[0],
     27 )
     29 median0 = pm.Deterministic("median0", (log2 * exp(-beta0)) ** (1 / alpha))

AttributeError: module 'pymc' has no attribute 'Bound'
az.summary(trace, hdi_prob=0.9, kind="stats")
                         mean       sd    hdi_5%   hdi_95%
beta0                  -6.619    0.654    -7.658    -5.505
beta1                   0.261    0.236    -0.135     0.642
α                       1.002    0.096     0.844     1.158
impute_censored[0]   1470.516  624.422   882.003  2238.512
impute_censored[1]   1485.333  638.066   892.025  2267.560
impute_censored[2]   1623.087  636.271  1031.002  2399.749
impute_censored[3]   1629.358  644.748  1033.058  2417.048
impute_censored[4]   1896.556  636.864  1306.001  2680.457
impute_censored[5]   1927.291  645.113  1335.035  2706.771
impute_censored[6]   2044.492  647.964  1452.027  2827.988
impute_censored[7]   2064.511  637.872  1472.006  2851.689
impute_censored[8]   1144.755  823.971   381.008  2153.162
impute_censored[9]   1295.370  823.647   529.021  2299.400
impute_censored[10]  1717.105  832.210   945.080  2742.788
impute_censored[11]  1948.776  828.073  1180.049  2974.388
impute_censored[12]  2045.309  839.943  1277.020  3052.989
impute_censored[13]  2169.509  847.547  1397.058  3197.841
impute_censored[14]  2286.423  854.267  1512.011  3298.995
impute_censored[15]  2287.177  829.874  1519.089  3316.162
median0               520.012   90.909   369.907   658.412
median1               400.322   70.953   290.464   517.338
%load_ext watermark
%watermark -n -u -v -iv -p pytensor
Last updated: Sun Apr 13 2025

Python implementation: CPython
Python version       : 3.12.7
IPython version      : 8.29.0

pytensor: 2.30.2

arviz: 0.21.0
pymc : 5.22.0
numpy: 1.26.4