import pymc as pm
import numpy as np
import arviz as az
import pytensor.tensor as pt

17. Multinomial regression*#

Adapted from Unit 7: NHANESmulti.odc

Problem statement#

The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations.

Assume that \(N\) subjects each select a choice from \(K\) categories. The \(i\text{-th}\) subject is characterized by 3 covariates x[i, 1], x[i, 2], and x[i, 3]. Given the covariates, model the probability that a subject selects category \(k\), where \(k \in \{1,...,K\}\).
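Concretely, this is a multinomial logit model: each category gets its own coefficient vector, and a softmax maps the linear predictors to probabilities. With category 1 as the reference:

\[
\eta_{ik} = x_i^\top \beta_k, \qquad
p_{ik} = \frac{\exp(\eta_{ik})}{\sum_{j=1}^{K} \exp(\eta_{ij})}, \qquad
\beta_1 = 0.
\]

Fixing \(\beta_1 = 0\) is needed for identifiability, since adding a constant vector to every \(\beta_k\) leaves the softmax unchanged.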

# data
# fmt: off
y = np.array([[1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 1, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 0, 0, 1],
              [0, 0, 0, 1, 0]])

X = np.array([[2, 4, 9],
              [1, 5, 10],
              [1, 6, 14],
              [2, 4, 21],
              [2, 4, 22],
              [2, 6, 30],
              [3, 3, 33],
              [3, 2, 36],
              [3, 1, 40],
              [4, 1, 44]])
# fmt: on
# N = 10, P = 4  (intercept + 3 predictors)
X_aug = np.concatenate((np.ones((X.shape[0], 1)), X), axis=1)
N, P = X_aug.shape
K = y.shape[1]

The Multinomial distribution has two parameters, \(n\) and \(p\). See the PyMC documentation for the Multinomial distribution class.

| Argument | What it means | In this model |
| --- | --- | --- |
| n | Total number of independent trials per replicate. A Multinomial with \(n = 1\) is a one-hot vector. Must equal the per-row sum of the corresponding observed values. | Each survey respondent makes exactly one choice, so n = 1. |
| p | Vector (or matrix) of category probabilities. Must sum to 1 along the last axis. | We generate p with a softmax of the linear predictor \(\eta = X\beta\). Its shape is \((N, K)\), one probability vector per observation. |
| observed | Either a one-hot matrix (\(n = 1\)) or a count matrix (if \(n > 1\)). Shape must match p. | For this example every row of y is one-hot encoded. |
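As a quick sanity check before building the model, here is a standalone NumPy sketch (not part of the PyMC model) showing that a row-wise softmax turns arbitrary linear predictors into valid probability vectors:

```python
import numpy as np

def softmax(eta, axis=-1):
    # subtract the row max for numerical stability; the result is unchanged
    z = eta - eta.max(axis=axis, keepdims=True)
    ez = np.exp(z)
    return ez / ez.sum(axis=axis, keepdims=True)

eta = np.array([[0.0, 1.0, -2.0],
                [3.0, 3.0, 3.0]])
p = softmax(eta)
print(p.sum(axis=1))  # every row sums to 1
print(p[1])           # equal predictors give equal probabilities (all 1/3)
```

This is the same operation `pm.math.softmax(eta, axis=1)` performs symbolically inside the model.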

with pm.Model() as m:
    y_data = pm.Data("y", y)
    X_data = pm.Data("X", X_aug)

    # K - 1 free coefficient columns; the first category's column is pinned
    # to zero so it acts as the reference category (identifiability)
    _beta = pm.Normal("_beta", mu=0, tau=0.1, shape=(P, K - 1))
    beta = pt.concatenate([pt.zeros((P, 1)), _beta], axis=1)

    eta = pm.math.dot(X_data, beta)
    p = pm.math.softmax(eta, axis=1)

    # shape follows p (N, K) so predictions resize when X is swapped via set_data;
    # X_data.shape would be (N, P), which does not match the K categories
    pm.Multinomial("likelihood", n=1, p=p, observed=y_data, shape=p.shape)

    trace = pm.sample(10000)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [_beta]

Sampling 4 chains for 1_000 tune and 10_000 draw iterations (4_000 + 40_000 draws total) took 23 seconds.
X_new = np.array([1, 3, 3, 30]).reshape((1, 4))

with m:
    pm.set_data({"X": X_new})
    ppc = pm.sample_posterior_predictive(trace, predictions=True)
Sampling: [likelihood]

az.summary(ppc.predictions)
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
likelihood[0, 0] 0.007 0.086 0.0 0.0 0.000 0.002 39808.0 39808.0 1.0
likelihood[0, 1] 0.038 0.191 0.0 0.0 0.001 0.002 39180.0 39180.0 1.0
likelihood[0, 2] 0.116 0.320 0.0 1.0 0.002 0.002 40635.0 40000.0 1.0
likelihood[0, 3] 0.686 0.464 0.0 1.0 0.002 0.001 40664.0 40000.0 1.0
likelihood[0, 4] 0.153 0.360 0.0 1.0 0.002 0.002 40308.0 40000.0 1.0
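The means above are directly interpretable as estimated category probabilities: with \(n = 1\), each posterior predictive draw is a one-hot vector, so averaging draws recovers \(p\). A small standalone simulation (with a hypothetical probability vector, not taken from this model's posterior) illustrates why:

```python
import numpy as np

rng = np.random.default_rng(42)
p_true = np.array([0.01, 0.04, 0.12, 0.68, 0.15])  # hypothetical, sums to 1

# draw many one-hot vectors, as the posterior predictive does when n = 1
draws = rng.multinomial(1, p_true, size=40_000)
print(draws.mean(axis=0))  # close to p_true
```

So here the model puts roughly 69% probability on the fourth category for the new covariate vector X_new.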

Author#

Aaron Reding, May 2022 (updated April 2025)

%load_ext watermark
%watermark -n -u -v -iv
Last updated: Sun Apr 20 2025

Python implementation: CPython
Python version       : 3.12.7
IPython version      : 8.29.0

numpy   : 1.26.4
pytensor: 2.30.2
arviz   : 0.21.0
pymc    : 5.22.0