9. Stochastic Search Variable Selection
Stochastic Search Variable Selection (SSVS) is a variable selection technique for linear models and generalized linear models (GLMs). Consider the linear equation for \(\mu\):
\[\begin{split} \begin{align*} \mu & = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k \\
\beta_i & = \delta_i \alpha_i \\
\alpha_i & \sim N(0,\tau) \\
\delta_i & \sim Bern(p_i)
\end{align*} \end{split}\]
where each coefficient \(\beta_i\) is the product of a normally distributed \(\alpha_i\) and a Bernoulli-distributed indicator \(\delta_i\). If the indicator equals 1, then \(\beta_i\) equals \(\alpha_i\); otherwise the coefficient is zero and the variable is not selected.
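To make the prior concrete, here is a minimal sketch that draws coefficient vectors from it using only the Python standard library. The helper name `draw_beta`, the example values of \(p_i\), and the choice to treat \(\tau\) as a standard deviation are assumptions made for illustration; the text does not say whether \(\tau\) is a standard deviation, a variance, or a precision.

```python
import random


def draw_beta(p, tau, rng):
    """Draw one coefficient vector from the prior beta_i = delta_i * alpha_i.

    p   : list of prior inclusion probabilities p_i
    tau : scale of the normal prior (ASSUMED here to be a standard deviation)
    rng : random.Random instance, passed in for reproducibility
    """
    betas = []
    for p_i in p:
        delta_i = 1 if rng.random() < p_i else 0  # delta_i ~ Bern(p_i)
        alpha_i = rng.gauss(0.0, tau)             # alpha_i ~ N(0, tau)
        betas.append(delta_i * alpha_i)           # zero whenever delta_i == 0
    return betas


rng = random.Random(0)
p = [0.5, 0.5, 0.0, 1.0]  # hypothetical inclusion probabilities
draws = [draw_beta(p, tau=2.0, rng=rng) for _ in range(1000)]

# Fraction of prior draws in which each coefficient is exactly zero;
# this should be close to 1 - p_i for each variable.
zero_fraction = [
    sum(d[i] == 0.0 for d in draws) / len(draws) for i in range(len(p))
]
print(zero_fraction)
```

A variable with \(p_i = 0\) is never selected and one with \(p_i = 1\) is always selected, so the indicator prior controls how strongly each variable is favoured before seeing data.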
A posteriori, we can count how often each variable is selected across the MCMC samples, and then choose the model (that is, the combination of \(\delta\)’s) that is visited most often. Because every \(\alpha_i\) shares the same prior \(N(0,\tau)\), all columns of predictor data must be on the same scale for this comparison to work.
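The post-processing step described above can be sketched as follows. This assumes the sampler's output has already been collected as one tuple of 0/1 indicators per MCMC draw; the function name `summarize_indicators` and the toy samples are hypothetical and are not the output of a real run.

```python
from collections import Counter


def summarize_indicators(delta_samples):
    """Summarize posterior draws of the indicator vector delta.

    delta_samples : list of equal-length tuples of 0/1 values, one per MCMC draw
    Returns (per-variable inclusion frequency, most-visited model, its frequency).
    """
    n = len(delta_samples)
    k = len(delta_samples[0])
    # How often each variable was selected across the draws.
    inclusion = [sum(s[i] for s in delta_samples) / n for i in range(k)]
    # How often each full combination of indicators (a "model") was visited.
    model_counts = Counter(tuple(s) for s in delta_samples)
    best_model, visits = model_counts.most_common(1)[0]
    return inclusion, best_model, visits / n


# Toy indicator draws for three predictors (made up for illustration).
delta_samples = [
    (1, 0, 1),
    (1, 0, 1),
    (1, 1, 1),
    (1, 0, 0),
    (1, 0, 1),
]
inclusion, best_model, best_freq = summarize_indicators(delta_samples)
print(inclusion)   # [1.0, 0.2, 0.8]
print(best_model)  # (1, 0, 1)
print(best_freq)   # 0.6
```

In this toy case the most-visited model keeps the first and third predictors. With many predictors the number of possible models grows as \(2^k\), so individual models may be visited only rarely and the per-variable inclusion frequencies are often the more stable summary.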