1. Basic Distributions#
From this lecture, make sure you understand what a random variable is, the difference between discrete and continuous distributions, PDF/PMF vs. CDF, and the different types of parameters (shape, scale, rate, location). I’ll slowly expand this list until I’ve got all the distributions we use in the course.
Be careful about which parameterization you're using, since it varies between software packages. I will try to use the versions from Vidakovic [2017] here, but may add alternate parameterizations.
Discrete#
Bernoulli Distribution#
PMF: \(P(X=k|p) = p^k(1-p)^{1-k}\) for \(k \in \{0, 1\}\)
CDF: \(F(k|p) = \begin{cases} 0 & \text{for } k < 0 \\ 1-p & \text{for } 0 \leq k < 1 \\ 1 & \text{for } k \geq 1 \end{cases}\)
Mean: \(p\)
Variance: \(p(1-p)\)
Median: \(\begin{cases} 0 & \text{if } p < \dfrac{1}{2} \\ \text{any value in } [0,1] & \text{if } p = \dfrac{1}{2} \\ 1 & \text{if } p > \dfrac{1}{2} \end{cases}\)
Mode: \(\begin{cases} 0 & \text{if } p < \dfrac{1}{2} \\ 0, 1 & \text{if } p = \dfrac{1}{2} \\ 1 & \text{if } p > \dfrac{1}{2} \end{cases}\)
Support: \(\{0, 1\}\)
Parameters: \(p\) (probability of success)
Notation: \(X \sim \text{Bernoulli}(p)\)
Binomial Distribution#
PMF: \(P(X=k|n,p) = \binom{n}{k} p^k(1-p)^{n-k}\) for \(k \in \{0, 1, 2, ..., n\}\)
CDF: \(F(k|n,p) = \sum_{i=0}^{k} \binom{n}{i} p^i(1-p)^{n-i}\)
Mean: \(np\)
Variance: \(np(1-p)\)
Support: \(\{0, 1, 2, ..., n\}\)
Parameters: \(n\) (number of trials), \(p\) (probability of success)
Notation: \(X \sim \text{Bin}(n, p)\)
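As a quick sanity check on the formulas above, here is a minimal sketch using only Python's standard library. The helper names `binom_pmf` and `binom_cdf` are my own, not from any package, and the values of `n` and `p` are arbitrary. It computes the mean and variance directly from the PMF so you can compare them against \(np\) and \(np(1-p)\).

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k): sum the PMF from 0 through k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 0.3  # arbitrary illustrative values
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the definition, not the closed forms.
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)            # the two values should agree closely
print(var, n * p * (1 - p))   # likewise
```

Setting `n = 1` recovers the Bernoulli PMF, which is one way to remember how the two distributions relate.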
Poisson Distribution#
PMF: \(P(X=k|\lambda) = e^{-\lambda}\dfrac{\lambda^k}{k!}\) for \(k \in \{0, 1, 2, ...\}\)
CDF: \(F(k|\lambda) = e^{-\lambda}\sum_{i=0}^{k} \dfrac{\lambda^i}{i!}\)
Mean: \(\lambda\)
Variance: \(\lambda\)
Support: \(\{0, 1, 2, ...\}\)
Parameters: \(\lambda\) (rate)
Notation: \(X \sim \text{Poisson}(\lambda)\)
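The claim that the mean and variance are both \(\lambda\) is easy to check numerically. The sketch below uses only the standard library; `poisson_pmf` and `poisson_cdf` are illustrative names of my own, and I truncate the infinite support at 100 terms, which is an assumption that only works because the tail is negligible for a small \(\lambda\) like the one used here.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): sum the PMF from 0 through k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 4.0
ks = range(100)  # truncated support; the remaining tail mass is negligible for lam = 4

mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(mean, var)  # both should be very close to lam
```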
Geometric Distribution#
PMF: \(P(X=k|p) = p(1-p)^k\) for \(k \in \{0, 1, 2, ...\}\)
CDF: \(F(k|p) = 1 - (1-p)^{k+1}\) for \(k \ge 0\), else \(0\).
Mean: \(\dfrac{1-p}{p}\)
Variance: \(\dfrac{1-p}{p^2}\)
Mode: \(0\)
Support: \(\{0, 1, 2, ...\}\)
Parameters: \(p\) (probability of success)
Notation: \(X \sim \text{Geometric}(p)\)
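The version above counts the number of failures before the first success, so the support starts at 0. Some textbooks and libraries instead count the number of trials up to and including the first success, so the support starts at 1 and the mean becomes \(1/p\). Here is a small sketch of the difference; the function names are hypothetical and the infinite sums are truncated at 2000 terms, which I'm assuming is more than enough for the \(p\) used here.

```python
def geom_pmf_failures(k, p):
    """P(X = k): k failures before the first success, k = 0, 1, 2, ..."""
    return p * (1 - p) ** k

def geom_pmf_trials(n, p):
    """P(Y = n): first success on trial n, n = 1, 2, 3, ... (Y = X + 1)."""
    return p * (1 - p) ** (n - 1)

def geom_cdf_failures(k, p):
    """P(X <= k) for the failures version, k >= 0."""
    return 1 - (1 - p) ** (k + 1)

p = 0.25
mean_failures = sum(k * geom_pmf_failures(k, p) for k in range(2000))
mean_trials = sum(n * geom_pmf_trials(n, p) for n in range(1, 2001))

print(mean_failures, (1 - p) / p)  # failures version: (1 - p) / p
print(mean_trials, 1 / p)          # trials version: 1 / p
```

The two means differ by exactly 1, which is a quick way to tell which version a given piece of software is using.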
Continuous#
Normal Distribution#
PDF (variance): \(f(x|\mu,\sigma^2) = \dfrac{1}{\sqrt{2\pi\sigma^2}} e^{-\dfrac{(x-\mu)^2}{2\sigma^2}}\)
PDF (precision): \(f(x|\mu,\tau) = \sqrt{\dfrac{\tau}{2\pi}} e^{-\dfrac{\tau}{2} (x-\mu)^2}\)
CDF: \(\Phi(x|\mu,\sigma) = \dfrac{1}{2}\left[1 + \text{erf}\left(\dfrac{x-\mu}{\sigma\sqrt{2}}\right)\right]\)
Mean: \(\mu\)
Variance: \(\sigma^2\)
Median: \(\mu\)
Mode: \(\mu\)
Support: \((-\infty, \infty)\)
Parameters: \(\mu\) (mean), \(\sigma^2\) (variance), \(\tau\) (precision, defined as \(\tau = 1/\sigma^2\))
Notation: \(X \sim N(\mu, \sigma^2)\)
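Since the Normal shows up with variance, precision, and standard deviation as its second parameter depending on the source, it's worth confirming that the forms above agree. This is a minimal standard-library sketch with function names of my own choosing. Note that `statistics.NormalDist` takes the standard deviation \(\sigma\), not the variance or the precision.

```python
from math import erf, exp, pi, sqrt
from statistics import NormalDist

def normal_pdf_var(x, mu, sigma2):
    """Density parameterized by the variance sigma^2."""
    return exp(-((x - mu) ** 2) / (2 * sigma2)) / sqrt(2 * pi * sigma2)

def normal_pdf_prec(x, mu, tau):
    """Density parameterized by the precision tau = 1 / sigma^2."""
    return sqrt(tau / (2 * pi)) * exp(-0.5 * tau * (x - mu) ** 2)

def normal_cdf(x, mu, sigma):
    """CDF written with the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 1.0, 2.0   # arbitrary illustrative values
sigma2 = sigma**2      # variance
tau = 1 / sigma2       # precision
x = 2.5

print(normal_pdf_var(x, mu, sigma2), normal_pdf_prec(x, mu, tau))  # same density
print(normal_cdf(x, mu, sigma), NormalDist(mu, sigma).cdf(x))      # same CDF value
```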
Beta Distribution#
PDF: \(f(x|\alpha,\beta) = \dfrac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha,\beta)}\) for \(x \in (0, 1)\)
CDF: \(F(x|\alpha,\beta) = I_x(\alpha,\beta)\)
Mean: \(\dfrac{\alpha}{\alpha+\beta}\)
Variance: \(\dfrac{\alpha \beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\)
Mode:
If \(\alpha > 1\) and \(\beta > 1\), \(\text{Mode} = \dfrac{\alpha - 1}{\alpha + \beta - 2}\)
Support: \((0, 1)\)
Parameters: \(\alpha,\beta\) (shape parameters)
Notation: \(X \sim \text{Beta}(\alpha, \beta)\)
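To make the normalizing constant concrete, the sketch below uses the identity \(B(\alpha,\beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)\) and then checks the mean and mode formulas numerically with a simple midpoint rule. The function names, the grid size, and the choice \(\alpha = 2\), \(\beta = 5\) are all my own assumptions for illustration.

```python
from math import gamma

def beta_fn(a, b):
    """Beta function via the gamma function."""
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_pdf(x, a, b):
    """Density of Beta(a, b) for 0 < x < 1."""
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_fn(a, b)

a, b = 2.0, 5.0
n = 20_000
xs = [(i + 0.5) / n for i in range(n)]  # midpoints of n equal subintervals of (0, 1)

total = sum(beta_pdf(x, a, b) for x in xs) / n      # should be close to 1
mean = sum(x * beta_pdf(x, a, b) for x in xs) / n   # compare with a / (a + b)
mode = max(xs, key=lambda x: beta_pdf(x, a, b))     # compare with (a - 1) / (a + b - 2)

print(total, mean, a / (a + b))
print(mode, (a - 1) / (a + b - 2))
```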
Cauchy Distribution#
PDF: \(f(x|x_0,\gamma) = \dfrac{1}{\pi\gamma\left[1 + \left(\dfrac{x-x_0}{\gamma}\right)^2\right]}\)
CDF: \(F(x|x_0,\gamma) = \dfrac{1}{\pi}\arctan\left(\dfrac{x-x_0}{\gamma}\right) + \dfrac{1}{2}\)
Mean: Undefined
Variance: Undefined
Median: \(x_0\)
Mode: \(x_0\)
Support: \((-\infty, \infty)\)
Parameters: \(x_0\) (location), \(\gamma\) (scale)
Notation: \(X \sim \text{Cauchy}(x_0, \gamma)\)
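Because the mean and variance are undefined, the location and scale of a Cauchy are best understood through its quantiles. Inverting the CDF above gives \(x = x_0 + \gamma \tan\left(\pi(u - 1/2)\right)\). The sketch below (hypothetical function names, arbitrary parameter values) shows that \(x_0\) is the median and \(\gamma\) is half the interquartile range.

```python
from math import atan, pi, tan

def cauchy_cdf(x, x0, gamma):
    """CDF of Cauchy(x0, gamma)."""
    return atan((x - x0) / gamma) / pi + 0.5

def cauchy_quantile(u, x0, gamma):
    """Inverse CDF for 0 < u < 1, obtained by solving cauchy_cdf(x) = u for x."""
    return x0 + gamma * tan(pi * (u - 0.5))

x0, gamma = 1.0, 2.0
median = cauchy_quantile(0.5, x0, gamma)
q1 = cauchy_quantile(0.25, x0, gamma)
q3 = cauchy_quantile(0.75, x0, gamma)

print(median)         # equals x0
print((q3 - q1) / 2)  # close to gamma
```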
Exponential Distribution#
PDF: \(f(x|\lambda) = \lambda e^{-\lambda x}\) for \(x \geq 0\)
CDF: \(F(x|\lambda) = 1 - e^{-\lambda x}\) for \(x \geq 0\)
Mean: \(\dfrac{1}{\lambda}\)
Variance: \(\dfrac{1}{\lambda^2}\)
Median: \(\dfrac{\ln 2}{\lambda}\)
Mode: \(0\)
Support: \([0, \infty)\)
Parameters: \(\lambda\) (rate)
Notation: \(X \sim \text{Exponential}(\lambda)\)
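The Exponential is a common place to get bitten by parameterization: the form above uses the rate \(\lambda\), but many libraries use the scale \(\theta = 1/\lambda\) (the mean) instead. For example, Python's `random.expovariate` takes the rate, while NumPy's exponential sampler takes the scale. Here is a minimal sketch showing the two forms agree once you convert; the function names are my own.

```python
from math import exp, log

def expon_pdf_rate(x, lam):
    """Density with rate lam, as in the formulas above."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

def expon_pdf_scale(x, theta):
    """Same density written with scale theta = 1 / lam."""
    return exp(-x / theta) / theta if x >= 0 else 0.0

def expon_cdf(x, lam):
    """CDF with rate lam."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

lam = 0.5
theta = 1 / lam          # scale, which is also the mean
median = log(2) / lam    # median formula from above

print(expon_pdf_rate(1.2, lam), expon_pdf_scale(1.2, theta))  # identical densities
print(expon_cdf(median, lam))                                  # close to 0.5
```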
Gamma Distribution#
PDF: \(f(x|\alpha,\beta) = \dfrac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}\) for \(x \geq 0\)
CDF: \(F(x|\alpha,\beta) = \dfrac{\gamma(\alpha, \beta x)}{\Gamma(\alpha)}\)
Mean: \(\dfrac{\alpha}{\beta}\)
Variance: \(\dfrac{\alpha}{\beta^2}\)
Mode:
For \(\alpha > 1\), \(\text{Mode} = \dfrac{\alpha - 1}{\beta}\)
For \(\alpha \leq 1\), \(\text{Mode} = 0\)
Support: \([0, \infty)\)
Parameters: \(\alpha\) (shape), \(\beta\) (rate)
Notation: \(X \sim \text{Gamma}(\alpha, \beta)\)
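The Gamma has the same rate-versus-scale issue as the Exponential. The form above uses the rate \(\beta\), so the mean is \(\alpha/\beta\); with a scale \(\theta = 1/\beta\) the mean is \(\alpha\theta\). As one example, the second argument of Python's `random.gammavariate` is a scale, even though it is named `beta`. The sketch below uses hypothetical function names and arbitrary values to show the conversion, and also that \(\text{Gamma}(1, \lambda)\) reduces to \(\text{Exponential}(\lambda)\).

```python
from math import exp, gamma as gamma_fn

def gamma_pdf_rate(x, alpha, beta):
    """Density with shape alpha and rate beta, as in the formulas above."""
    return beta**alpha / gamma_fn(alpha) * x ** (alpha - 1) * exp(-beta * x)

def gamma_pdf_scale(x, alpha, theta):
    """Same density with shape alpha and scale theta = 1 / beta."""
    return x ** (alpha - 1) * exp(-x / theta) / (gamma_fn(alpha) * theta**alpha)

alpha, beta = 3.0, 2.0
x = 1.7

print(gamma_pdf_rate(x, alpha, beta), gamma_pdf_scale(x, alpha, 1 / beta))  # same value

# Special case: shape 1 gives the Exponential density lam * exp(-lam * x).
lam = 0.5
print(gamma_pdf_rate(x, 1.0, lam), lam * exp(-lam * x))
```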
Uniform Distribution#
PDF: \(f(x|a,b) = \dfrac{1}{b-a}\) for \(x \in [a, b]\)
CDF: \(F(x|a,b) = \dfrac{x-a}{b-a}\) for \(x \in [a, b]\)
Mean: \(\dfrac{a+b}{2}\)
Variance: \(\dfrac{(b-a)^2}{12}\)
Median: \(\dfrac{a+b}{2}\)
Mode: Any value in \([a, b]\)
Support: \([a, b]\)
Parameters: \(a\) (lower bound), \(b\) (upper bound)
Notation: \(X \sim \text{Uniform}(a, b)\)
Weibull Distribution#
BUGS#
PDF: \(f(x|r, \lambda) = \lambda r x^{r-1} e^{-\lambda x^r}\), for \(x > 0\)
CDF: \(F(x|r, \lambda) = 1 - e^{-\lambda x^r}\)
Mean: \(\lambda^{-\frac{1}{r}} \Gamma\left(1 + \frac{1}{r}\right)\)
Variance: \(\frac{\Gamma(1+2/r) - [\Gamma(1+1/r)]^2}{\lambda^{2/r}}\)
Parameters: \(r\) (shape parameter), \(\lambda\) (rate parameter)
Support: \((0, \infty)\)
Notation: \(X \sim \text{Weibull}(r, \lambda)\)
PyMC#
PDF: \(f(x|\alpha, \beta) = \frac{\alpha x^{\alpha - 1} e^{-(x/\beta)^{\alpha}}}{\beta^\alpha}\), for \(x > 0\)
CDF: \(F(x|\alpha, \beta) = 1 - e^{-(x/\beta)^\alpha}\) for \(x > 0\)
Mean: \(\beta \Gamma(1 + \frac{1}{\alpha})\)
Variance: \(\beta^2 \Gamma\left(1 + \frac{2}{\alpha}\right) - \mu^2\), where \(\mu\) is the mean above
Parameters: \(\alpha\) (shape parameter, \(\alpha > 0\)), \(\beta\) (scale parameter, \(\beta > 0\))
Conversion from the BUGS parameterization: \(\alpha = r\), \(\beta = \lambda^{-1/\alpha}\)
Support: \((0, \infty)\)
Notation: \(X \sim \text{Weibull}(\alpha, \beta)\)
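Since the two Weibull parameterizations look quite different, here is a minimal sketch that converts BUGS-style \((r, \lambda)\) to PyMC-style \((\alpha, \beta)\) and confirms the density, mean, and variance match. It uses only the standard library; the function names (including `bugs_to_pymc`) are my own and are not part of BUGS or PyMC. The PyMC-form variance is computed as \(\beta^2\Gamma(1 + 2/\alpha) - \mu^2\), which is algebraically the same as the BUGS-form variance after conversion.

```python
from math import exp, gamma as gamma_fn

def weibull_pdf_bugs(x, r, lam):
    """BUGS form: shape r, rate lam."""
    return lam * r * x ** (r - 1) * exp(-lam * x**r)

def weibull_pdf_pymc(x, alpha, beta):
    """PyMC form: shape alpha, scale beta."""
    return alpha * x ** (alpha - 1) * exp(-((x / beta) ** alpha)) / beta**alpha

def bugs_to_pymc(r, lam):
    """Hypothetical helper: alpha = r, beta = lam ** (-1 / r)."""
    return r, lam ** (-1 / r)

r, lam = 1.5, 0.8  # arbitrary illustrative values
alpha, beta = bugs_to_pymc(r, lam)
x = 2.0

mean_bugs = lam ** (-1 / r) * gamma_fn(1 + 1 / r)
mean_pymc = beta * gamma_fn(1 + 1 / alpha)
var_bugs = (gamma_fn(1 + 2 / r) - gamma_fn(1 + 1 / r) ** 2) / lam ** (2 / r)
var_pymc = beta**2 * gamma_fn(1 + 2 / alpha) - mean_pymc**2

print(weibull_pdf_bugs(x, r, lam), weibull_pdf_pymc(x, alpha, beta))  # same density
print(mean_bugs, mean_pymc)
print(var_bugs, var_pymc)
```

If you port a model between the two, converting the prior on \(\lambda\) is the step that is easiest to forget, because \(\beta\) depends on both \(\lambda\) and the shape.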
Pareto Distribution#
PDF: \(f(x|x_m,\alpha) = \dfrac{\alpha x_m^\alpha}{x^{\alpha+1}}\) for \(x \geq x_m\)
CDF: \(F(x|x_m,\alpha) = 1 - \left( \dfrac{x_m}{x} \right)^\alpha\) for \(x \geq x_m\)
Mean: \(\begin{cases} \dfrac{\alpha x_m}{\alpha - 1} & \text{for } \alpha > 1 \\ \infty & \text{for } \alpha \leq 1 \end{cases}\)
Variance: \(\begin{cases} \dfrac{\alpha x_m^2}{(\alpha - 1)^2 (\alpha - 2)} & \text{for } \alpha > 2 \\ \infty & \text{for } \alpha \leq 2 \end{cases}\)
Median: \(x_m 2^{1/\alpha}\)
Mode: \(x_m\)
Support: \([x_m, \infty)\)
Parameters: \(x_m\) (scale parameter, \(x_m > 0\)), \(\alpha\) (shape parameter, \(\alpha > 0\))
Notation: \(X \sim \text{Pareto}(x_m, \alpha)\)
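The median formula follows from setting the CDF equal to \(1/2\) and solving for \(x\); more generally, inverting the CDF gives the quantile function \(x = x_m (1-u)^{-1/\alpha}\). Here is a small sketch of that, with function names and parameter values chosen by me for illustration.

```python
def pareto_cdf(x, x_m, alpha):
    """CDF of Pareto(x_m, alpha); zero below the lower bound x_m."""
    return 1 - (x_m / x) ** alpha if x >= x_m else 0.0

def pareto_quantile(u, x_m, alpha):
    """Inverse CDF for 0 <= u < 1, obtained by solving pareto_cdf(x) = u for x."""
    return x_m * (1 - u) ** (-1 / alpha)

x_m, alpha = 2.0, 3.0
median = pareto_quantile(0.5, x_m, alpha)

print(median, x_m * 2 ** (1 / alpha))  # matches the median formula above
print(pareto_cdf(median, x_m, alpha))  # close to 0.5

# Closed-form moments, valid here because alpha > 2.
mean = alpha * x_m / (alpha - 1)
var = alpha * x_m**2 / ((alpha - 1) ** 2 * (alpha - 2))
print(mean, var)
```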
Other resources#
I highly recommend this overview of probability density functions and families by Michael Betancourt, especially section 2.