2. Numerical Characteristics of Random Variables

Expectation

The expectation (mean) of a random variable is the probability-weighted average of its possible outcomes. For a discrete random variable \(X\) with possible outcomes \(x_1, x_2, \ldots, x_n\) and corresponding probabilities \(p_1, p_2, \ldots, p_n\), the expectation is:

\[E[X] = \sum_{i=1}^{n}x_i p_i\]
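
As a quick illustration of the sum above, here is a minimal sketch that computes the expectation of a fair six-sided die. The die, the helper name `expectation`, and the use of `Fraction` for exact arithmetic are assumptions made for this example only.

```python
from fractions import Fraction

def expectation(outcomes, probs):
    """Return E[X] = sum_i x_i * p_i for a discrete random variable."""
    return sum(x * p for x, p in zip(outcomes, probs))

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

die_mean = expectation(outcomes, probs)
print(die_mean)  # 7/2
```

Note that the mean, 3.5, is not itself a possible outcome of the die: the expectation is a weighted average, not a "most likely" value.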

For a continuous random variable with probability density function \(f(x)\), the expectation is:

\[E[X] = \int_{\mathbb{R}}xf(x)dx\]

A general function \(\phi(X)\) of a random variable \(X\) also has an expectation, given by:

\[E[\phi(X)] = \sum_{i=1}^{n}\phi(x_i)p_i\]

for discrete variables, and

\[E[\phi(X)] = \int_{\mathbb{R}}\phi(x)f(x)dx\]

for continuous variables.
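
For the continuous case, the integral can be approximated numerically when the density has bounded support. The sketch below uses a simple midpoint rule; the function name `expectation_continuous`, the uniform density on \([0, 2]\), and the number of subintervals are all assumptions chosen for illustration, not a recommended integration method.

```python
def expectation_continuous(phi, density, lo, hi, n=10_000):
    """Approximate the integral of phi(x) * density(x) over [lo, hi] (midpoint rule)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += phi(x) * density(x)
    return total * h

# Hypothetical example: X uniform on [0, 2], so f(x) = 1/2 on that interval.
def uniform_density(x):
    return 0.5

# phi(x) = x recovers E[X]; the exact value is 1.
mean = expectation_continuous(lambda x: x, uniform_density, 0.0, 2.0)

# phi(x) = x^2 gives E[X^2]; the exact value is 4/3.
second_moment = expectation_continuous(lambda x: x**2, uniform_density, 0.0, 2.0)
```

Passing the identity function as \(\phi\) shows that \(E[X]\) is just the special case \(\phi(x) = x\) of the general formula.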

Variance and standard deviation

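The variance of a random variable \(X\) measures how far its values spread around the mean. It is defined as the expected squared deviation from the mean, which simplifies to a form that is often easier to compute: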

\[Var(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2\]

The standard deviation, denoted \(\sigma\), is the square root of the variance.

\[\sigma = \sqrt{Var(X)}\]
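
To check that the two expressions for the variance agree, the following sketch evaluates both for a fair six-sided die. The die and the variable names are assumptions for this example; `Fraction` is used so the comparison is exact.

```python
import math
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = sum(x * p for x, p in zip(outcomes, probs))  # E[X]

# Definition: E[(X - E[X])^2]
var_definition = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probs))

# Shortcut: E[X^2] - (E[X])^2
var_shortcut = sum(x**2 * p for x, p in zip(outcomes, probs)) - mean**2

print(var_definition, var_shortcut)  # 35/12 35/12

# Standard deviation: square root of the variance (converted to a float).
std = math.sqrt(var_definition)
```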

Moments

A more general way to describe the expected value, the variance, and other properties of a distribution is through its moments: the expectation is the first moment, and the variance is the second central moment. I was going to type up a brief explanation, but found this excellent post by Gregory Gundersen that has the best explanation of the concept I’ve ever read. We don’t often need the higher-order moments in our course, but the post is really worth reading.

Quantiles

Quantiles are cut points read off the cumulative distribution function (CDF) of a random variable. For \(0 < p < 1\), the \(p\)-th quantile of a distribution with CDF \(F\) is the value \(\xi_p\) such that \(F(\xi_p) = p\).

For discrete distributions, \(\xi_p = \inf \{ x | \sum_{x_i \leq x} p_i \geq p \}\).

For continuous distributions, \(\xi_p\) satisfies \(\int_{-\infty}^{\xi_p} f(x) dx = p\); equivalently, \(\xi_p = F^{-1}(p)\) when \(F\) is invertible.

The median is the \(0.5\) quantile.
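
Both definitions can be sketched in a few lines. In the example below, the discrete case scans the cumulative probabilities for the smallest outcome that reaches \(p\), and the continuous case solves \(F(x) = p\) by bisection. The function names, the fair die, the exponential distribution with rate 1, and the bracketing interval \([0, 50]\) are all assumptions chosen for illustration.

```python
import math
from fractions import Fraction
from itertools import accumulate

def discrete_quantile(outcomes, probs, p):
    """Smallest outcome x whose cumulative probability is at least p."""
    pairs = sorted(zip(outcomes, probs))  # order outcomes from smallest to largest
    cumulative = accumulate(prob for _, prob in pairs)
    for (x, _), total in zip(pairs, cumulative):
        if total >= p:
            return x
    raise ValueError("p exceeds the total probability mass")

def continuous_quantile(cdf, p, lo, hi, tol=1e-12):
    """Solve cdf(x) = p by bisection, assuming cdf is continuous and increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical discrete example: median of a fair six-sided die.
die_median = discrete_quantile([1, 2, 3, 4, 5, 6], [Fraction(1, 6)] * 6, Fraction(1, 2))
print(die_median)  # 3

# Hypothetical continuous example: exponential distribution with rate 1,
# whose CDF is F(x) = 1 - exp(-x) and whose exact median is ln 2.
exp_median = continuous_quantile(lambda x: 1 - math.exp(-x), 0.5, 0.0, 50.0)
```

For the die, the cumulative probability first reaches \(0.5\) at the outcome 3, which is why the infimum in the discrete definition returns 3 rather than 3.5.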

Mode

The mode of a distribution is the most probable value. For a discrete random variable, it is the value \(x_i\) that maximizes the probability mass function, \(P(X=x_i)\). For a continuous random variable, it is the value that maximizes the density function.
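
As a rough sketch of both cases: the discrete mode is an argmax over the probability mass function, and the continuous mode can be approximated by evaluating the density on a grid. The binomial and gamma examples, the grid spacing, and the variable names are assumptions for illustration; a grid search is only a crude stand-in for a proper optimizer.

```python
import math

# Hypothetical discrete example: Binomial(4, 1/2), P(X = k) = C(4, k) / 16.
pmf = {k: math.comb(4, k) / 16 for k in range(5)}
discrete_mode = max(pmf, key=pmf.get)  # outcome with the largest probability
print(discrete_mode)  # 2

# Hypothetical continuous example: gamma density with shape 3 and rate 1,
# f(x) = x^2 * exp(-x) / 2 for x >= 0, whose exact mode is x = 2.
def gamma_density(x):
    return x**2 * math.exp(-x) / 2

grid = [i / 1000 for i in range(10_001)]  # points 0.000, 0.001, ..., 10.000
continuous_mode = max(grid, key=gamma_density)  # grid point with the largest density
```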