In mathematics, probability, and statistics, a multivariate random variable or random vector is a list of mathematical variables, each of whose values is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its value. The individual variables in a random vector are grouped together because they are all part of a single mathematical system; often they represent different properties of an individual statistical unit. For example, while a given person has a specific age, height and weight, the representation of these features of an unspecified person from within a group would be a random vector. Normally each element of a random vector is a real number.
Random vectors are often used as the underlying implementation of various types of aggregate random variables, e.g. a random matrix, random tree, random sequence, stochastic process, etc.
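More formally, a multivariate random variable is a column vector $\mathbf{X} = (X_1, \dots, X_n)^{T}$ (or its transpose, which is a row vector) whose components are scalar-valued random variables on the same probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is the sigma-algebra (the collection of all events), and $P$ is the probability measure (a function returning each event's probability).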
Probability distribution
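Every random vector gives rise to a probability measure on $\mathbb{R}^n$ with the Borel algebra as the underlying sigma-algebra. This measure is also known as the joint probability distribution, the joint distribution, or the multivariate distribution of the random vector.
The distributions of each of the component random variables $X_i$ are called marginal distributions. The conditional probability distribution of $X_i$ given $X_j$ is the probability distribution of $X_i$ when $X_j$ is known to take a particular value.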
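For a discrete random vector, marginal and conditional distributions can be read off a table of joint probabilities. The following is a minimal sketch, assuming a made-up joint probability table for a two-component vector (the numbers and variable names are illustrative only):

```python
import numpy as np

# Hypothetical joint pmf of a discrete random vector (X1, X2):
# rows index the values of X1, columns index the values of X2.
joint = np.array([
    [0.10, 0.20, 0.10],
    [0.05, 0.25, 0.30],
])

# Marginal distributions: sum the joint pmf over the other component.
marginal_x1 = joint.sum(axis=1)  # P(X1 = i)
marginal_x2 = joint.sum(axis=0)  # P(X2 = j)

# Conditional distribution of X1 given X2 = j: renormalise column j.
j = 1
cond_x1_given_x2 = joint[:, j] / marginal_x2[j]
```

Each marginal and the conditional distribution sum to one, as any probability distribution must.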
Operations on random vectors
Random vectors can be subjected to the same kinds of algebraic operations as can non-random vectors: addition, subtraction, multiplication by a scalar, and the taking of inner products.
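Similarly, a new random vector $\mathbf{Y}$ can be defined by applying an affine transformation to a random vector $\mathbf{X}$:
$$\mathbf{Y} = A\mathbf{X} + b,$$
where $A$ is an $n \times n$ matrix and $b$ is an $n \times 1$ column vector.
If $A$ is invertible and $\mathbf{X}$ has probability density $f_{\mathbf{X}}$, then the probability density of $\mathbf{Y}$ is
$$f_{\mathbf{Y}}(y) = \frac{f_{\mathbf{X}}\left(A^{-1}(y - b)\right)}{|\det A|}.$$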
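The change-of-variables rule for an affine map Y = AX + b with invertible A, namely f_Y(y) = f_X(A^{-1}(y - b)) / |det A|, can be checked numerically. The sketch below assumes X is a standard normal vector, so that Y is Gaussian with mean b and covariance A A^T; the helper names `standard_normal_pdf` and `affine_pdf` and the numbers are illustrative, not part of any library:

```python
import numpy as np

def standard_normal_pdf(x):
    """Density of an n-dimensional standard normal vector at the point x."""
    n = x.shape[0]
    return np.exp(-0.5 * x @ x) / (2 * np.pi) ** (n / 2)

def affine_pdf(y, A, b, f_x):
    """Density of Y = A X + b at y, given the density f_x of X (A invertible)."""
    x = np.linalg.solve(A, y - b)  # x = A^{-1} (y - b)
    return f_x(x) / abs(np.linalg.det(A))

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])
b = np.array([1.0, -1.0])
y = np.array([0.5, 0.3])

density_via_formula = affine_pdf(y, A, b, standard_normal_pdf)

# Cross-check: for standard normal X, Y is Gaussian with mean b
# and covariance A A^T, whose density can be written directly (n = 2).
cov_y = A @ A.T
d = y - b
density_direct = np.exp(-0.5 * d @ np.linalg.solve(cov_y, d)) / (
    2 * np.pi * np.sqrt(np.linalg.det(cov_y))
)
```

The two densities agree up to floating-point error.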
Expected value, covariance, and cross-covariance
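The expected value or mean of a random vector $\mathbf{X}$ is a fixed vector $\operatorname{E}[\mathbf{X}]$ whose elements are the expected values of the respective random variables.
The covariance matrix (also called the variance-covariance matrix) of an $n \times 1$ random vector is an $n \times n$ matrix whose $(i,j)$th element is the covariance between the $i$th and the $j$th random variables. The covariance matrix is the expected value, element by element, of the $n \times n$ matrix $(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{X} - \operatorname{E}[\mathbf{X}])^{T}$, where the superscript $T$ refers to the transpose of the indicated vector:
$$\operatorname{Var}[\mathbf{X}] = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{X} - \operatorname{E}[\mathbf{X}])^{T}\right].$$
By extension, the cross-covariance matrix between two random vectors $\mathbf{X}$ and $\mathbf{Y}$ ($\mathbf{X}$ having $n$ elements and $\mathbf{Y}$ having $p$ elements) is the $n \times p$ matrix
$$\operatorname{Cov}[\mathbf{X}, \mathbf{Y}] = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{Y} - \operatorname{E}[\mathbf{Y}])^{T}\right],$$
where again the indicated matrix expectation is taken element-by-element in the matrix. Here the $(i,j)$th element is the covariance between the $i$th element of $\mathbf{X}$ and the $j$th element of $\mathbf{Y}$. The cross-covariance matrix $\operatorname{Cov}[\mathbf{Y}, \mathbf{X}]$ is the transpose of the matrix $\operatorname{Cov}[\mathbf{X}, \mathbf{Y}]$.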
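As an illustration, the covariance and cross-covariance matrices can be estimated from data by replacing each expectation with a sample average. The sketch below assumes simulated observations stored one per row; the helper `cross_cov` and the mixing matrix `M` are hypothetical names introduced only for this example:

```python
import numpy as np

def cross_cov(X, Y):
    """Sample version of E[(X - E[X])(Y - E[Y])^T]; rows are observations."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return Xc.T @ Yc / X.shape[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))            # n = 3 components
M = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, -1.0]])               # made-up linear dependence
Y = X @ M + rng.normal(size=(1000, 2))    # p = 2 components

var_X = cross_cov(X, X)    # n x n covariance matrix of X
cov_XY = cross_cov(X, Y)   # n x p cross-covariance matrix
cov_YX = cross_cov(Y, X)   # p x n, the transpose of cov_XY
```

Here `var_X` matches `np.cov(X, rowvar=False, bias=True)`, which uses the same normalisation by the number of observations.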
Expectation of a quadratic form
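One can take the expectation of a quadratic form in the random vector $\mathbf{X}$ as follows:
$$\operatorname{E}[\mathbf{X}^{T} A \mathbf{X}] = \operatorname{E}[\mathbf{X}]^{T} A \operatorname{E}[\mathbf{X}] + \operatorname{tr}(A C),$$
where $A$ is a fixed $n \times n$ matrix, $C$ is the covariance matrix of $\mathbf{X}$ and tr refers to the trace of a matrix, that is, the sum of the elements on its main diagonal (from upper left to lower right). Since the quadratic form is a scalar, so is its expectation.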
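Proof: Let $\mathbf{X}$ be an $n \times 1$ random vector with $\operatorname{E}[\mathbf{X}] = \mu$ and $\operatorname{Cov}[\mathbf{X}] = C$, and let $A$ be an $n \times n$ non-stochastic matrix.
Then based on the formula for the covariance, if we denote the row vectors $\mathbf{U} = \mathbf{X}^{T}$ and $\mathbf{W} = \mathbf{X}^{T} A^{T}$, we see that
$$\operatorname{Cov}[\mathbf{U}, \mathbf{W}] = \operatorname{E}[\mathbf{U}\mathbf{W}^{T}] - \operatorname{E}[\mathbf{U}]\operatorname{E}[\mathbf{W}]^{T},$$
which is a scalar because $\mathbf{U}$ and $\mathbf{W}$ are $1 \times n$.
Hence, since $\mathbf{U}\mathbf{W}^{T} = \mathbf{X}^{T} A \mathbf{X}$ and $\operatorname{E}[\mathbf{U}]\operatorname{E}[\mathbf{W}]^{T} = \mu^{T} A \mu$,
$$\operatorname{E}[\mathbf{X}^{T} A \mathbf{X}] = \operatorname{Cov}[\mathbf{X}^{T}, \mathbf{X}^{T} A^{T}] + \mu^{T} A \mu,$$
which leaves us to show that
$$\operatorname{Cov}[\mathbf{X}^{T}, \mathbf{X}^{T} A^{T}] = \operatorname{tr}(A C).$$
This is true based on the fact that one can cyclically permute matrices when taking a trace without changing the end result (e.g. tr(AB) = tr(BA)).
We see that
$$\operatorname{Cov}[\mathbf{X}^{T}, \mathbf{X}^{T} A^{T}] = \operatorname{E}\left[(\mathbf{X}^{T} - \mu^{T})(\mathbf{X}^{T} A^{T} - \mu^{T} A^{T})^{T}\right] = \operatorname{E}\left[(\mathbf{X} - \mu)^{T}(A\mathbf{X} - A\mu)\right].$$
And since
$$(\mathbf{X} - \mu)^{T}(A\mathbf{X} - A\mu)$$
is a scalar, then
$$(\mathbf{X} - \mu)^{T}(A\mathbf{X} - A\mu) = \operatorname{tr}\left[(\mathbf{X} - \mu)^{T} A (\mathbf{X} - \mu)\right]$$
trivially. Using the permutation we get:
$$\operatorname{tr}\left[(\mathbf{X} - \mu)^{T} A (\mathbf{X} - \mu)\right] = \operatorname{tr}\left[A (\mathbf{X} - \mu)(\mathbf{X} - \mu)^{T}\right],$$
and by plugging this into the original formula we get:
$$\operatorname{Cov}[\mathbf{X}^{T}, \mathbf{X}^{T} A^{T}] = \operatorname{E}\left[\operatorname{tr}\left(A (\mathbf{X} - \mu)(\mathbf{X} - \mu)^{T}\right)\right] = \operatorname{tr}\left(A \operatorname{E}\left[(\mathbf{X} - \mu)(\mathbf{X} - \mu)^{T}\right]\right) = \operatorname{tr}(A C).$$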
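The identity E[X^T A X] = E[X]^T A E[X] + tr(AC) can be checked numerically. The sketch below assumes simulated data and treats the sample as an empirical distribution, so that the mean and covariance are sample averages (with division by the sample size) and the identity holds exactly up to rounding; the data and the matrix `A` are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
samples = rng.normal(loc=[1.0, -2.0, 0.5], scale=[1.0, 2.0, 0.5], size=(500, 3))

# A fixed matrix; the identity does not require A to be symmetric.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [3.0, 0.0, 1.0]])

mu = samples.mean(axis=0)                   # empirical mean
centered = samples - mu
C = centered.T @ centered / len(samples)    # empirical covariance (divide by N)

# Left side: average of x^T A x over all sample points.
lhs = np.mean(np.einsum('ni,ij,nj->n', samples, A, samples))

# Right side: mean^T A mean + tr(A C).
rhs = mu @ A @ mu + np.trace(A @ C)
```

Because both sides are computed from the same empirical distribution, `lhs` and `rhs` differ only by floating-point error rather than by sampling noise.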
Expectation of the product of two different quadratic forms
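One can take the expectation of the product of two different quadratic forms in a zero-mean Gaussian random vector $\mathbf{X}$ as follows:
$$\operatorname{E}\left[(\mathbf{X}^{T} A \mathbf{X})(\mathbf{X}^{T} B \mathbf{X})\right] = 2\operatorname{tr}(A C B C) + \operatorname{tr}(A C)\operatorname{tr}(B C),$$
where $A$ and $B$ are symmetric matrices and again $C$ is the covariance matrix of $\mathbf{X}$. Again, since both quadratic forms are scalars and hence their product is a scalar, the expectation of their product is also a scalar.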
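For symmetric matrices A and B, the expectation of the product of the two quadratic forms equals 2 tr(ACBC) + tr(AC) tr(BC). This can be cross-checked against the fourth moments of a zero-mean Gaussian vector, which by Isserlis' theorem are E[X_i X_j X_k X_l] = C_ij C_kl + C_ik C_jl + C_il C_jk. The sketch below uses made-up matrices and is a consistency check, not a simulation:

```python
import numpy as np

# Hypothetical covariance matrix (symmetric, positive definite).
C = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])

# Hypothetical symmetric matrices defining the two quadratic forms.
A = np.array([[1.0, 2.0, 0.0],
              [2.0, 0.0, 1.0],
              [0.0, 1.0, 3.0]])
B = np.array([[0.5, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [1.0, 0.0, 1.0]])

# Fourth moments of a zero-mean Gaussian vector via Isserlis' theorem.
M4 = (np.einsum('ij,kl->ijkl', C, C)
      + np.einsum('ik,jl->ijkl', C, C)
      + np.einsum('il,jk->ijkl', C, C))

# E[(X^T A X)(X^T B X)] = sum over i,j,k,l of A_ij B_kl E[X_i X_j X_k X_l].
lhs = np.einsum('ij,kl,ijkl->', A, B, M4)

# Closed-form trace expression for symmetric A and B.
rhs = 2 * np.trace(A @ C @ B @ C) + np.trace(A @ C) * np.trace(B @ C)
```

If B is not symmetric, the term 2 tr(ACBC) is replaced by tr(ACBC) + tr(ACB^T C), which is why the symmetric case is assumed here.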
Portfolio theory
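In portfolio theory in finance, an objective often is to choose a portfolio of risky assets such that the distribution of the random portfolio return has desirable properties. For example, one might want to choose the portfolio whose return has the lowest variance for a given expected value. Here the random vector is the vector $\mathbf{r}$ of random returns on the individual assets, and the portfolio return $p$ (a random scalar) is the inner product of the vector of random returns with a vector $\mathbf{w}$ of portfolio weights (the fractions of the portfolio placed in the respective assets). Since $p = \mathbf{w}^{T}\mathbf{r}$, the expected value of the portfolio return is $\mathbf{w}^{T}\operatorname{E}[\mathbf{r}]$ and the variance of the portfolio return can be shown to be $\mathbf{w}^{T} C \mathbf{w}$, where $C$ is the covariance matrix of $\mathbf{r}$.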
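As a small worked example, the mean and variance of a portfolio return follow from the weight vector, the expected asset returns and the covariance matrix of the returns, as w^T E[r] and w^T C w respectively. The numbers below are made up for illustration and do not describe any real assets:

```python
import numpy as np

# Hypothetical expected returns of three assets.
expected_returns = np.array([0.08, 0.12, 0.05])

# Hypothetical covariance matrix of the asset returns.
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.01]])

# Portfolio weights: fractions of wealth in each asset, summing to one.
w = np.array([0.5, 0.3, 0.2])

portfolio_mean = w @ expected_returns   # w^T E[r]
portfolio_var = w @ C @ w               # w^T C w
portfolio_std = np.sqrt(portfolio_var)
```

With these inputs the expected portfolio return is 0.086 and the variance is 0.0239.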
Regression theory
In linear regression theory, we have data on n observations on a dependent variable y and n observations on each of k independent variables $x_j$. The observations on the dependent variable are stacked into a column vector y; the observations on each independent variable are also stacked into column vectors, and these latter column vectors are combined into a matrix X (not denoting a random vector in this context) of observations on the independent variables. Then the following regression equation is postulated as a description of the process that generated the data:
$$y = X\beta + e,$$
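where β is a postulated fixed but unknown vector of k response coefficients, and e is an unknown random vector reflecting random influences on the dependent variable. By some chosen technique such as ordinary least squares, a vector $\hat{\beta}$ is chosen as an estimate of β, and the estimate of the vector e, denoted $\hat{e}$, is computed as
$$\hat{e} = y - X\hat{\beta}.$$
Then the statistician must analyze the properties of $\hat{\beta}$ and $\hat{e}$, which are viewed as random vectors since a randomly different selection of n cases to observe would have resulted in different values for them.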
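The estimation step can be sketched with ordinary least squares on simulated data. The example assumes a data-generating process of the form y = Xβ + e with made-up coefficients and noise level; `np.linalg.lstsq` is used as one possible least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3

# Simulated data: X is the n x k matrix of regressors (not a random vector here).
X = rng.normal(size=(n, k))
beta = np.array([1.5, -2.0, 0.5])    # hypothetical "true" coefficients
e = rng.normal(scale=0.1, size=n)    # random influences on y
y = X @ beta + e

# Ordinary least squares estimate of beta.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Estimated residual vector.
e_hat = y - X @ beta_hat
```

Re-running with a different seed gives different values of `beta_hat` and `e_hat`, which is the sense in which both are random vectors. By construction the least-squares residuals are orthogonal to the columns of `X`.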
Vector time series
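The evolution of a $k \times 1$ random vector $\mathbf{X}$ through time can be modelled as a vector autoregression (VAR) as follows:
$$\mathbf{X}_t = c + A_1 \mathbf{X}_{t-1} + A_2 \mathbf{X}_{t-2} + \cdots + A_p \mathbf{X}_{t-p} + \mathbf{e}_t,$$
where the $i$-periods-back vector observation $\mathbf{X}_{t-i}$ is called the $i$th lag of $\mathbf{X}$, $c$ is a $k \times 1$ vector of constants (intercepts), $A_i$ is a time-invariant $k \times k$ matrix and $\mathbf{e}_t$ is a $k \times 1$ random vector of error terms.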
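A vector autoregression of order p, in which the current vector equals a constant vector plus a matrix-weighted sum of its p most recent lags plus a random error vector, can be simulated directly from that recursion. The sketch below assumes zero initial conditions and independent Gaussian errors; the function name `simulate_var` and all parameter values are hypothetical:

```python
import numpy as np

def simulate_var(c, A_list, n_steps, rng, noise_scale=1.0):
    """Simulate X_t = c + A_1 X_{t-1} + ... + A_p X_{t-p} + e_t.

    Assumes zero initial conditions and i.i.d. Gaussian errors e_t
    with standard deviation noise_scale in each component.
    """
    k = c.shape[0]
    p = len(A_list)
    X = np.zeros((n_steps + p, k))  # first p rows hold the zero initial lags
    for t in range(p, n_steps + p):
        # A_list[i] multiplies the (i + 1)-periods-back observation.
        X[t] = c + sum(A @ X[t - i - 1] for i, A in enumerate(A_list))
        X[t] += noise_scale * rng.normal(size=k)
    return X[p:]

c = np.array([1.0, 0.0])
A1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])  # eigenvalues 0.5 and 0.3, so the process is stable

path = simulate_var(c, [A1], n_steps=100, rng=np.random.default_rng(3))
```

Setting `noise_scale=0.0` removes the error term and leaves the deterministic recursion, which is convenient for checking the lag structure by hand.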