# Exploring Covariance and Its Role in Statistics

Comprehensive Definition, Description, Examples & Rules

## Introduction to Covariance:

Covariance is a mathematical measure of how two random variables vary together. It tells us whether a change in one variable tends to be accompanied by a change in the other. A positive covariance indicates that when one variable increases, the other tends to increase as well; a negative covariance indicates that when one variable increases, the other tends to decrease. Because covariance depends on the scale of the variables, it is often standardized into correlation, which expresses the relationship on a fixed scale from -1 to +1 and makes different relationships easier to compare.

## Covariance Formula:

Have a look at the formula used to calculate the covariance between two variables X and Y, denoted “Cov(X, Y)”:

Cov(X, Y) = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X]E[Y]

In the given covariance equation,

- X and Y are random variables.
- E[·] denotes the expected value (mean).
- Cov(X, Y) is the covariance between X and Y.

The covariance between X and Y describes how the values of X and Y move with one another. If X tends to be large when Y is large, then (X − E[X])(Y − E[Y]) is, on average, positive; the covariance is positive, and we say X and Y are positively related. If, on the other hand, X tends to be small when Y is large, then (X − E[X])(Y − E[Y]) is, on average, negative; the covariance is negative, and we say X and Y are inversely related.
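The two forms of the formula can be checked numerically. The following sketch (with illustrative made-up data, not from the article) estimates the covariance both from the definition E[(X − E[X])(Y − E[Y])] and from the shortcut E[XY] − E[X]E[Y]:

```python
# A minimal sketch: estimate population covariance from paired samples,
# using both equivalent forms of the formula. Data is illustrative.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
mean_xy = sum(a * b for a, b in zip(x, y)) / n

# Shortcut form: E[XY] - E[X]E[Y]
cov_shortcut = mean_xy - mean_x * mean_y

# Definition form: E[(X - E[X])(Y - E[Y])]
cov_definition = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

print(cov_shortcut, cov_definition)  # 2.75 2.75 -- the two forms agree
```

Both expressions give the same value, which is why either can be used in practice.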

## Interpretation of Covariance:

Here’s how you can discern the sign of covariance and what it represents about the relationship between variables:

- Positive Covariance: Covariance is positive when, whenever one variable is above its average, the other also tends to be above its average, and whenever one is below its average, the other also tends to be below its average. This indicates a tendency for the variables to move together.
- Negative Covariance: Covariance is negative when, whenever one variable is above its mean, the other is likely to be below its mean. This indicates a negative or inverse relationship between the variables.
- Zero Covariance: Zero covariance indicates no linear relationship between the variables, i.e. a change in one variable is not linearly associated with a change in the other. However, zero covariance does not always imply that the variables are independent, because they may still have a non-linear relationship with each other.
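The last point is worth a quick illustration. In this sketch (with made-up data), Y is a deterministic non-linear function of X, yet the covariance is exactly zero:

```python
# Sketch: a symmetric non-linear relationship (Y = X^2) can have zero
# covariance even though Y depends entirely on X. Data is illustrative.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]  # perfect non-linear dependence: y = x^2

n = len(x)
mx = sum(x) / n
my = sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

print(cov)  # 0.0 -- zero covariance, yet X and Y are clearly dependent
```

This is why zero covariance rules out a linear relationship, but not dependence in general.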

## Variance and Covariance:

In statistics, variance and covariance are closely related concepts. Variance measures how much a single random variable is spread or dispersed, that is, how far individual observations deviate from the average; covariance, on the other hand, measures how two random variables vary with one another. With the help of covariance, we can discern whether changes in one variable are accompanied by changes in the other. Variance and covariance are also connected in another way: covariance plays a crucial role in computing the variance of sums and differences of random variables. For instance, if we add two random variables X and Y to obtain a sum S = X + Y, then Var(S) = Var(X) + Var(Y) + 2 Cov(X, Y), so the variance of the sum depends on the covariance between X and Y.

## Covariance and Correlation:

Covariance tells us how two random variables change with respect to each other and indicates the direction of the linear relationship between them: a positive covariance means that when one variable increases, the other tends to increase too, while a negative covariance means that when one variable increases, the other tends to decrease.

Pearson’s correlation coefficient (r), on the other hand, standardizes the covariance to a fixed range from -1 to +1. It tells us about the direction, strength, and linearity of the relationship between the variables: a correlation of +1 indicates a perfect positive linear relationship, a correlation of -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.

Correlation can be computed directly from covariance. Correlation measures the strength of the association between two variables: to obtain the correlation between two variables, divide their covariance by the product of their standard deviations, r = Cov(X, Y) / (σX σY). The standard deviation is a measure of the degree of variation in a data set.

## Covariance Properties:

There are various covariance properties. For example:

- Linearity: Covariance measures the linear association between two random variables, that is, how changes in one variable accompany changes in the other along a straight line. If the relationship between the variables is not linear, covariance cannot capture its nature.
- Symmetry: Covariance is symmetric in its variables: Cov(X, Y) = Cov(Y, X). In other words, the order in which the variables are taken does not affect the covariance.

## Calculating Covariance:

Given that you have two sets of data, X and Y, here is a step-by-step explanation of how to calculate the covariance:

- Compute the averages of X and Y.
- Compute each value’s deviation from its average.
- Multiply the paired deviations for each data point.
- Add up all the products from the previous step.
- Divide the sum by (n − 1) for sample data, or by n for population data.
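The steps above can be sketched as a small Python function; here it is applied to the data from the worked example that follows:

```python
# Sketch of the steps above: sample covariance when sample=True,
# population covariance otherwise.
def covariance(x, y, sample=True):
    n = len(x)
    mean_x = sum(x) / n                      # step 1: averages
    mean_y = sum(y) / n
    products = [(a - mean_x) * (b - mean_y)  # steps 2-3: deviations, products
                for a, b in zip(x, y)]
    total = sum(products)                    # step 4: sum the products
    return total / (n - 1) if sample else total / n  # step 5: divide

print(covariance([2, 5, 6, 8, 9], [4, 3, 7, 5, 6]))         # 2.25
print(covariance([2, 5, 6, 8, 9], [4, 3, 7, 5, 6], False))  # 1.8
```

The two printed values match the sample and population covariances derived by hand below.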

Go through the following example for a better understanding:

Compute the covariance of X and Y for the given data set X = {2,5,6,8,9} and Y = {4,3,7,5,6}

Solution:

Since X = {2,5,6,8,9}, Y = {4,3,7,5,6} and N = 5

Mean(X) = (2 + 5 + 6 + 8 + 9) / 5

= 30 / 5

= 6

Mean(Y) = (4 + 3 +7 + 5 + 6) / 5

= 25 / 5

= 5

Sample covariance Cov(X, Y) = Σ(xᵢ − x̄)(yᵢ − ȳ) / (N − 1)

= [(2 − 6)(4 − 5) + (5 − 6)(3 − 5) + (6 − 6)(7 − 5) + (8 − 6)(5 − 5) + (9 − 6)(6 − 5)] / (5 − 1)

= (4 + 2 + 0 + 0 + 3) / 4

= 9 / 4

= 2.25

Population covariance Cov(X, Y) = Σ(xᵢ − x̄)(yᵢ − ȳ) / N

= [(2 − 6)(4 − 5) + (5 − 6)(3 − 5) + (6 − 6)(7 − 5) + (8 − 6)(5 − 5) + (9 − 6)(6 − 5)] / 5

= (4 + 2 + 0 + 0 + 3) / 5

= 9 / 5

= 1.8

Hence, the sample covariance is 2.25 and the population covariance is 1.8.

## Covariance Matrix:

A covariance matrix summarises the relationships among two or more variables in a dataset. Each entry of the matrix is the covariance between a pair of variables: the diagonal entries are the variances of the individual variables, while the off-diagonal entries are the covariances between different pairs of variables. The covariance matrix underlies various techniques in multivariate statistics, for example principal component analysis, canonical correlation analysis, and linear discriminant analysis.
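As a sketch, NumPy's `np.cov` builds such a matrix directly. Here it is applied to the X and Y data from the worked example plus a third variable, Z, made up for illustration:

```python
import numpy as np

# Sketch: a covariance matrix for three variables. By default, np.cov
# treats each row as one variable and uses the sample (n-1) divisor.
data = np.array([
    [2, 5, 6, 8, 9],   # variable X (from the worked example)
    [4, 3, 7, 5, 6],   # variable Y (from the worked example)
    [1, 2, 2, 4, 6],   # variable Z (made up for illustration)
])

cov_matrix = np.cov(data)
print(cov_matrix)
# Diagonal entries are the variances of X, Y, Z; off-diagonal entries are
# the pairwise covariances, e.g. the (0, 1) entry is Cov(X, Y) = 2.25.
```

Since Cov(X, Y) = Cov(Y, X), the matrix is always symmetric.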

## Limitations of Covariance:

Covariance also has a lot of limitations and shortcomings, some of which are listed below:

- Sensitivity to the scales of variables: The magnitude of the covariance depends on the scale of the variables. Different units of measurement change the covariance, which makes comparing covariances across different datasets highly troublesome.
- Lack of standardized units: Covariance is measured in the units of the variables involved (the product of their units). This makes it difficult to interpret across varying situations, and hence sound comparisons between different covariances cannot be made.


## Key Takeaways

- Covariance is a mathematical measure of how two random variables relate to each other.
- A positive covariance indicates that when one variable increases, the other tends to increase too; a negative covariance indicates that when one variable increases, the other tends to decrease.
- Pearson’s correlation coefficient (r) standardizes the covariance to the range -1 to +1, telling us about the direction, strength, and linearity of the relationship between the variables.
- A covariance matrix summarises the relationships among two or more variables in a dataset; each entry is the covariance between a pair of variables.
- Some limitations of using covariance as a measure of the association are its sensitivity to the scale of variables, lack of standardized units, etc.


## Frequently Asked Questions

#### What does a positive covariance indicate?

A positive covariance suggests that both variables are likely to increase or decrease together.

#### Can covariance measure the strength of the relationship between variables?

No, covariance on its own cannot be used to measure the strength of the relationship between variables.

#### What are the limitations of covariance as a measure of association?

Some limitations of using covariance as a measure of association are its sensitivity to the scale of the variables, its lack of standardized units, etc.

#### Can covariance values be compared across different datasets?

No, covariance values cannot be compared across different datasets due to their sensitivity to the scale of variables and lack of standardized units.