What Is Variance and Covariance?

One of the most commonly discussed disadvantages of variance is that it gives added weight to values that are far from the mean, or outliers. Squaring these deviations can sometimes skew the interpretation of the data set as a whole. Covariance, by contrast, measures how two random variables in a data set change together. A positive covariance means the two variables are positively related and move in the same direction; a negative covariance means they are inversely related and move in opposite directions.

The sample covariance is computed as

Cov(X, Y) = Σ (xᵢ - x̄)(yᵢ - ȳ) / N

In this formula, X represents the independent variable, Y represents the dependent variable, N represents the number of data points in the sample, x̄ represents the mean of X, and ȳ represents the mean of the dependent variable Y.
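The formula described above can be sketched in a few lines of Python. The data sets below are made-up illustrative values, not figures from the article:

```python
# Covariance as described above:
#   Cov(X, Y) = sum((x_i - x_bar) * (y_i - y_bar)) / N
def covariance(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

# Hypothetical data: hours studied (X) vs. exam score (Y).
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 77]

print(covariance(hours_studied, exam_score))  # 12.0 -- positive: they move together
```

A negative result would instead indicate that the two variables move in opposite directions, as the text notes.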

While both covariance and correlation indicate whether variables are positively or inversely related to each other, they are not the same: correlation also measures the degree to which the variables tend to move together.

Covariance can be computed for variables measured in different units. It tells researchers whether the variables tend to increase or decrease together, but it cannot express how strongly they move together, because covariance has no standardized unit of measurement. Correlation, on the other hand, standardizes the measure of interdependence between two variables and tells researchers how closely the two variables move together.
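This difference is easy to demonstrate. In the sketch below (with invented height and weight data), converting heights from metres to centimetres multiplies the covariance by 100, while the correlation is unchanged:

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(xs, ys):
    xb, yb = mean(xs), mean(ys)
    return sum((x - xb) * (y - yb) for x, y in zip(xs, ys)) / len(xs)

def corr(xs, ys):
    # Correlation standardizes covariance by both standard deviations.
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

# Hypothetical data: heights and weights of four people.
heights_m = [1.60, 1.70, 1.80, 1.90]
weights_kg = [55.0, 65.0, 72.0, 84.0]
heights_cm = [h * 100 for h in heights_m]  # same data, different unit

print(cov(heights_m, weights_kg))   # depends on the unit chosen
print(cov(heights_cm, weights_kg))  # 100x larger for the same relationship
print(corr(heights_m, weights_kg))  # unit-free, between -1 and 1
print(corr(heights_cm, weights_kg))  # identical to the line above
```

Because correlation is unit-free, it is the number to quote when comparing the strength of relationships across data sets.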

The resulting measurement is called the correlation coefficient. It always lies between -1 and 1. A coefficient of 1 indicates a perfect positive correlation: when one variable moves, the other moves in the same direction, proportionally. A coefficient between 0 and 1 indicates a less-than-perfect positive correlation.

The closer the correlation coefficient gets to 1, the stronger the correlation between the two variables. When the coefficient is zero, there is no identifiable relationship between the variables. Covariance can also be used as a tool to diversify an investor's portfolio. To do so, a portfolio manager looks for investments that have a negative covariance with one another: when one asset's return drops, a related asset's return rises.

Purchasing stocks with a negative covariance is therefore one way to reduce risk in a portfolio. The extreme peaks and valleys of the stocks' performance can be expected to cancel each other out, leaving a steadier rate of return over the years.
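As a toy illustration (the return figures below are invented), mixing two negatively covarying assets yields a portfolio whose variance is far smaller than either asset's alone:

```python
def mean(v):
    return sum(v) / len(v)

def cov(xs, ys):
    xb, yb = mean(xs), mean(ys)
    return sum((x - xb) * (y - yb) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical yearly returns (%) for two assets that move oppositely.
asset_a = [10.0, -4.0, 12.0, -6.0, 8.0]
asset_b = [-2.0, 9.0, -3.0, 11.0, 0.0]

# A 50/50 split between the two assets.
portfolio = [(a + b) / 2 for a, b in zip(asset_a, asset_b)]

print(cov(asset_a, asset_b))      # negative: the assets offset each other
print(cov(asset_a, asset_a))      # variance of asset A alone
print(cov(asset_b, asset_b))      # variance of asset B alone
print(cov(portfolio, portfolio))  # portfolio variance: much smaller
```

The peaks of one asset fill the valleys of the other, so the combined return series is far steadier than either series on its own.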


Variance vs. Covariance: An Overview

Variance and covariance are mathematical terms frequently used in statistics and probability theory. In statistics, variance is the spread of a data set around its mean value, while covariance is a measure of the directional relationship between two random variables.

But this new measure we have come up with is only really useful when talking about these variables in isolation. Imagine we define three different Random Variables on a coin toss:

Now visualize that each of these is attached to the same Sampler, so that each receives the same event at the same point in the process.

The only real difference between the three Random Variables is a constant multiplied against their output, yet we get very different Covariances between the pairs. The problem is that we are no longer accounting for the Variance of each individual Random Variable. We can solve this by adding a normalizing term that takes this into account.
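The original variable definitions did not survive extraction, so as a hedged reconstruction, assume three Random Variables that apply different constant multipliers (1, 2, and 10, my own choice) to the same coin flip. A quick simulation shows the pairwise covariances differ even though the underlying randomness is identical:

```python
import random

random.seed(0)

# One shared "sampler": every Random Variable sees the same flip.
flips = [random.choice([0, 1]) for _ in range(10_000)]

X = [1.0 * f for f in flips]   # assumed definitions, for illustration only
Y = [2.0 * f for f in flips]
Z = [10.0 * f for f in flips]

def mean(v):
    return sum(v) / len(v)

def cov(xs, ys):
    xb, yb = mean(xs), mean(ys)
    return sum((x - xb) * (y - yb) for x, y in zip(xs, ys)) / len(xs)

print(cov(X, Y))  # ~0.5:  2 * Var(X)
print(cov(X, Z))  # ~2.5: 10 * Var(X)
print(cov(Y, Z))  # ~5.0: 20 * Var(X)
```

The three variables carry exactly the same information about the coin, yet their covariances are wildly different, which is precisely the problem the normalizing term fixes.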

We'll end up using the square root of the product of the two Variances, √(Var(X) · Var(Y)). The property we've been trying to describe is the way each of these Random Variables correlates with the others. Putting everything we've found together, we arrive at the definition of Correlation:

Corr(X, Y) = Cov(X, Y) / √(Var(X) · Var(Y))

Why is this value always bounded between -1 and 1? The short answer is the Cauchy-Schwarz inequality. Exploring the relationship between Correlation and the Cauchy-Schwarz inequality deserves its own post to really develop the intuition. For now, it is enough to know that dividing Covariance by the square root of the product of the Variances of both Random Variables always leaves us with a value between -1 and 1. Variance, Covariance, and Correlation are basic components of probability and statistics, yet they are often poorly understood.
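We can check the bound empirically. The sketch below (with randomly generated data of my own choosing) shows Correlation hitting 1 and -1 for exact linear relationships and staying inside the interval otherwise:

```python
import math
import random

random.seed(42)

def mean(v):
    return sum(v) / len(v)

def cov(xs, ys):
    xb, yb = mean(xs), mean(ys)
    return sum((x - xb) * (y - yb) for x, y in zip(xs, ys)) / len(xs)

def corr(xs, ys):
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

xs = [random.gauss(0, 1) for _ in range(1000)]
noise = [random.gauss(0, 1) for _ in range(1000)]

print(corr(xs, [3 * x + 1 for x in xs]))   # exactly linear: 1.0
print(corr(xs, [-3 * x + 1 for x in xs]))  # exactly linear: -1.0
print(corr(xs, [x + n for x, n in zip(xs, noise)]))  # noisy: strictly inside (-1, 1)
```

No matter what data we feed in, the normalization guarantees the result lands in [-1, 1], which is what makes Correlation comparable across very different Random Variables.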

I hope this post and the last have shown how they all relate elegantly to one another. If you want to dive even deeper into the why of Variance, as well as other ways of summarizing a Random Variable, check out this post on Moments of a Random Variable. Want to learn about Bayesian statistics and probability? If you also like programming languages, you might enjoy my book Get Programming With Haskell from Manning.

Variance - squaring Expectations to measure change

In mathematically rigorous treatments of probability we find a formal definition that is very enlightening: Var(X) = E[(X - E[X])²], the Expectation of the squared deviation of X from its own Expectation.
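Writing the formal definition Var(X) = E[(X - E[X])²] out for a small discrete distribution (a fair six-sided die, my own example) makes it concrete, along with the equivalent shortcut E[X²] - E[X]²:

```python
# A fair six-sided die: each outcome has probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6

ex = sum(x * p for x in outcomes)                # E[X] = 3.5
var = sum((x - ex) ** 2 * p for x in outcomes)   # E[(X - E[X])^2]

# Equivalent form: Var(X) = E[X^2] - E[X]^2
ex2 = sum(x * x * p for x in outcomes)

print(var)             # 35/12, about 2.9167
print(ex2 - ex ** 2)   # same value by the algebraic identity
```

Both forms square deviations from the Expectation, which is exactly the "squaring Expectations to measure change" idea in the heading above.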


