This article discusses: 1. Meaning and Significance of Correlation 2. Types of Correlation 3. Measures of Correlation.
Meaning and Significance of Correlation:
It is clear from the concept of variables, and from the difference between dependent and independent variables, that variables may be related to each other. For instance, demand and supply are related to the price of the commodity, agricultural output depends on the amount of rainfall, students' marks depend on the time spent on learning, quantity demanded may depend on advertisement expenditure, consumption depends on income, and so on.
Correlation is a measure of the nature of the relationship that exists between two or more variables. The correlation coefficient ranges from -1 to +1. It helps in understanding the extent to which two variables are related and the direction of their relationship. In the words of Croxton and Cowden, “When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship and expressing it in brief formula is known as correlation.”
Significance of the Study of Correlation:
1. Correlation measures the strength of relationship between two or more variables. For example, the relationship between income and consumption expenditure, price and quantity demanded etc.
2. When the nature of relationship between variables is known, it is easy to predict the value of one variable when the other variable is known.
3. It helps in understanding the behaviour of various economic variables such as demand, supply, GDP, interest, money supply, inflation, income and expenditure.
4. In business firms, it helps in making decisions on cost, price, sales, advertisement etc.
Types of Correlation:
Depending upon the nature of the relationship between variables and the number of variables under study, correlation can be classified into the following types:
1. On the basis of direction of change: Positive and negative correlation.
2. On the basis of number of variables: Simple, partial and multiple correlation.
3. On the basis of ratio of variation in the variables: Linear and non-linear correlation.
1. On the Basis of Direction of Change:
(i) Positive Correlation:
Correlation between two variables is said to be positive when both the variables move in the same direction. This means, when one variable increases, the other also increases and when one decreases, the other also decreases. For instance, correlation between income and expenditure is said to be positive because as one’s income increases, his expenditure also increases.
(ii) Negative Correlation:
Correlation between two variables is said to be negative when the variables move in opposite directions. This means, when one variable increases, the other decreases and when one decreases, the other increases. For example, the correlation between demand and price is said to be negative because as price increases, the quantity demanded decreases and as price decreases, the quantity demanded increases.
2. On the Basis of Number of Variables:
Depending on the number of variables under study, correlation can be simple, partial or multiple.
(i) Simple Correlation:
When the relationship between only two variables is studied, it is a simple correlation. In case of partial and multiple correlation, there are more than two variables that are related.
(ii) Partial Correlation:
In a partial correlation, there are more than two variables that are related but the relationship between two variables alone is studied, assuming the other variables to be constant.
(iii) Multiple Correlation:
In multiple correlation, the relationship between more than two variables is studied simultaneously.
For example, quantity demanded is affected by many variables such as price, income and the prices of substitute products. In a partial correlation, we may study the relationship between quantity demanded and the price of the commodity, assuming all other variables, such as income and the prices of substitute products, to be constant.
In multiple correlation, however, we study the relationship between quantity demanded and price, income and prices of substitutes, simultaneously.
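As a hedged illustration of the difference, the usual textbook expressions for three variables are given below. The numbering is an assumption made here for the example (1 = quantity demanded, 2 = price, 3 = income), and each r with two subscripts stands for the simple correlation between that pair of variables. The first expression is the partial correlation between variables 1 and 2 with variable 3 held constant; the second is the multiple correlation of variable 1 with variables 2 and 3 taken together.

```latex
r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{\left(1 - r_{13}^{2}\right)\left(1 - r_{23}^{2}\right)}}
\qquad
R_{1.23} = \sqrt{\frac{r_{12}^{2} + r_{13}^{2} - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{23}^{2}}}
```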
3. On the Basis of Ratio of Variation in the Variables:
(i) Linear Correlation:
When the ratio of change between two variables is constant, then the correlation is said to be linear. In linear correlation, the change in one variable is in a constant proportion to the other variable.
For example, if Y rises by 5 units every time X rises by 2 units, the variables X and Y change in the same ratio of 2:5.
Hence, the correlation between the two variables would be linear.
(ii) Non-Linear Correlation:
When the ratio of change between two variables increases or decreases, then the correlation is said to be non-linear or curvilinear.
For example, if equal changes in X are accompanied by progressively larger or smaller changes in Y, the correlation between X and Y would be non-linear or curvilinear because the ratio of change is not constant.
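The distinction can be checked numerically. The sketch below computes the change in Y per unit change in X between consecutive observations. The two data sets and the helper names `change_per_unit` and `is_linear` are illustrative assumptions, not values taken from the article's tables. A constant ratio points to a linear relationship and a varying ratio to a non-linear one.

```python
from fractions import Fraction

def change_per_unit(xs, ys):
    """Change in Y per unit change in X for each pair of consecutive points."""
    return [
        Fraction(ys[i] - ys[i - 1], xs[i] - xs[i - 1])
        for i in range(1, len(xs))
    ]

def is_linear(xs, ys):
    """True when every consecutive step has the same ratio of change."""
    return len(set(change_per_unit(xs, ys))) == 1

# Hypothetical data: X rises by 2 at every step.
x = [2, 4, 6, 8, 10]
y_linear = [5, 10, 15, 20, 25]    # Y rises by 5 each time (X:Y ratio 2:5)
y_curved = [4, 16, 36, 64, 100]   # Y = X squared, so the ratio keeps changing

print(is_linear(x, y_linear), is_linear(x, y_curved))  # True False
```

Exact fractions are used instead of floating-point division so that equal ratios compare as equal.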
Measures of Correlation:
The most popular and commonly used methods of studying correlation between two variables are:
1. Scatter diagram method
2. Karl Pearson’s coefficient of correlation
3. Spearman’s rank correlation coefficient
1. Scatter Diagram Method:
This is the simplest method of studying the relationship between two variables. In this method, the values of both variables are plotted on graph paper. If there are two variables, say X and Y, the variable X can be taken on the X-axis and Y on the Y-axis. For each pair of X and Y values, a dot is plotted. Observing the way the points are scattered gives an idea of how the two variables are related.
Various degrees of correlation between two variables can be shown with the help of scatter diagrams as given below:
i. Perfect Positive Correlation:
In a perfect positive correlation, all the dots lie in a straight line and are upward sloping. The correlation coefficient (r) would be equal to +1, when the correlation is perfectly positive.
ii. Perfect Negative Correlation:
In a perfect negative correlation, all the dots lie on a straight line and are downward sloping. The correlation coefficient (r) would be equal to -1, when the correlation is perfectly negative.
iii. High Degree of Positive Correlation:
When the points come closer to a straight line and are moving from bottom left to top right, there is said to be a high degree of positive correlation. The value of the correlation coefficient (r) would lie between +0.7 and +1.
iv. High Degree of Negative Correlation:
When the points come closer to a straight line and are moving from top left to bottom right, there is said to be a high degree of negative correlation. The value of the correlation coefficient (r) would lie between -0.7 and -1.
v. Low Degree of Positive Correlation:
In this case, the points are widely scattered but are rising from lower left to upper right. The value of correlation coefficient (r) would be close to 0 but positive.
vi. Low Degree of Negative Correlation:
In this case, the points are widely scattered but are falling from upper left to lower right. The value of correlation coefficient (r) would be close to 0 but negative.
vii. No Correlation:
When there is no relationship between variables, the points would be scattered all over and would not move in any direction. The value of correlation coefficient (r) would be equal to zero when there is no relationship between variables.
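The degrees of correlation listed above can be summarised in a small lookup. This is only a sketch: the function name `describe_correlation` is hypothetical, the 0.7 boundary is the one stated above, and because no lower cut-off is given for "low" correlation, every non-zero value below 0.7 in absolute terms is labelled "low or moderate" here as an assumption.

```python
import math

def describe_correlation(r):
    """Map a correlation coefficient r to the verbal degrees used in the text."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if math.isclose(r, 0.0, abs_tol=1e-12):
        return "no correlation"
    sign = "positive" if r > 0 else "negative"
    size = abs(r)
    if math.isclose(size, 1.0):
        return f"perfect {sign} correlation"
    if size >= 0.7:  # boundary taken from the text
        return f"high degree of {sign} correlation"
    # Assumption: the text gives no cut-off below 0.7.
    return f"low or moderate degree of {sign} correlation"

print(describe_correlation(-0.85))  # high degree of negative correlation
```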
Merits of Scatter Diagram Method:
1. It is simple to understand as it is a non-mathematical method.
2. Relationship between variables can be understood by mere observation.
3. It is not affected by extreme items.
4. It is a preliminary step of investigating the relationship between two variables.
Demerits of Scatter Diagram Method:
1. It is not an accurate measure of correlation. The scatter diagram only gives the direction of relationship and shows whether the correlation is high or low. However, it does not give the exact degree of correlation between two variables.
2. The method is useful only when the number of observations is small. A scatter diagram does not give a precise measurement of correlation when there is a large number of observations.
2. Karl Pearson’s Coefficient of Correlation:
Karl Pearson’s coefficient of correlation is denoted by r and can be used to measure correlation for both individual series and grouped data.
i. Direct Method:
There are two ways to calculate the coefficient of correlation under this method:
(i) The first way to calculate the coefficient of correlation under the direct method is by using the formula given below:
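The standard form of Pearson's coefficient based on deviations from the actual means, which this step appears to refer to, can be written as follows. The symbols x and y for the deviations are notation assumed here.

```latex
r = \frac{\sum xy}{\sqrt{\sum x^{2} \cdot \sum y^{2}}},
\qquad x = X - \bar{X}, \quad y = Y - \bar{Y}
```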
(ii) When the mean is in decimals, then the calculation of deviations from the mean may become tedious.
In such a situation, the following formula can be applied to compute the correlation directly without taking deviations:
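The usual textbook form that works directly on the raw values, without taking deviations, is shown below. N denotes the number of pairs of observations, a symbol assumed here for the example.

```latex
r = \frac{N \sum XY - \sum X \sum Y}
         {\sqrt{N \sum X^{2} - \left(\sum X\right)^{2}}\;
          \sqrt{N \sum Y^{2} - \left(\sum Y\right)^{2}}}
```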
ii. Shortcut Method:
When the actual mean is a fraction, deviations can also be taken from an assumed mean. When deviations are taken from the assumed mean, the following formula is applied to compute the correlation coefficient.
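In its common textbook form, the assumed-mean formula reads as below. The symbols A and B for the assumed means of X and Y, and dx and dy for the deviations from them, are notation assumed here.

```latex
r = \frac{N \sum d_x d_y - \sum d_x \sum d_y}
         {\sqrt{N \sum d_x^{2} - \left(\sum d_x\right)^{2}}\;
          \sqrt{N \sum d_y^{2} - \left(\sum d_y\right)^{2}}},
\qquad d_x = X - A, \quad d_y = Y - B
```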
Merits of Karl Pearson’s Correlation Method:
1. The Karl Pearson’s coefficient of correlation gives the exact measure of correlation between variables.
2. It gives both the direction and the degree of relationship between variables.
Demerits of Karl Pearson’s Correlation Method:
1. It always assumes a linear relationship between variables.
2. The value of the coefficient is affected by the presence of extreme values.
3. It takes time to calculate the correlation coefficient using this method and it is a complicated method as compared to other measures of correlation.
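The following sketch shows, under stated assumptions, how the direct and shortcut calculations relate and how a single extreme value can change the coefficient. The function names and the data are hypothetical. The formulas are the standard textbook ones: the sum of products of deviations from the actual means divided by the square root of the product of their sums of squares, and the equivalent form in sums of deviations from any assumed means.

```python
import math

def pearson_from_deviations(xs, ys):
    """Direct method: deviations are taken from the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

def pearson_from_sums(xs, ys, assumed_x=0, assumed_y=0):
    """Shortcut method: deviations are taken from assumed means.

    With both assumed means left at 0 this is the raw-value form
    of the direct method.
    """
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_direct = pearson_from_deviations(x, y)
r_raw = pearson_from_sums(x, y)
r_shortcut = pearson_from_sums(x, y, assumed_x=3, assumed_y=4)
print(round(r_direct, 4), round(r_raw, 4), round(r_shortcut, 4))

# One extreme pair added to the same data reverses the sign of r.
r_with_outlier = pearson_from_deviations(x + [6], y + [-10])
print(round(r_with_outlier, 4))
```

All three calculations return the same value for the same data, so the choice between them is one of convenience. The last line illustrates the second demerit listed above.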
3. Spearman’s Rank Correlation Coefficient:
Karl Pearson’s coefficient of correlation is computed on the assumption that the observations are normally distributed. When the distribution of the observations is not known, the methods described above cannot be used. Karl Pearson’s coefficient is also unsuitable for studying the correlation between two qualitative variables, such as honesty and beauty.
In all such cases, Spearman’s rank correlation coefficient can be applied to study the relationship between two variables. In this method, the variables need to be assigned ranks on the basis of their size from the smallest to the largest or from the largest to the smallest.
This method is named after the British psychologist Charles Edward Spearman, who developed it in 1904.
Spearman’s rank correlation coefficient is computed in the following manner:
1. When Ranks are given:
When the ranks have already been assigned to the items, the following steps are to be used in calculating correlation:
(i) Calculate the difference (D) between two ranks, i.e. Rx – Ry.
(ii) The differences have to be squared (D²) and their sum is to be taken as ΣD².
(iii) Then the following formula is to be used to calculate the correlation coefficient:
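The standard form of Spearman's rank correlation coefficient when no ranks are tied is given below. The symbol N for the number of pairs of ranks follows the notation used later in this section, and the subscript s is added here only to distinguish it from Pearson's r.

```latex
r_s = 1 - \frac{6 \sum D^{2}}{N\left(N^{2} - 1\right)}
    = 1 - \frac{6 \sum D^{2}}{N^{3} - N}
```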
2. When Ranks are Not Given:
When the ranks are not already associated with the items and rather the marks or the values are assigned to each item, then the ranks have to be given to each item on the basis of the values or the marks attached to them.
Following steps are to be followed when ranks are not given:
(i) First the rank is to be assigned to each item in the distribution. The variables can be assigned ranks on the basis of their size from smallest to largest or from largest to smallest.
(ii) Calculate the difference (D) of the two ranks, i.e. Rx – Ry.
(iii) The differences have to be squared (D²) and their sum is to be taken as ΣD².
(iv) Then the following formula is to be used:
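The steps above can be sketched as follows, assuming no two items share the same value and using the standard untied formula, 1 minus 6ΣD² divided by N(N² - 1). The function names and the marks are hypothetical; ranks are assigned from largest to smallest.

```python
def rank_descending(values):
    """Step (i): rank 1 goes to the largest value. Assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need the average-rank adjustment")
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_no_ties(xs, ys):
    """Spearman's rank correlation for paired values without ties."""
    n = len(xs)
    rank_x = rank_descending(xs)
    rank_y = rank_descending(ys)
    # Steps (ii) and (iii): D = Rx - Ry, squared and summed.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    # Step (iv): apply the formula.
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical marks of five students in two subjects.
marks_x = [85, 60, 73, 40, 90]
marks_y = [93, 75, 65, 50, 80]

print(rank_descending(marks_x))  # [2, 4, 3, 5, 1]
print(rank_descending(marks_y))  # [1, 3, 4, 5, 2]
print(round(spearman_no_ties(marks_x, marks_y), 4))  # 0.8
```

Here ΣD² is 4 and N is 5, so the coefficient is 1 - 24/120 = 0.8.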
3. When Ranks are Equal:
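When there are equal ranks, each tied item is given the average of the ranks the tied items would otherwise occupy. For instance, when two items tie for the 3rd rank, each is given the rank (3 + 4)/2 = 3.5, and when three items tie for the 3rd rank, each is given the rank (3 + 4 + 5)/3 = 4.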
This adjustment is incorporated in the formula as follows:
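In its common textbook form, the adjusted formula adds one correction term for every group of tied values in either series. The subscripts on m, which number those groups, are notation assumed here.

```latex
r_s = 1 - \frac{6\left[\sum D^{2}
      + \tfrac{1}{12}\left(m_1^{3} - m_1\right)
      + \tfrac{1}{12}\left(m_2^{3} - m_2\right) + \cdots\right]}
      {N^{3} - N}
```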
Where, D = Difference of rank in the two series
N = Total number of pairs
m = Number of times each rank repeats
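A minimal sketch of the tie adjustment is given below. It assumes the standard correction of (m³ - m)/12 for each group of m tied values, added to ΣD² before the usual formula is applied. The function names and the data are hypothetical.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest; tied values share the mean of their positions."""
    positions = {}
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_with_ties(xs, ys):
    """Spearman's coefficient with the correction for equal ranks."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    adjusted = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n ** 3 - n)

# Three items tie for the 3rd rank, so each receives (3 + 4 + 5) / 3 = 4.
print(average_ranks([90, 80, 70, 70, 70, 60]))  # [1.0, 2.0, 4.0, 4.0, 4.0, 6.0]

# Hypothetical paired data with one tie in the first series.
x = [50, 40, 40, 30]
y = [45, 35, 30, 25]
print(round(spearman_with_ties(x, y), 4))  # 0.9
```

In the second example ΣD² is 0.5 and the single pair of tied values adds (2³ - 2)/12 = 0.5, giving 1 - 6(1.0)/60 = 0.9.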
Merits of Spearman’s Rank Correlation Coefficient:
1. It is simple to understand.
2. It is easy to calculate as compared to the Karl Pearson’s correlation method.
3. It can be easily applied when the data is qualitative in nature. For example, the levels of satisfaction that two consumers derive from different products can easily be ranked and the degree of correlation can be computed.
4. This method can also be applied when the data is not in the form of ranks. The actual data can be converted to ranks in such cases.
Demerits of Spearman’s Rank Correlation Coefficient:
1. This method cannot be applied when the data is in the form of a grouped frequency distribution.
2. The calculation of Spearman’s rank correlation coefficient becomes time-consuming when the number of observations is very large and when ranks are not given.