Correlation is a measure of association between two or more variables. When two or more variables vary in sympathy, so that movement in one tends to be accompanied by corresponding movements in the other variable(s), they are said to be correlated.

“The correlation between variables is a measure of the nature and degree of association between the variables”.

As a measure of the degree of relatedness of two variables, correlation is widely used in exploratory research when the objective is to locate variables that might be related in some way to the variable of interest.


Correlation can be classified in several ways. The important ways of classifying correlation are:

  1. Positive and negative,
  2. Linear and non-linear (curvilinear), and
  3. Simple, partial and multiple.

1. Positive and Negative Correlation

If both variables move in the same direction, we say that there is a positive correlation; i.e., if one variable increases, the other also increases on average, or if one variable decreases, the other also decreases on average.

On the other hand, if the variables vary in opposite directions, so that an increase in one tends to be accompanied by a decrease in the other, we say that it is a case of negative correlation; e.g., the movements of demand and supply.
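The sign of the relationship can be checked numerically. The sketch below is only an illustration: it uses Pearson's correlation coefficient, a standard measure that this section does not itself define, and the two data sets and the helper name pearson_r are made up for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
hours_studied = [1, 2, 3, 4, 5]
marks = [35, 45, 50, 65, 70]                # tends to rise with hours studied
price = [10, 12, 14, 16, 18]
quantity_demanded = [95, 80, 72, 60, 51]    # tends to fall as price rises

r_positive = pearson_r(hours_studied, marks)
r_negative = pearson_r(price, quantity_demanded)

print("hours vs marks:", round(r_positive, 3))    # sign is positive
print("price vs demand:", round(r_negative, 3))   # sign is negative
```

A coefficient above zero corresponds to positive correlation and one below zero to negative correlation; the size of the coefficient is a separate matter from its sign.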

2. Linear and Non-linear (Curvilinear) Correlation

If a change in one variable is accompanied by a change in the other variable in a constant ratio, it is a case of linear correlation. Observe the following data:

X :    10    ...     40     50

Y :    25    ...    100    125

The ratio of change in the above example is the same. It is, thus, a case of linear correlation. If we plot these variables on graph paper, all the points will fall on the same straight line.

On the other hand, if the amount of change in one variable does not follow a constant ratio with the change in another variable, it is a case of non-linear or curvilinear correlation. If a couple of figures in either series X or series Y are changed, it would give a non-linear correlation.
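The constant-ratio test can be sketched in a few lines. The example below uses only the X and Y values that appear in the data above (10, 40, 50 and 25, 100, 125); the curved series and the helper names change_ratios and is_linear are assumptions introduced for illustration.

```python
def change_ratios(xs, ys):
    """Ratio (change in y) / (change in x) between consecutive points."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]

def is_linear(ratios, tol=1e-9):
    """True when every ratio of change equals the first one (within tol)."""
    return all(abs(r - ratios[0]) <= tol for r in ratios)

# Values taken from the data in the text
x = [10, 40, 50]
y = [25, 100, 125]
linear_ratios = change_ratios(x, y)

# Hypothetical curvilinear series: y grows as the square of x
y_curved = [v ** 2 for v in x]
curved_ratios = change_ratios(x, y_curved)

print(linear_ratios, is_linear(linear_ratios))
print(curved_ratios, is_linear(curved_ratios))
```

When the ratio of change is the same between every pair of consecutive points, the points lie on one straight line; when it differs from one pair to the next, the relationship is curvilinear.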

3. Simple, Partial and Multiple Correlation

The distinction amongst these three types of correlation depends upon the number of variables involved in a study. If only two variables are involved in a study, then the correlation is said to be simple correlation. When three or more variables are involved in a study, then it is a problem of either partial or multiple correlation. In multiple correlation, three or more variables are studied simultaneously. But in partial correlation we consider only two variables influencing each other while the effect of other variable(s) is held constant.

Suppose we have a problem comprising three variables X, Y and Z, where X is the number of hours studied, Y is I.Q. and Z is the number of marks obtained in the examination. In multiple correlation, we study the relationship between the marks obtained (Z) and the two variables, number of hours studied (X) and I.Q. (Y), taken together. In contrast, when we study the relationship between X and Z while holding I.Q. (Y) constant, it is a study involving partial correlation.
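The distinction can be made concrete with the standard textbook formulas for three variables, which express the partial and the multiple correlation in terms of the three simple correlations. This is a minimal sketch, not part of the lesson itself: the formulas are the usual first-order ones, and the data and the function names (pearson_r, partial_r, multiple_r) are hypothetical.

```python
from math import sqrt

def pearson_r(a, b):
    """Simple (two-variable) Pearson correlation coefficient."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

def partial_r(r_ab, r_ac, r_bc):
    """Correlation between a and b with c held constant (first order)."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def multiple_r(r_zx, r_zy, r_xy):
    """Correlation between z and the pair (x, y) taken together."""
    r_squared = (r_zx ** 2 + r_zy ** 2 - 2 * r_zx * r_zy * r_xy) / (1 - r_xy ** 2)
    return sqrt(r_squared)

# Hypothetical data for six students, invented for illustration only
hours = [2, 4, 5, 7, 8, 10]          # X: hours studied
iq = [95, 110, 100, 105, 120, 115]   # Y: I.Q.
marks = [40, 58, 55, 66, 80, 82]     # Z: marks obtained

r_xz = pearson_r(hours, marks)       # simple correlation of X and Z
r_xy = pearson_r(hours, iq)
r_yz = pearson_r(iq, marks)

r_xz_given_y = partial_r(r_xz, r_xy, r_yz)   # X and Z, with Y held constant
r_z_on_xy = multiple_r(r_xz, r_yz, r_xy)     # Z with X and Y together

print(round(r_xz, 3), round(r_xz_given_y, 3), round(r_z_on_xy, 3))
```

The simple coefficient uses two variables only, the partial coefficient uses all three but reports on two, and the multiple coefficient reports on one variable against the other two jointly.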

In this lesson, we will study linear correlation between two variables.


Correlation analysis, in discovering the nature and degree of relationship between variables, does not necessarily imply any cause-and-effect relationship between them. Two variables may be related to each other, but this does not mean that one variable causes the other. For example, we may find that logical reasoning and creativity are correlated, but that does not mean that if we could increase people's logical reasoning ability, we would produce greater creativity. We need to conduct an actual experiment to demonstrate a causal relationship unequivocally.

But if it is true that influencing someone's logical reasoning ability does influence their creativity, then the two variables must be correlated with each other. In other words, causation always implies correlation; however, the converse is not true.

Let us consider some situations in which correlation does not reflect a simple cause-and-effect relationship:

1. The correlation may be due to chance, particularly when the data pertain to a small sample. A small sample bivariate series may show a relationship, but such a relationship may not exist in the universe.

2. It is possible that both the variables are influenced by one or more other variables.

For example, expenditures on food and on entertainment for a given number of households may show a positive relationship because both have increased over time. But this is due to a rise in family incomes over the same period. In other words, the two variables have been influenced by a third variable: the increase in family incomes.
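The food and entertainment example can be imitated with numbers. The sketch below is an illustration under stated assumptions: all figures are hypothetical and were constructed so that each expenditure depends on income plus a small leftover amount, with the two leftovers unrelated to each other. It uses Pearson's coefficient and the standard first-order partial correlation formula, neither of which is defined in this section.

```python
from math import sqrt

def pearson_r(a, b):
    """Simple Pearson correlation coefficient."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

def partial_r(r_ab, r_ac, r_bc):
    """Correlation between a and b with c held constant (first order)."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Hypothetical yearly figures (same currency units), constructed for illustration
income = [20, 25, 30, 35, 40, 45]
food = [8.5, 9.9, 11.6, 13.6, 15.9, 18.5]          # about 0.4 * income
entertainment = [3.75, 5.35, 6.2, 6.8, 7.65, 9.25]  # about 0.2 * income

r_food_ent = pearson_r(food, entertainment)
r_food_inc = pearson_r(food, income)
r_ent_inc = pearson_r(entertainment, income)

# Food vs entertainment once income is held constant
r_food_ent_given_inc = partial_r(r_food_ent, r_food_inc, r_ent_inc)

print("simple correlation:", round(r_food_ent, 3))             # strongly positive
print("with income held constant:", round(r_food_ent_given_inc, 3))  # about zero
```

In these constructed figures the two expenditures are strongly correlated, yet the relationship disappears once income is held constant, which is what one would expect if income alone were driving both.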

3. There may be another situation where both the variables influence each other, so that we cannot say which is the cause and which is the effect. For example, take the case of price and demand. A rise in the price of a commodity may lead to a decline in the demand for it. Here, price is the cause and demand is the effect. In yet another situation, an increase in demand may lead to a rise in price. Here, demand is the cause while price is the effect, which is just the reverse of the earlier situation. In such situations, it is difficult to identify which variable is causing the effect on which, as both are influencing each other.

The foregoing discussion shows that correlation does not by itself indicate any causation or functional relationship. The correlation coefficient is merely a mathematical measure and says nothing about cause and effect. It only reveals co-variation between two variables. When there is no cause-and-effect relationship in a bivariate series but one interprets the relationship as causal, the correlation is called a spurious or nonsense correlation. Such an interpretation is misleading. One therefore has to be very careful in correlation exercises and look into other relevant factors before concluding that a cause-and-effect relationship exists.