So far we have studied problems relating to one variable only; the corresponding distributions are called univariate distributions. In the preceding chapters, we discussed measures of averages such as the mean, median, mode and quartiles, and measures of dispersion such as Quartile Deviation, Mean Deviation and Standard Deviation, for a univariate distribution. More often, however, we are required to study the relationship between two or more variables: for example, price and demand, supply and production, income and expenditure of a given family, or import and export of a commodity.
Correlation analysis deals with the association between two or more variables. The degree of relationship between two or more variables is called correlation. If there are only two variables, say X and Y, it is called simple correlation. If more than two variables are involved, it is called multiple correlation. For example, in an agricultural experiment, the yield obtained depends upon many factors such as quality of seed, irrigation facilities, fertility of soil, manure applied, pesticide applied, etc.
In partial correlation, only two variables are studied after eliminating the effect of the other variables, e.g. the correlation between yield of crops and rainfall after eliminating the effect of temperature on both yield and rainfall. In simple linear correlation, we deal with two variables known as the cause variable (X) and the effect variable (Y). If the two variables vary together in the same direction or in opposite directions, they are said to be correlated. If Y increases consistently as X increases, we say X and Y are positively correlated. Some variables are negatively correlated: as X increases Y decreases, and as X decreases Y increases, e.g. as price increases, demand decreases. If the change in one variable is proportional to the change in the other, the two variables are said to be perfectly correlated. Thus, whether correlation is positive (direct) or negative (inverse) depends upon the direction of change of the variables.
The study of correlation is of great importance in practical problems, especially in business, for the following reason. Correlation analysis helps businessmen and economists to analyse the effect that a change in one variable has on another variable, and so gives them guidelines for their decisions.
There are various methods for determining the degree of relationship between the variables. Generally there are four methods, depending upon the nature of the data. They are
1. Scatter Diagram
2. Correlation Graph
3. Correlation Table
4. Coefficient of Correlation
SCATTER DIAGRAM
A scatter diagram is one of the simplest diagrammatic representations of a bivariate distribution, and one of the simplest tools for ascertaining the correlation between two variables. Suppose we are given n pairs of values (x1, y1), (x2, y2), (x3, y3), ..., (xn, yn) of the two variables X and Y. Each pair is plotted as a point on graph paper; the graph so obtained is called a Scatter Diagram. After plotting the points, we can study the nature of the diagram. A straight line can then be drawn by inspection that seems to be the best fit for the given set of points. Some points will lie on the line and the others will be near it. While drawing the line, care has to be taken that the number of points above the line and the number below it are approximately the same.
By studying the diagram, the following conclusions can be drawn about the correlation.
- If all the plotted points lie on a straight line rising from the lower left hand corner to the upper right hand corner, as shown in Fig. 1, the correlation is perfect and positive.
- If the points cluster around a line and ascend from the lower left hand corner to the upper right hand corner, there is positive correlation. This is shown in Fig. 2.
- If all the points lie on a straight line falling from the upper left hand corner to the lower right hand corner, the correlation is said to be perfect and negative. Fig. 3 depicts this type of correlation.
- If the points tend to fall along a direction from the upper left hand corner to the lower right hand corner, there is negative correlation. This is shown in Fig. 4.
- If the points are scattered over the graph paper such that no definite conclusion can be drawn about their direction, there is absence of correlation.
A scatter diagram is an important step in analysing correlation. When the amount of data is limited, a scatter diagram is easy to draw by hand. It portrays the joint behaviour of the two variables, but it gives only the direction of correlation. It does not give a numerical measure of the degree of correlation.
COEFFICIENT OF CORRELATION
The three methods mentioned above give us the direction of correlation, if it exists. But we also require an exact numerical measure of the degree or extent of correlation. It is useful to have a numerical measure that is independent of the units of the original data, so that the relationship between the two variables can be compared across different sets of data. For this we calculate the coefficient of correlation. Its value always lies between –1 and +1. The sign of the correlation coefficient indicates whether the variables are related positively or negatively, and its magnitude indicates the degree of relationship.
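Definition:
The coefficient of correlation, denoted by “ r ” and named after Karl Pearson, is defined as
\[ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \, \sigma_y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \; \sum (y - \bar{y})^2}} \]
where \bar{x} and \bar{y} are the means of X and Y, and \sigma_x and \sigma_y are their standard deviations.
The value of r always lies between –1 and +1.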
- If 0 < r < 1, the correlation is positive.
- If r = 1, the correlation is perfect positive.
- If –1 < r < 0, the correlation is negative.
- If r = –1, the correlation is perfect negative.
If the two variables are independent, r = 0, but the converse is not true: r = 0 indicates only that there is no linear relationship between the two variables.
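As an illustration, the product moment coefficient can be computed directly from deviations about the means, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²). The sketch below is only one possible way to organise the arithmetic; the function name `pearson_r` is chosen here for illustration and is not part of these notes. The sample data are the weeks and output figures from Exercise 1 below.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product moment coefficient of correlation from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations of x from its mean
    dy = [y - mean_y for y in ys]          # deviations of y from its mean
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    if sum_dx2 == 0 or sum_dy2 == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)

weeks = [7, 5, 4, 11, 10, 12, 14, 9]
output = [14, 8, 8, 19, 16, 19, 20, 16]
print(round(pearson_r(weeks, output), 4))  # 0.9635
```

Here Σ(x − x̄)(y − ȳ) = 111, Σ(x − x̄)² = 84 and Σ(y − ȳ)² = 158, so r = 111 / √(84 × 158) ≈ 0.9635.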
EXERCISE:
1. The following data represents the time in weeks (X) and the output in thousand units (Y). Find the coefficient of correlation.
x: | 7 | 5 | 4 | 11 | 10 | 12 | 14 | 9 |
y: | 14 | 8 | 8 | 19 | 16 | 19 | 20 | 16 |
[ Answer: 0.9635 ]
2. Find the coefficient of correlation for the following data:
x: | 14 | 8 | 10 | 11 | 9 | 13 | 5 |
y: | 14 | 9 | 11 | 13 | 11 | 12 | 4 |
[ Answer: 0.9231 ]
3. Find the coefficient of correlation for the following data representing cost in Rs. (X) and sales in Rs. (Y) of a product for a period of eight years.
x: | 84 | 80 | 92 | 85 | 95 | 90 | 83 | 87 |
y: | 115 | 104 | 122 | 116 | 125 | 120 | 112 | 120 |
[ Answer: 0.9358 ]
4. Calculate the coefficient of correlation between marks in Economics (X) and marks in Accountancy (Y) of a group of 10 students.
x: | 53 | 47 | 42 | 60 | 63 | 52 | 57 | 55 | 61 | 48 |
y: | 72 | 61 | 62 | 85 | 80 | 65 | 79 | 75 | 84 | 73 |
[ Answer: 0.8831 ]
5. Calculate the coefficient of correlation between X and Y.
x: | 5 | 8 | 10 | 12 | 15 | 18 | 21 | 24 | 25 | 6 |
y: | 25 | 21 | 20 | 18 | 16 | 15 | 14 | 12 | 11 | 24 |
[ Answer: -0.9828 ]
6. The distribution of marks in Advertising (x) and marks in Business Planning (y) for a group of ten students is given below. Calculate the product moment coefficient of correlation.
x: | 25 | 20 | 17 | 16 | 20 | 14 | 23 | 21 | 15 | 12 |
y: | 24 | 17 | 22 | 18 | 20 | 18 | 24 | 20 | 16 | 14 |
[ Answer: 0.8168 ]
7. The following data gives the experience (x) in years of eight machine operators and their performance ratings (y). Calculate the coefficient of correlation.
x: | 16 | 13 | 17 | 4 | 3 | 11 | 7 | 14 |
y: | 88 | 87 | 89 | 72 | 70 | 82 | 78 | 84 |
[ Answer: 0.9803 ]
8. Find Pearson’s coefficient of correlation for the following data:
x: | 140 | 138 | 126 | 132 | 135 | 131 | 137 | 142 |
y: | 122 | 140 | 118 | 119 | 132 | 125 | 145 | 150 |
[ Answer: 0.7043 ]
9. Find the coefficient of correlation for the following data:
x: | 53 | 59 | 72 | 43 | 93 | 35 | 55 | 80 |
y: | 35 | 49 | 63 | 36 | 75 | 28 | 38 | 71 |
[ Answer: 0.9676 ]
10. Calculate the coefficient of correlation from the following data:
x: | 20 | 22 | 18 | 17 | 10 | 25 | 7 | 15 |
y: | 15 | 17 | 16 | 10 | 5 | 19 | 4 | 8 |
[ Answer: 0.9553 ]
11. Calculate the coefficient of correlation for the following data of heights in cms. (x) and weights in kgs. (y) of a group of 10 students.
x: | 159 | 163 | 165 | 162 | 158 | 160 | 165 | 167 | 168 | 170 |
y: | 51 | 57 | 58 | 50 | 49 | 54 | 55 | 56 | 58 | 57 |
[ Answer: 0.7940 ]
12. Below are the heights in cms. (x) and weights in kgs. (y) of a group of children. Find the coefficient of correlation.
x: | 130 | 128 | 132 | 135 | 140 | 142 | 137 | 139 |
y: | 31 | 30 | 36 | 32 | 41 | 40 | 35 | 34 |
[ Answer: 0.8020 ]
13. Calculate the product moment coefficient of correlation.
x: | 212 | 214 | 205 | 220 | 225 | 214 | 218 |
y: | 500 | 515 | 577 | 530 | 522 | 516 | 525 |
[ Answer: 0.6683 ]
14. Find Pearson’s coefficient of correlation from the following data.
x: | 10 | 2 | 5 | 7 | 9 | 4 | 8 |
y: | 8 | 4 | 4 | 8 | 5 | 3 | 7 |
[ Answer: 0.7352 ]
15. Find the coefficient of correlation between the marks in Mathematics and Physics from the following data.
x: | 40 | 37 | 90 | 85 | 67 | 75 | 80 | 52 | 80 |
y: | 50 | 40 | 80 | 85 | 75 | 80 | 85 | 65 | 85 |
[ Answer: 0.95 ]
RANK CORRELATION
For certain types of characteristics, e.g. smartness, beauty, talent, etc., it is not possible to get numerical measurements, but we can rank the individuals in order according to our own judgement. If two persons rank a given group of individuals and we have to find how far the two judges agree with each other, the technique of rank correlation can be used. In some cases, though actual measurements are available, we may still be interested only in ranks, that is, the relative position of an individual in the group. Here also rank correlation is used.
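The formula for Spearman’s coefficient of Rank Correlation is
\[ R = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \]
where d is the difference between the two ranks of an individual and n is the number of individuals.
If two or more observations have the same value, a common rank, equal to the average of the ranks they would otherwise occupy, is given to all the repeated values. In this case a correction factor is to be added to Σ d² while calculating the rank correlation coefficient:
\[ \frac{m^3 - m}{12} \]
where m is the number of times the value is repeated. This correction factor must be added for every repeated value in the data. After Σ d² has been corrected in this way, the calculation of the coefficient of Rank Correlation remains the same.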
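The ranking and correction steps can be sketched in code. The sketch assumes the usual formula R = 1 − 6 Σd² / (n(n² − 1)), with (m³ − m)/12 added to Σd² for each value repeated m times in either series. The function names are hypothetical and chosen here for illustration. Ranks are assigned from the smallest value upwards; ranking from the largest downwards gives the same coefficient as long as both series are ranked the same way. The data are taken from Exercises 1 and 5 below.

```python
def average_ranks(values):
    """Rank from 1 (smallest value); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1           # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that occurs m > 1 times."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum((m ** 3 - m) / 12 for m in counts.values() if m > 1)

def spearman_r(xs, ys):
    """Rank correlation with the correction factor added to the sum of d^2."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    sum_d2 += tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

# Exercise 1: no repeated values, sum of d^2 = 8
capital = [15, 32, 25, 30, 35, 20, 19, 22, 27, 31]
profit = [50, 70, 65, 72, 90, 58, 53, 57, 68, 74]
print(round(spearman_r(capital, profit), 4))   # 0.9515

# Exercise 5: four repeated pairs, sum of d^2 = 1 plus correction 4 * 0.5 = 2
husband = [60, 30, 37, 30, 42, 37, 55, 45]
wife = [50, 25, 33, 27, 40, 33, 50, 42]
print(round(spearman_r(husband, wife), 4))     # 0.9643
```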
EXERCISE:
1. Calculate the coefficient of rank correlation for the following data giving working capital in lakhs of Rs. (x) and profit in thousands of Rs. (y) of 10 companies for the year 2003.
x: | 15 | 32 | 25 | 30 | 35 | 20 | 19 | 22 | 27 | 31 |
y: | 50 | 70 | 65 | 72 | 90 | 58 | 53 | 57 | 68 | 74 |
[ Answer: 0.9515 ]
2. Calculate Spearman’s rank correlation coefficient for the following data.
x: | 105 | 112 | 107 | 115 | 160 | 152 | 148 | 132 |
y: | 120 | 127 | 135 | 123 | 140 | 142 | 138 | 110 |
[ Answer: 0.5394 ]
3. Quotations of index numbers of security prices of debentures of a certain joint stock company and of prices of preference shares for the years 1995 – 2002 are given below. Use the method of rank correlation to determine the relationship between debentures and share prices.
Year: | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 |
Debenture | 97.8 | 99.2 | 98.8 | 98.3 | 98.4 | 96.7 | 97.6 | 97.1 |
Share Price | 78.9 | 85.8 | 81.2 | 83.8 | 84.2 | 80.1 | 80.6 | 77.6 |
[ Answer: 0.8095 ]
4. Find Spearman’s coefficient of correlation for the following data representing the exports (x) and local sales (y), both expressed in lakhs of Rs., of fashion garments for 10 years.
x: | 12 | 15 | 13 | 20 | 15 | 14 | 19 | 13 | 21 | 18 |
y: | 25 | 21 | 15 | 18 | 20 | 17 | 20 | 16 | 20 | 22 |
[ Answer: 0.1333 ]
5. Calculate the rank correlation coefficient between age of husband (x) and age of wife (y), both expressed in years, from the following data.
x: | 60 | 30 | 37 | 30 | 42 | 37 | 55 | 45 |
y: | 50 | 25 | 33 | 27 | 40 | 33 | 50 | 42 |
[ Answer: 0.9643 ]
6. Calculate the rank correlation coefficient for the following data showing respectively the marks in Economics (x) and marks in English (y).
x: | 56 | 37 | 65 | 60 | 54 | 51 | 40 | 70 |
y: | 50 | 42 | 55 | 48 | 51 | 53 | 38 | 47 |
[ Answer: 0.381 ]
7. Find Spearman’s coefficient of correlation for the following data.
x: | 33 | 37 | 42 | 23 | 21 | 15 | 13 | 30 | 39 |
y: | 17 | 27 | 32 | 12 | 13 | 11 | 9 | 25 | 30 |
[ Answer: 0.9667 ]
8. Find the rank correlation coefficient for the following data representing marks in the terminal examination (x) and marks in the final examination (y) for a group of 10 students.
x: | 52 | 33 | 47 | 65 | 43 | 33 | 54 | 66 | 75 | 70 |
y: | 65 | 59 | 72 | 72 | 82 | 60 | 57 | 58 | 72 | 90 |
[ Answer: 0.2303 ]
9. Find the rank correlation coefficient.
x: | 84 | 89 | 72 | 75 | 90 | 62 | 62 | 78 |
y: | 65 | 75 | 58 | 65 | 75 | 54 | 51 | 57 |
[ Answer: 0.881 ]
10. Calculate Spearman’s rank correlation coefficient for the following data.
x: | 101 | 113 | 83 | 109 | 101 | 97 | 83 | 95 | 90 | 117 |
y: | 53 | 59 | 52 | 57 | 59 | 50 | 54 | 58 | 59 | 61 |
[ Answer: 0.5212 ]
11. Find the rank correlation coefficient for the following data.
x: | 64 | 72 | 70 | 85 | 64 | 90 | 60 | 85 | 89 | 54 |
y: | 47 | 43 | 29 | 47 | 25 | 52 | 47 | 50 | 51 | 20 |
[ Answer: 0.7677 ]
12. The marks obtained by 10 students are as follows. Calculate the coefficient of rank correlation.
x: | 90 | 88 | 90 | 76 | 88 | 62 | 98 | 90 | 70 | 76 |
y: | 61 | 58 | 64 | 73 | 73 | 78 | 58 | 82 | 58 | 67 |
[ Answer: -0.20909 ]
13. The ranks of 10 students in three subjects A, B and C are given below. Find the rank correlation coefficient for each of the three possible pairs and comment on the result.
Student No: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Rank in A: | 1 | 3 | 4 | 2 | 5 | 10 | 8 | 6 | 7 | 9 |
Rank in B: | 3 | 5 | 1 | 2 | 6 | 10 | 4 | 9 | 7 | 8 |
Rank in C: | 2 | 3 | 5 | 1 | 4 | 9 | 6 | 7 | 8 | 10 |
[ Answer:
Coefficient of Rank correlation between A & B = 0.7333
Coefficient of Rank correlation between B & C = 0.7576
Coefficient of Rank correlation between A & C = 0.9273
Hence, there is maximum correlation between subjects A and C ]
14. Three judges gave the following ranks to eight participants in a personality contest. Calculate the coefficient of rank correlation for each of the three possible pairs and decide which pair of judges has the most common approach.
Candidate No: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
Rank by Judge A: | 7 | 6 | 5 | 8 | 3 | 1 | 2 | 4 |
Rank by Judge B: | 6 | 8 | 4 | 7 | 1 | 2 | 4 | 5 |
Rank by Judge C: | 4 | 5 | 6 | 7 | 3 | 1 | 2 | 8 |
[ Answer:
Coefficient of Rank correlation between A & B = 0.7976
Coefficient of Rank correlation between B & C = 0.5833
Coefficient of Rank correlation between A & C = 0.6667
Hence, there is maximum correlation between Judges A and B ]
15. Three Judges X, Y, Z in a painting competition judged the contestants as follows. Calculate the coefficient of rank correlation for each of the three possible pairs and decide which pair of judges has the most common approach.
Contestant No: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
Rank by Judge X: | 1 | 2 | 3 | 5 | 4 | 6 | 7 | 8 |
Rank by Judge Y: | 2 | 4 | 1 | 3 | 8 | 5 | 6 | 7 |
Rank by Judge Z: | 1 | 3 | 2 | 5 | 4 | 8 | 7 | 6 |
COEFFICIENT OF CORRELATION FOR A BI-VARIATE FREQUENCY DISTRIBUTION
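The coefficient of correlation in the case of bi-variate frequency data is calculated using the following formula:
\[ r = \frac{N \sum f x y - \left(\sum f x\right)\left(\sum f y\right)}{\sqrt{\left[N \sum f x^2 - \left(\sum f x\right)^2\right]\left[N \sum f y^2 - \left(\sum f y\right)^2\right]}} \]
where x and y are the mid-values of the classes of the two variables (or their deviations from assumed means, which leaves r unchanged), f is the corresponding frequency and N = Σ f is the total frequency.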
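For a two-way frequency table, the arithmetic can be organised as in the sketch below. It assumes the product moment formula applied to class mid-values, r = [N Σfxy − (Σfx)(Σfy)] / √([N Σfx² − (Σfx)²][N Σfy² − (Σfy)²]). The function name and the layout of the table as a list of rows are choices made here for illustration only. The data are the age and height table of Exercise 1 below, with blank cells entered as 0.

```python
from math import sqrt

def grouped_pearson_r(freq, x_mids, y_mids):
    """r for a two-way table: freq[i][j] is the frequency of (x_mids[i], y_mids[j])."""
    n = sum_fx = sum_fy = sum_fx2 = sum_fy2 = sum_fxy = 0
    for x, row in zip(x_mids, freq):
        for y, f in zip(y_mids, row):
            n += f                      # total frequency N
            sum_fx += f * x
            sum_fy += f * y
            sum_fx2 += f * x * x
            sum_fy2 += f * y * y
            sum_fxy += f * x * y
    numerator = n * sum_fxy - sum_fx * sum_fy
    denominator = sqrt((n * sum_fx2 - sum_fx ** 2) * (n * sum_fy2 - sum_fy ** 2))
    return numerator / denominator

age_mids = [11, 13, 15, 17]             # mid-values of 10-12, 12-14, 14-16, 16-18
height_mids = [146, 150, 154, 158]      # mid-values of 144-148, ..., 156-160
table = [
    [7, 2, 0, 0],
    [3, 5, 3, 3],
    [0, 3, 8, 6],
    [0, 0, 5, 5],
]
print(round(grouped_pearson_r(table, age_mids, height_mids), 4))  # 0.6974
```

Because r does not change when the mid-values are shifted or rescaled, replacing both sets of mid-values by 0, 1, 2, 3 gives the same result with smaller numbers, which is convenient for hand calculation.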
EXERCISE:
1. The following table gives the bivariate frequency distribution of 50 students according to age in years and height in cms. Calculate Pearson’s coefficient of correlation.
Rows: Age in Years; Columns: Height in Centimetres
Age | 144 – 148 | 148 – 152 | 152 – 156 | 156 – 160 | Total |
10 – 12 | 7 | 2 | - | - | 9 |
12 – 14 | 3 | 5 | 3 | 3 | 14 |
14 – 16 | - | 3 | 8 | 6 | 17 |
16 – 18 | - | - | 5 | 5 | 10 |
Total | 10 | 10 | 16 | 14 | 50 |
[ Answer: 0.6974 ]
2. Calculate the coefficient of correlation for the following data expressing the service in years and the salary in Rs. of 50 employees of a firm.
Rows: Service in Years; Columns: Salary in Rupees
Service | 1000-1500 | 1500-2000 | 2000-2500 | 2500-3000 | 3000-3500 |
0 – 5 | 5 | 2 | - | - | - |
5 – 10 | 3 | 3 | 6 | - | - |
10 – 15 | - | 3 | 4 | - | - |
15 – 20 | - | - | 4 | 5 | 6 |
20 – 25 | - | - | - | 3 | 6 |
[ Answer: 0.8542 ]
3. The following table represents heights in cms. and weights in kgs. of a group of 25 boys. Calculate the product moment coefficient of correlation.
Rows: Weight in Kgs; Columns: Height in Centimetres
Weight | 150 – 155 | 155 – 160 | 160 – 165 | 165 – 170 | 170 – 175 |
50 – 54 | 2 | 2 | - | - | - |
54 – 58 | 1 | 1 | 2 | - | - |
58 – 62 | - | - | 3 | 4 | - |
62 – 66 | - | - | 1 | 3 | 3 |
66 – 70 | - | - | - | 1 | 2 |
[ Answer: 0.8547 ]
4. The following table represents food expenditure and family income of a few families. Calculate the coefficient of correlation.
Rows: Family Income in Rs.; Columns: Food Expenditure in percentage
Income | 10 – 15 | 15 – 20 | 20 – 25 | 25 – 30 | 30 – 35 |
1500 – 2000 | 3 | 3 | - | - | - |
2000 – 2500 | 2 | 2 | 3 | - | - |
2500 – 3000 | - | 2 | 2 | 3 | - |
3000 – 3500 | - | - | 3 | 2 | 2 |
3500 – 4000 | - | - | - | 2 | 1 |
[ Answer: 0.7897 ]
5. The following data represents the sales in lakhs of Rs. and profit in thousands of Rs. of sixty-six companies. Find the coefficient of correlation.
Rows: Profits in thousands of Rs.; Columns: Sales in lakhs of Rs.
Profits | 50 – 60 | 60 – 70 | 70 – 80 | 80 – 90 | 90 – 100 |
50 – 55 | 1 | 3 | 1 | - | - |
55 – 60 | 4 | 7 | 2 | 5 | - |
60 – 65 | 3 | 5 | 4 | 10 | 6 |
65 – 70 | - | 1 | 3 | 7 | 2 |
70 – 75 | - | - | 2 | - | - |
[ Answer: 0.4072 ]
6. Calculate Karl Pearson’s coefficient of correlation for the following distribution.
Rows: Marks in Civics; Columns: Marks in History
Civics | 0 – 10 | 10 – 20 | 20 – 30 | 30 – 40 | 40 – 50 |
0 – 10 | 2 | 1 | - | - | - |
10 – 20 | 4 | 3 | 2 | - | - |
20 – 30 | 3 | 2 | 2 | 3 | 1 |
30 – 40 | - | 1 | 1 | 2 | 1 |
40 – 50 | - | - | - | 2 | - |
[ Answer: 0.6133 ]
7. Calculate the product moment coefficient of correlation.
Rows: Income in Rs.; Columns: Savings in Rs.
Income | 0 – 400 | 400 – 800 | 800 – 1200 | 1200 – 1600 | 1600 – 2000 |
2500 – 3000 | 5 | 3 | - | - | - |
3000 – 3500 | 3 | 4 | 4 | - | - |
3500 – 4000 | - | 3 | 5 | 3 | 1 |
4000 – 4500 | - | - | 3 | 2 | 2 |
4500 – 5000 | - | - | 1 | 1 | 1 |
[ Answer: 0.7738 ]
8. A firm administers a test to sales trainees before they go into the field. The management of the firm is interested in determining the relationship between the test scores and the sales made by the trainees at the end of one year in the field. The following data were collected for 50 sales personnel who had been in the field for one year. Calculate the coefficient of correlation.
Rows: Test Score; Columns: Sales in thousands of Rs.
Test Score | 10 – 12 | 12 – 14 | 14 – 16 | 16 – 18 |
60 – 70 | 2 | 3 | - | - |
70 – 80 | 3 | 4 | 2 | - |
80 – 90 | - | 7 | 12 | 2 |
90 – 100 | - | - | 8 | 7 |
[ Answer: 0.73 ]