

Correlation
- Calculate the correlation coefficient (Chapter 6)
Introduction
Interest in the correspondence or relationship between 2 variables
- Similarity and attraction
- Education level and salary
- Study time and grades
Scatter Plots
Use pairs of scores as coordinate points and plot the points
Student   Study time   Exam score   Coordinate
A         8            88           (8, 88)
(graph)
Line of Best Fit
Line that best captures the pattern of the coordinates
(graph)
1) Slope direction
Positive Slope
From bottom left to top right
= Positive Relationship
= Direct Correlation
As one goes up, the other goes up
As one goes down, the other goes down
Negative Slope
From top left to bottom right
= Negative Relationship
= Inverse Correlation
As one goes up, the other goes down
As one goes down, the other goes up
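A minimal sketch of reading slope direction from data. It assumes the line of best fit is the least-squares line (the notes do not say how the line is drawn), and the function names best_fit_slope and direction are made up for illustration. The scores are the study time and drinks tables from these notes.

```python
from statistics import mean

def best_fit_slope(xs, ys):
    """Slope of the least-squares line (assumed definition of 'best fit')."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def direction(slope):
    """Name the relationship implied by the sign of the slope."""
    if slope > 0:
        return "positive (direct)"
    if slope < 0:
        return "negative (inverse)"
    return "flat"

# Scores taken from the two tables in these notes
study_time = [3, 4, 5, 6, 7, 8, 10, 12]
study_exam = [71, 75, 78, 82, 85, 88, 94, 100]
drinks = [3, 4, 5, 6, 7, 8, 10, 12]
drinks_exam = [100, 94, 88, 85, 82, 78, 75, 71]

print(direction(best_fit_slope(study_time, study_exam)))
print(direction(best_fit_slope(drinks, drinks_exam)))
```

The first call reports a positive slope (more study time, higher score) and the second a negative slope (more drinks, lower score).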
Line of Best Fit
2) Quality of fit
Best fit is not necessarily a good fit (e.g., buying shoes)
Good Fit
Coordinates generally clustered around the line of best fit
Shows the relationship between the 2 variables is systematic
Shows the relationship between the 2 variables is somewhat systematic
Poor Fit
Shows the relationship between the 2 variables is not very systematic
Really Poor Fit
Correlation Coefficient
Sign = indicates type of relationship (positive or negative) (graph)
Number = how strong the relationship is (graph)
When number = +1.0 = perfect fit
When number = .75 to .99 = high fit
Strong relationship between 2 variables (graph)
When number = .30 to .75 = moderate fit
Weak relationship between 2 variables (graph)
When number = 0 = no fit
No relationship between 2 variables (graph)
Magnitude and sign are independent (graph)
Correlation   Positive   Negative
Perfect       +1.00      -1.00
High          +.80       -.80
Moderate      +.50       -.50
Low           +.25       -.25
None          0          0
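A small sketch of reading sign and magnitude separately. The cut-offs follow the ranges given in these notes; two points are assumptions: a value of exactly .75 is treated as high, and anything between 0 and .30 is labelled "low" because the notes give no range for it. The function name describe is hypothetical.

```python
def describe(r):
    """Split a correlation coefficient into (direction, strength)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient lies between -1 and +1")

    # Sign gives the type of relationship
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    # Magnitude gives the strength, regardless of sign
    size = abs(r)
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.75:      # assumption: .75 itself counts as high
        strength = "high"
    elif size >= 0.30:
        strength = "moderate"
    elif size > 0:          # assumption: no range given in the notes
        strength = "low"
    else:
        strength = "none"
    return direction, strength

for value in (1.00, -0.80, 0.50, -0.25, 0.0):
    print(value, describe(value))
```

Note that +.80 and -.80 get the same strength label; only the direction differs.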
Calculating the Correlation Coefficient
Looking for systematic trends:
- Both variables increase together
- Both variables decrease together
- One increases while other decreases
- One decreases while other increases
Let's start by ordering scores
Student   Study time   Exam score
H         3            71
B         4            75
D         5            78
G         6            82
C         7            85
A         8            88
E         10           94
F         12           100
Student   # Drinks   Exam score
H         3          100
B         4          94
D         5          88
G         6          85
C         7          82
A         8          78
E         10         75
F         12         71
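A quick sketch of the ordering step, using the study time scores from these notes. Students are listed alphabetically first (an assumed starting order) and then sorted by study time, which makes the trend in exam scores easy to see.

```python
# (student, study time, exam score), starting in alphabetical order
records = [
    ("A", 8, 88), ("B", 4, 75), ("C", 7, 85), ("D", 5, 78),
    ("E", 10, 94), ("F", 12, 100), ("G", 6, 82), ("H", 3, 71),
]

# Order by the first variable (study time)
ordered = sorted(records, key=lambda rec: rec[1])

for student, hours, score in ordered:
    print(student, hours, score)

# Check whether the second variable rises at every step
scores_rise = all(a[2] < b[2] for a, b in zip(ordered, ordered[1:]))
print("Exam score rises with study time:", scores_rise)
```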
Problem = the 2 variables are measured in different units
Examples: # drinks (pints) vs. score (%)
# hours studied vs. score (%)
Solution = convert all numbers to z-scores
Remember - transformation does NOT
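A minimal sketch of the z-score conversion. It assumes the sample standard deviation (dividing by N - 1); that choice is an assumption, picked because it reproduces the z-scores of 1.34, 0.80 and 0.27 that six evenly spaced scores take in the tables in these notes. The function name z_scores is hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert raw scores to z-scores: (score - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (N - 1); an assumption, see lead-in
    return [(v - m) / s for v in values]

# Two variables measured in different units
drinks_pints = [3, 4, 5, 6, 7, 8, 10, 12]
score_pct = [100, 94, 88, 85, 82, 78, 75, 71]

zx = z_scores(drinks_pints)
zy = z_scores(score_pct)

# After conversion both variables are on the same unit-free scale
for x, y in zip(zx, zy):
    print(f"{x:+.2f}  {y:+.2f}")

# Six evenly spaced scores, for comparison with the tables in these notes
print([round(z, 2) for z in z_scores([1, 2, 3, 4, 5, 6])])
```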
When Both Var. Increase Together
Student   Study time (z)   Exam score (z)   ZxZy Sign
1         -1.34            -1.34            +
2         -0.80            -0.80            +
3         -0.27            -0.27            +
4         +0.27            +0.27            +
5         +0.80            +0.80            +
6         +1.34            +1.34            +
Sum = + so Correlation Coefficient = +
When Both Var. Decrease Together
Student   Study time (z)   Exam score (z)   ZxZy Sign
1         +1.34            +1.34            +
2         +0.80            +0.80            +
3         +0.27            +0.27            +
4         -0.27            -0.27            +
5         -0.80            -0.80            +
6         -1.34            -1.34            +
Sum = + so Correlation Coefficient = +
Student   Study time (z)   Exam score (z)   ZxZy
1         -1.34            -1.34            1.7956
2         -0.80            -0.80            .64
3         -0.27            -0.27            .0729
4         +0.27            +0.27            .0729
5         +0.80            +0.80            .64
6         +1.34            +1.34            1.7956
Student   Study time (z)   Exam score (z)   ZxZy Sign
1         +1.34            -1.34            -
2         +0.80            -0.80            -
3         +0.27            -0.27            -
4         -0.27            +0.27            -
5         -0.80            +0.80            -
6         -1.34            +1.34            -
Sum = - so Correlation Coefficient = -
Student   Study time (z)   Exam score (z)   ZxZy
1         +1.34            -1.34            -1.7956
2         +0.80            -0.80            -.64
3         +0.27            -0.27            -.0729
4         -0.27            +0.27            -.0729
5         -0.80            +0.80            -.64
6         -1.34            +1.34            -1.7956
Computational Formula
(formula)
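The formula is not preserved in these notes, so the following is only a sketch of one standard form that agrees with the z-score tables: add up the ZxZy products and divide by N - 1. Dividing by N - 1 (with z-scores based on the sample standard deviation) is an assumption; the function names are hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """z = (score - mean) / sample standard deviation (assumed N - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """Sum of the ZxZy products divided by N - 1 (assumed form)."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Scores from the tables in these notes
study_time = [3, 4, 5, 6, 7, 8, 10, 12]
study_exam = [71, 75, 78, 82, 85, 88, 94, 100]
drinks = [3, 4, 5, 6, 7, 8, 10, 12]
drinks_exam = [100, 94, 88, 85, 82, 78, 75, 71]

print(f"study time vs. score: {correlation(study_time, study_exam):+.2f}")
print(f"# drinks vs. score:   {correlation(drinks, drinks_exam):+.2f}")

# Perfectly matched z-scores, as in the six-student tables
print(f"{correlation([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]):+.2f}")
print(f"{correlation([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]):+.2f}")
```

With this form, the six-student tables give a coefficient of +1 when the z-scores match and -1 when they are mirror images.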
Try it
Given a high positive relationship between
# drinks and # errors...
Correlation and Variability
Variability = change in # errors
To what extent did change in X account for change in Y?
To what extent did change in # drinks account for change in # errors?
R² = proportion of variability accounted for by X
(graph)
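A short sketch of squaring the coefficient to get the proportion of variability accounted for. The notes give no # errors data, so this uses the # drinks vs. exam score table as a stand-in (an assumption for illustration); pearson_r is a hypothetical helper that works on raw scores.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Correlation coefficient computed from raw-score deviations."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Stand-in data: # drinks vs. exam score from these notes
drinks = [3, 4, 5, 6, 7, 8, 10, 12]
exam_score = [100, 94, 88, 85, 82, 78, 75, 71]

r = pearson_r(drinks, exam_score)
r_squared = r ** 2  # proportion of variability in Y accounted for by X

print(f"r = {r:+.2f}")
print(f"r squared = {r_squared:.2f}")
```

Here r is negative but r squared is positive: the sign says which way the relationship runs, while the squared value says how much of the change in score goes along with the change in # drinks.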
2 variables MUST be related if change in one causes the change in another
Correlation is not necessarily causation
2) X causes Y
3) Y causes X
4) Third variable