Is Pearson’s product–moment correlation the best way to test reliability between a test and its retest scores?

Question:

Is Pearson’s product–moment correlation the best way to test reliability between a test and its retest scores? Maybe not, say Miaofen Yen, PhD, RN, and Li-Hua Lo, PhD, RN, both professors at National Cheng Kung University in Tainan, Taiwan. In statistics, the authors explain, the reliability of a measure means “the proportion of the observed score variance due to the true scores, that is, the ratio of true score variance to total score variance.” Statisticians typically use the Pearson correlation to calculate the reliability of results from test to retest, especially in nursing research, the authors’ own field. But research shows that this method has three limitations.

First, Pearson’s calculation is designed to show the relationship between two different variables, so it is inappropriate to apply it to two sets of scores on the same variable. Second, when more than two tests are employed, it is hard to discern variation from test to test; when one concept is measured three times, generating three scores, one cannot compute a single correlation coefficient for all three scores at the same time. Third, Pearson’s correlation cannot detect systematic errors (e.g., a miscalibrated measurement device that consistently reads 10 pounds heavier): test and retest scores may still be “perfectly correlated,” as Yen and Lo say.
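The third limitation can be illustrated with a minimal sketch. The weights below are hypothetical, and `pearson_r` is a helper written only for this illustration; neither comes from the authors' study. The retest readings are simply the test readings plus a constant 10 pounds, mimicking the miscalibrated device.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical weights in pounds (illustrative values only).
test = [120.0, 135.0, 150.0, 165.0, 180.0]
# The retest device reads 10 pounds heavy for every subject.
retest = [w + 10.0 for w in test]

r = pearson_r(test, retest)
mean_shift = sum(b - a for a, b in zip(test, retest)) / len(test)

print(r)           # 1.0: the scores look perfectly reliable
print(mean_shift)  # 10.0: yet every retest reading is off by 10 pounds
```

Because the correlation is computed from deviations around each set's own mean, the constant shift cancels out and leaves no trace in the coefficient.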

An alternative approach called intraclass correlation (ICC), also known as the generalizing coefficient, addresses these three limitations. Three issues need to be borne in mind when using ICC. First, the study design should focus on reliability, not correlation. Second, the correct statistical model must be selected, either a one-way or a two-way random model, depending on study conditions. Third, the number of measures in the study must be carefully considered.
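To make the model choice concrete, here is a rough sketch of how single-measure ICCs might be computed from ANOVA mean squares. It assumes the standard textbook forms often labeled ICC(1,1) for the one-way random model and ICC(2,1) for the two-way random model; the source does not say which formulas the authors used. The function name `icc_single` and the scores are hypothetical.

```python
def icc_single(scores):
    """Return (one-way, two-way random) single-measure ICCs.

    scores: list of rows, one row per subject, one column per occasion.
    Assumes the common ANOVA-based ICC(1,1) and ICC(2,1) formulas.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of measurement occasions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares: total, between subjects, between occasions.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)

    # Mean squares.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_within = (ss_total - ss_rows) / (n * (k - 1))
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    one_way = (ms_rows - ms_within) / (ms_rows + (k - 1) * ms_within)
    two_way = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return one_way, two_way


# Hypothetical test and retest weights; the retest reads 10 pounds heavy.
scores = [
    [120.0, 130.0],
    [135.0, 145.0],
    [150.0, 160.0],
    [165.0, 175.0],
    [180.0, 190.0],
]
one_way, two_way = icc_single(scores)
print(round(one_way, 4), round(two_way, 4))  # 0.9149 0.9184
```

In this made-up data set the Pearson correlation between the two columns would be exactly 1, while both ICC values fall below 1 because the constant 10-pound shift counts as disagreement between test and retest.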

Miaofen Yen and Li-Hua Lo demonstrated the strength of ICC in a study of competence for breast self-examination. The study looked at perceived competence and perceived barriers to the self-exam. The researchers engaged 10 nurses to complete the questionnaire twice over a two-week period, polling them on 20 questions, each with a 5-point scale. They then used ICC to assess test–retest reliability. The calculations produced two ICC coefficients: a single-measure ICC (0.640) and an average-measure ICC (0.781). The researchers found the first result most applicable because, in practical terms, they would give the test only once.
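The two reported coefficients can be related with a short calculation. The sketch below assumes the average-measure ICC follows from the single-measure ICC through the Spearman-Brown formula with k = 2 administrations; the source does not state this formula, and `average_measure_icc` is a hypothetical helper name.

```python
def average_measure_icc(single_icc, k):
    """Spearman-Brown step-up: reliability of the mean of k measurements."""
    return k * single_icc / (1 + (k - 1) * single_icc)


single = 0.640  # single-measure ICC reported in the study
average = average_measure_icc(single, 2)  # test plus retest, so k = 2
print(f"{average:.4f}")  # 0.7805
```

The result, about 0.780, is close to the reported 0.781; the small gap is consistent with rounding in the published single-measure value. The average-measure figure describes the reliability of the mean of both administrations, which is why the single-measure figure is the relevant one when the test will be given only once.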

Questions

1. Find several published examples of test–retest reliability studies done with Pearson’s product–moment correlation and evaluate whether ICC would have given better results.

2. Why is Pearson’s correlation unable to accurately reflect systematic errors, and how does ICC better accommodate this function?

Related Book

Marketing Research

ISBN: 9781118808849

10th Edition

Authors: Carl McDaniel Jr, Roger Gates
