In many research methods or statistics courses you come across the idea that correlated errors signal a third variable. In other words, a relevant variable is missing from your model, and its absence induces correlation among your residuals. That’s a tough idea to wrap your head around, but it gets easier with a concrete example: cheating on exams. This post builds intuition for “correlated errors with respect to missing third variables” in the context of college exams and cheating.

The Exam Structure

First, let’s get a feel for the exams. I’m going to use a lot of images in this post, so it helps to walk through the basics of each plot. Imagine students taking a multiple-choice test where they fill in one of five responses, “A,” “B,” “C,” “D,” or “E,” for each question. The correct answer for question one is “C.”

In the plot, the x-axis shows the response options a student can select for each question, and the y-axis shows the question number (there is only one question so far).

The correct answer for question two is also “C,” and that pattern continues for the rest of the questions on this 5-item test.

In other words, imagine a 5-question exam where the correct answer for each question is “C.” With the basic images in play, we can think about how students might respond.

No Cheating – What is the Pattern of Errors across Questions?

First, consider an exam where students do not cheat. If nobody cheats, then everyone’s errors will be dispersed about the correct option for each question, “C.” Some people incorrectly select “A,” others incorrectly select “E,” and still others incorrectly select “B.” Here is a plot that retains the purple crosses that mark the correct option, “C,” but also includes student responses from Susie, Peter, and John.

For example, on question one Susie selects “A,” Peter selects “E,” and John selects “D,” meaning that none of the students get the answer correct. Every student, though, marks the correct response (C) for question 4.

There is no pattern in this plot. The green triangles, red circles, and blue-green squares are scattered randomly around the purple crosses that mark the correct answers. John gets some questions wrong, Susie gets some questions wrong, and Peter gets some questions wrong, but whether John incorrectly marks “A” or “E” doesn’t tell us anything about whether Peter incorrectly marks “A” or “E.” They are all wrong in a random way.
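To make “wrong in a random way” concrete, here is a minimal simulation sketch. It rests on assumptions that are mine rather than part of the exam described above: each student picks among the five options uniformly at random, independently of everyone else, and the exam is stretched to 10,000 questions so the correlation estimate is stable. The helper names (`signed_error`, `simulate_student`, `pearson`) are made up for illustration. An error is coded as the signed distance from “C,” so “A” is -2 and “E” is +2.

```python
import random

OPTIONS = ["A", "B", "C", "D", "E"]
CORRECT = "C"  # every question on this exam has the same correct answer


def signed_error(response):
    """Signed distance from the correct option: A -> -2, C -> 0, E -> +2."""
    return OPTIONS.index(response) - OPTIONS.index(CORRECT)


def simulate_student(n_questions, rng):
    """Assumption: a student answers each question uniformly at random,
    without looking at anyone else's answer sheet (no cheating)."""
    return [rng.choice(OPTIONS) for _ in range(n_questions)]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable
n_questions = 10_000     # hypothetical long exam, not the 5-item one

john_errors = [signed_error(r) for r in simulate_student(n_questions, rng)]
peter_errors = [signed_error(r) for r in simulate_student(n_questions, rng)]

r = pearson(john_errors, peter_errors)
print(f"Correlation between John's and Peter's errors: {r:.3f}")
```

Because the two students answer independently, the printed correlation should sit close to zero: knowing which way John missed says nothing about which way Peter missed.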

Bo$$^2$$m =)