Measures of Interobserver Agreement

Measures of Interobserver Agreement: What Every Copy Editor Needs to Know

As a copy editor, you know that accuracy and consistency are paramount when it comes to producing high-quality content. However, you may not be aware of the importance of interobserver agreement and the measures used to determine it. Interobserver agreement is the degree to which two or more observers or evaluators agree on their observations or assessments of a particular subject. Here’s what you need to know about measures of interobserver agreement.

Why Is Interobserver Agreement Important?

Interobserver agreement is essential in situations where multiple observers or evaluators are involved in making assessments or judgments, such as in scientific research, medical diagnosis, or even the editing of a manuscript. If there is low interobserver agreement, it means that the observers or evaluators are not consistent in their assessments, which can lead to errors, inaccurate conclusions, and a lack of trust in the results.

Measures of Interobserver Agreement

There are different measures of interobserver agreement, each with its own strengths and weaknesses. Here are three commonly used measures:

1. Percentage Agreement

Percentage agreement is the simplest measure of interobserver agreement, and also the crudest. It is the percentage of items on which the observers or evaluators give the same assessment. For example, if two copy editors edit the same manuscript and agree on 90% of their suggested changes, the percentage agreement is 90%. However, percentage agreement can be misleading because it does not correct for the agreement that would occur by chance alone, so it tends to overstate how consistent the observers really are. The problem is worst when one assessment is far more common than the others.
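The calculation can be sketched in a few lines of Python. This is only an illustration: the function name percent_agreement and the two editors' decisions are made up for the example.

```python
def percent_agreement(ratings_a, ratings_b):
    """Return the fraction of items on which two raters gave the same label."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


# Hypothetical data: two editors decide whether each of 10 sentences needs a change.
editor_a = ["change", "keep", "keep", "change", "keep",
            "keep", "keep", "change", "keep", "keep"]
editor_b = ["change", "keep", "keep", "change", "keep",
            "keep", "keep", "change", "keep", "change"]

print(percent_agreement(editor_a, editor_b))  # 0.9 (they differ only on the last sentence)
```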

2. Cohen’s Kappa

Cohen’s kappa is a statistical measure that takes into account chance agreement and adjusts for it. It calculates the degree of agreement between observers or evaluators beyond what would be expected by chance alone. Cohen’s kappa ranges from -1 to 1, with 1 indicating perfect agreement, 0 indicating agreement no better than chance, and negative values indicating less agreement than chance would produce. Cohen’s kappa applies to exactly two evaluators or observers who assign items to categories. When there are more than two, an extension such as Fleiss’ kappa is used instead.
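Kappa is computed as (observed agreement - expected agreement) / (1 - expected agreement), where the expected agreement comes from how often each rater uses each category. The following is a minimal sketch for two raters; the function name cohens_kappa and the data are hypothetical and chosen only to show how chance correction changes the picture.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)

    # Observed agreement: share of items with identical labels.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected chance agreement: for each category, multiply the two raters'
    # proportions of using it, then sum over all categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

    if p_expected == 1:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: both editors say "keep" most of the time.
editor_a = ["keep"] * 8 + ["change", "change"]
editor_b = ["keep"] * 7 + ["change", "change", "keep"]

# The editors agree on 8 of 10 items (80%), but because "keep" dominates,
# 68% agreement is expected by chance, so kappa is much lower.
print(round(cohens_kappa(editor_a, editor_b), 3))  # 0.375
```

In this example an 80% agreement rate shrinks to a kappa of about 0.375 once chance agreement is removed.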

3. Intraclass Correlation Coefficient (ICC)

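ICC is another statistical measure used for assessing interobserver agreement. It compares how much ratings vary between the subjects being rated with how much they vary between observers rating the same subject. ICC is intended for continuous or ordinal ratings, such as scores on a numeric scale; for categorical judgments, kappa is the usual choice. Values usually fall between 0 and 1, with 1 indicating perfect agreement and values near 0 indicating no agreement, although estimates can occasionally come out negative. ICC exists in several forms, so a reported value should always state which form was used.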
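As a rough illustration, here is a sketch of one specific form, the one-way random-effects ICC for single ratings, often written ICC(1,1). It is computed as (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB is the mean square between subjects, MSW is the mean square within subjects, and k is the number of raters. Choosing this form is an assumption made for the example; other forms give different values. The function name icc_one_way and the scores are hypothetical.

```python
def icc_one_way(scores):
    """One-way random-effects ICC for single ratings, ICC(1,1).

    scores: list of rows, one row per subject, one column per rater.
    """
    n = len(scores)
    k = len(scores[0]) if scores else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 raters, with no missing scores")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]

    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2 for row, m in zip(scores, row_means) for x in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


# Hypothetical data: two editors score five manuscripts from 1 to 10.
scores = [[8, 7], [5, 6], [9, 9], [3, 4], [6, 6]]

print(round(icc_one_way(scores), 3))  # 0.933
```

The editors never differ by more than one point while the manuscripts themselves vary widely, which is why the value is close to 1.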


It’s important for copy editors to be aware of interobserver agreement and the measures used to assess it. Chance-corrected measures such as Cohen’s kappa or ICC do not make the work consistent by themselves, but they show how consistent it actually is and where editors need to align their judgments. Remember that interobserver agreement is not just about producing reliable results, but also about building trust in the assessment process. When readers trust that assessments have been made consistently and accurately, they are more likely to trust the content itself.