
Questions tagged [agreement-statistics]

Agreement is the degree to which two raters, instruments, etc., give the same value when applied to the same object. Special statistical methods have been designed for this task.

0 votes
0 answers
9 views

Per-Item inter rater reliability with multiple values and raters

I am trying to find the best statistical calculation to measure the agreement between 72 different raters on one item. My goal is to convey in a statistic how spread out the raters are in their ratings and ...
asked by Jaromando
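With a single item and many raters, the usual chance-corrected coefficients are awkward to apply, so one option is to summarize the spread of the ratings directly. A minimal sketch in Python, assuming ordinal ratings on a 1-5 scale; the `ratings` array is an illustrative placeholder, not data from the question:

```python
# Simple dispersion summaries for one item rated by many raters.
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=72)  # placeholder: 72 ratings on a 1-5 scale

spread_sd = ratings.std(ddof=1)                              # sample standard deviation
iqr = np.percentile(ratings, 75) - np.percentile(ratings, 25)
modal_share = np.bincount(ratings).max() / len(ratings)      # share picking the modal category

print(f"SD={spread_sd:.2f}, IQR={iqr:.1f}, modal agreement={modal_share:.2%}")
```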
0 votes
0 answers
30 views

Is it possible to use unweighted Kappa for some questions and weighted for others to measure interrater agreement in the same questionnaire?

I intend to conduct an interrater agreement analysis on a questionnaire about student essays that contains binary items, mainly yes-no questions, and ordinal variables, primarily a Likert agreement ...
asked by Christopher Michael Turner Mue
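Computationally, mixing the two variants is straightforward: unweighted Cohen's kappa for the binary items and a weighted kappa for the ordinal (Likert) items. A minimal sketch with scikit-learn; the rating vectors are illustrative placeholders, and whether mixing coefficients within one questionnaire is defensible is the actual statistical question, not settled here:

```python
# Unweighted kappa for a binary item, weighted kappa for an ordinal item.
from sklearn.metrics import cohen_kappa_score

rater1_binary = [1, 0, 1, 1, 0, 1]
rater2_binary = [1, 0, 0, 1, 0, 1]
kappa_binary = cohen_kappa_score(rater1_binary, rater2_binary)  # unweighted

rater1_likert = [1, 2, 4, 5, 3, 4]
rater2_likert = [2, 2, 5, 5, 3, 3]
kappa_ordinal = cohen_kappa_score(rater1_likert, rater2_likert, weights="quadratic")

print(kappa_binary, kappa_ordinal)
```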
2 votes
0 answers
14 views

What is the best metric to use to discard annotators with low IAA (inter-annotator agreement) with all others?

This question is specific to ordinal data collected on the Likert scale. What is the best metric to discard annotators with low inter-annotator agreement (IAA) with the others, from, e.g., Cohen's Kappa, ...
asked by user2160809
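A common screening device for this kind of problem is each annotator's average pairwise agreement with every other annotator. A minimal sketch, assuming a complete annotators-by-items matrix of ordinal labels (random placeholder data here) and using quadratically weighted Cohen's kappa as the pairwise metric; other metrics could be substituted:

```python
# Flag annotators whose average pairwise agreement with all others is low.
import numpy as np
from itertools import combinations
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(1)
ratings = rng.integers(1, 6, size=(5, 40))  # placeholder: 5 annotators, 40 items

n = ratings.shape[0]
pairwise = np.zeros((n, n))
for i, j in combinations(range(n), 2):
    k = cohen_kappa_score(ratings[i], ratings[j], weights="quadratic")
    pairwise[i, j] = pairwise[j, i] = k

# Mean agreement of each annotator with everyone else (diagonal is zero and excluded).
mean_with_others = pairwise.sum(axis=1) / (n - 1)
print(mean_with_others)
```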
0 votes
0 answers
11 views

How is Krippendorff's alpha defined when expected disagreement is 0?

I'm wondering what the definition of Krippendorff's alpha statistic is when the expected disagreement is 0. The general form of Krippendorff's alpha is $\alpha = 1 - \frac{D_{o}}{D_{e}}$. I'm ...
asked by Vera Bernhard
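For context, alpha is usually computed from a raters-by-units reliability matrix. A minimal sketch using the third-party krippendorff package (assumed installed); how any given implementation reports the degenerate case $D_{e} = 0$ should be checked against its documentation rather than taken from this sketch:

```python
# Krippendorff's alpha from a raters x units matrix, np.nan marking missing ratings.
import numpy as np
import krippendorff

reliability_data = np.array([
    [1, 2, 3, 3, 2, np.nan],
    [1, 2, 3, 3, 2, 4],
    [np.nan, 3, 3, 3, 2, 4],
], dtype=float)

alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="ordinal")
print(alpha)

# If every unit received identical values, D_e = 0 and the ratio D_o / D_e is
# undefined; the convention used in that edge case is implementation-specific.
```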
1 vote
0 answers
63 views

Calculate inter-rater noise using Kahneman's (2021) approach

I need help calculating signal and noise based on the method described by Kahneman et al. (2021) in their book "Noise." They provide a technique for quantifying noise between raters ...
asked by Magnus Nordmo
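As background, the level-noise / pattern-noise split discussed in that literature can be illustrated with a generic two-way decomposition of a balanced judges-by-cases matrix. The sketch below is a generic variance decomposition with random placeholder data, not necessarily the book's exact procedure:

```python
# Decompose rating variability into "level" and "pattern" components
# for a balanced judges x cases matrix.
import numpy as np

rng = np.random.default_rng(2)
ratings = rng.normal(50, 10, size=(8, 20))  # placeholder: 8 judges, 20 cases

case_means = ratings.mean(axis=0, keepdims=True)   # per-case average ("signal")
judge_means = ratings.mean(axis=1, keepdims=True)  # per-judge average
grand_mean = ratings.mean()

# System noise: average squared spread of judgments around each case mean.
system_noise_sq = ((ratings - case_means) ** 2).mean()
# Level noise: spread of the judges' overall averages.
level_noise_sq = ((judge_means - grand_mean) ** 2).mean()
# Pattern noise: judge-by-case residual; with these definitions and balanced data,
# system_noise_sq == level_noise_sq + pattern_noise_sq holds exactly.
pattern_noise_sq = ((ratings - case_means - judge_means + grand_mean) ** 2).mean()

print(np.sqrt(system_noise_sq), np.sqrt(level_noise_sq), np.sqrt(pattern_noise_sq))
print(np.isclose(system_noise_sq, level_noise_sq + pattern_noise_sq))
```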
3 votes
1 answer
67 views

How to improve testing method B if a Bland-Altman analysis shows a near-perfect correlation between the difference and the mean of methods A and B?

I have two sets of measurements (two different methods), A and B. Correlation between the measurements is only modest. Therefore, I wondered whether there is some inherent bias in one (or both) of ...
asked by pishcotec_a
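For reference, the quantities involved here (bias, limits of agreement, and the proportional-bias check implied by a difference-vs-mean correlation) can be computed directly. A minimal sketch with placeholder data; `method_a` and `method_b` are illustrative paired measurements:

```python
# Bland-Altman differences, bias, 95% limits of agreement, and a proportional-bias check.
import numpy as np

rng = np.random.default_rng(3)
method_a = rng.normal(100, 15, size=60)
method_b = method_a + rng.normal(2, 5, size=60)  # placeholder offset + noise

means = (method_a + method_b) / 2
diffs = method_a - method_b
bias = diffs.mean()
loa = 1.96 * diffs.std(ddof=1)

print(f"bias={bias:.2f}, limits of agreement=({bias - loa:.2f}, {bias + loa:.2f})")

# A strong correlation between diffs and means suggests proportional bias,
# which can be quantified by regressing the differences on the means.
slope, intercept = np.polyfit(means, diffs, 1)
print(f"proportional bias slope={slope:.3f}")
```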
1 vote
0 answers
20 views

How to calculate an ICC for test-retest with more than one rater

In my data, I have two raters who rated multiple scores for a test-retest analysis. The method fulfills the requirements for test-retest (a short period between the two visits). I would like to calculate ...
asked by BPeif
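One way such an analysis is often set up is to compute a test-retest ICC separately for each rater, with visit playing the role of the "raters" dimension. A minimal sketch using the pingouin package (assumed installed); the column names and values are illustrative, and the appropriate ICC form depends on the design:

```python
# Test-retest ICC per rater, treating visit as the "raters" dimension.
import pandas as pd
import pingouin as pg

# Hypothetical long format: one row per subject x rater x visit.
df = pd.DataFrame({
    "subject": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5],
    "rater":   ["A", "A", "B", "B"] * 5,
    "visit":   [1, 2, 1, 2] * 5,
    "score":   [10, 11, 9, 10, 14, 15, 13, 14, 8, 8, 7, 9,
                12, 13, 12, 12, 16, 15, 15, 16],
})

for rater, sub in df.groupby("rater"):
    icc = pg.intraclass_corr(data=sub, targets="subject",
                             raters="visit", ratings="score")
    print(rater)
    print(icc[icc["Type"].isin(["ICC2", "ICC3"])])  # common test-retest choices
```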
0 votes
0 answers
13 views

Is it possible to calculate inter-rater reliability for one item rated by multiple raters with weights?

I have a survey with a number of statements that had participants categorize the statements into one of the 4 options they were provided. The participants were then asked to rate the confidence of ...
asked by Jay Jakka
2 votes
0 answers
83 views

Comparing human estimates and algorithm on normalized score

By means of an algorithm, we want to predict a metric value $Y$ that can vary for different items $i=1,\ldots,n$. The quantities $y_i$ have been estimated by $N$ human annotators and we want to ...
asked by cdalitz
2 votes
1 answer
63 views

Method comparison/agreement: is Bland-Altman or equivalence of the mean best?

I am interested in the appropriate technique for assessing the agreement of the paired values of measurements made by 2 measuring devices, an equivalence test of the mean of the difference in the ...
asked by user3156942
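For reference, the equivalence-test option mentioned here is typically a paired TOST on the differences, with an equivalence margin chosen on subject-matter grounds. A minimal sketch with placeholder data and an assumed margin; the margin `delta` is illustrative, not a recommendation:

```python
# Two one-sided tests (TOST) on paired differences as an equivalence test of the mean difference.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
device1 = rng.normal(50, 8, size=40)
device2 = device1 + rng.normal(0.5, 2, size=40)  # placeholder paired data

diffs = device1 - device2
delta = 2.0  # equivalence margin in measurement units (assumed)

# H0: mean difference <= -delta  vs  H1: mean difference > -delta
p_lower = stats.ttest_1samp(diffs, -delta, alternative="greater").pvalue
# H0: mean difference >= +delta  vs  H1: mean difference < +delta
p_upper = stats.ttest_1samp(diffs, delta, alternative="less").pvalue

p_tost = max(p_lower, p_upper)
print(f"TOST p-value: {p_tost:.4f} (equivalent at 5% level: {p_tost < 0.05})")
```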
0 votes
0 answers
22 views

How should I approach calculating sample size for number of raters required for a study?

I am proposing a study for evaluating reports generated by 3 human experts vs 3 different LLMs on a set of 10 situations. Basically, we're trying to determine whether the human experts are better or if the LLMs ...
asked by user2615936
0 votes
0 answers
20 views

How to format data for one-way intraclass correlation coefficient?

I'm testing inter-rater reliability of an instrument on a sample of subjects from several sites. Each subject is assessed by two raters, who differ between sites. According to this paper, I should ...
asked by Charlie
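A common layout for a one-way (ICC1) analysis is long format with an arbitrary rater slot, since rater identity is not modeled when the raters differ across sites. A minimal sketch using pingouin (assumed installed); column names and values are illustrative:

```python
# Long-format layout for a one-way random-effects ICC (ICC1).
import pandas as pd
import pingouin as pg

df = pd.DataFrame({
    "subject":    [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "rater_slot": [1, 2] * 6,          # arbitrary slot, not a specific rater
    "score":      [12, 14, 20, 19, 15, 17, 9, 10, 18, 18, 13, 15],
})

icc = pg.intraclass_corr(data=df, targets="subject",
                         raters="rater_slot", ratings="score")
print(icc[icc["Type"] == "ICC1"])   # one-way random effects, single measurement
```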
2 votes
0 answers
40 views

Correct approach for evaluating correlation between multiple measurement devices used on the same subject (repeated measures)

My scenario is as follows. I am using three different devices (TempDeviceA, TempDeviceB, and TempDeviceC) to take the temperature of each animal in a group every day over several weeks. TempDeviceC is ...
asked by SoManyQuestions
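One option for a pair of devices measured repeatedly on the same animals is a repeated-measures correlation, which avoids treating the repeated days as independent observations. A minimal sketch using pingouin's rm_corr with simulated placeholder data; this addresses correlation, while agreement would additionally call for something like per-pair Bland-Altman limits:

```python
# Repeated-measures correlation between two devices, with animal as the subject factor.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(5)
animals, days = 10, 14
base = rng.normal(38.5, 0.4, size=(animals, days))  # placeholder "true" temperatures

df = pd.DataFrame({
    "animal": np.repeat(np.arange(animals), days),
    "TempDeviceA": (base + rng.normal(0.1, 0.2, size=(animals, days))).ravel(),
    "TempDeviceC": (base + rng.normal(0.0, 0.1, size=(animals, days))).ravel(),
})

print(pg.rm_corr(data=df, x="TempDeviceA", y="TempDeviceC", subject="animal"))
```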
3 votes
1 answer
49 views

Proof of lower bound of Cohen's kappa

Can anyone refer me to a paper or book showing that Cohen's kappa is bounded below by $-1$? I've read various papers stating this, but I have never found a complete proof. This question was asked here,...
asked by nahp
1 vote
0 answers
28 views

Extending paired t-test to compare agreement between three or more analytical instruments over time

I have three instruments that measure the concentration of dust in the air, and I want to test whether the mean difference between their measurements is zero (i.e., that they give the same measurement). ...
asked by fieldofsheep
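One possible extension is a one-way repeated-measures ANOVA with instrument as the within factor and each sampling occasion as the "subject", testing whether the instruments' mean readings differ. A minimal sketch with statsmodels' AnovaRM and simulated placeholder data; this is one option among several (a mixed-effects model is another):

```python
# Repeated-measures ANOVA: does mean dust concentration differ between instruments?
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(6)
occasions = 30
true = rng.lognormal(mean=2.0, sigma=0.3, size=occasions)  # placeholder true concentrations

df = pd.DataFrame({
    "occasion":   np.repeat(np.arange(occasions), 3),
    "instrument": ["I1", "I2", "I3"] * occasions,
    "dust":       np.column_stack([
        true + rng.normal(0.0, 0.5, occasions),
        true + rng.normal(0.3, 0.5, occasions),
        true + rng.normal(-0.2, 0.5, occasions),
    ]).ravel(),
})

res = AnovaRM(data=df, depvar="dust", subject="occasion",
              within=["instrument"]).fit()
print(res)
```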
