
I am trying to implement an adaptive test using the 3PL IRT model. We need to screen candidates and label their expertise as beginner, intermediate, or expert.

We also need a percentage score for each examinee.

When the examinee is given a sufficient number of items, the initial estimate of ability should not have a major effect on the final estimate of ability.

The final estimate of ability is the last estimated theta value before the stopping rule is met. It can take values in the range -3 to +3.

Q: How do I convert this to a percentage score so that the end user can know their performance?

Q: How do I determine the cut-score for each level of expertise? What do I need to take into consideration when calculating this?


1 Answer


Theoretically in IRT, $\theta$ and $b$ (which are measured on the same scale) have no lower or upper bounds; they can go from $-\infty$ to $\infty$. Sometimes a test is unable to capture the full extent of an examinee's knowledge because, for example, it doesn't have enough items at the lower or higher ends of the difficulty scale. In any case, converting $\theta$ values to percentages is not a straightforward task.

One thing you can do (and I know of a national test in my country that does something similar) is to transform your data using a desired mean and standard deviation. For example, you can shift your mean from 0 to 500 by adding 500 to all examinees' $\theta$ values and changing your scale accordingly. You can then present their scores as if they went from 0 to 1000 (which you know is not theoretically true; it just means your scale is centered at 500).
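As a minimal sketch of that kind of rescaling (in Python), assuming your CAT already reports $\theta$ on a roughly standard-normal scale; the $\theta$ values, target mean of 500, and target standard deviation of 100 are made-up placeholders:

```python
import numpy as np

# Hypothetical final theta estimates from the adaptive test (one per examinee).
thetas = np.array([-2.1, -0.4, 0.0, 0.9, 2.5])

# Target reporting scale: centered at 500, with a spread chosen so that the
# plausible theta range (roughly -3 to +3) stays inside 0..1000.
target_mean = 500.0
target_sd = 100.0  # theta in [-3, 3] then maps to scores in [200, 800]

scores = target_mean + target_sd * thetas  # assumes theta is already ~ N(0, 1)
scores = np.clip(scores, 0, 1000)          # keep reports within the published bounds

print(np.round(scores))  # [290. 460. 500. 590. 750.]
```

If your estimated $\theta$ values are not already standardized, subtract their mean and divide by their standard deviation before applying the target mean and standard deviation.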

Ultimately, you should have some kind of interpretation of your difficulty scale, obtained by analyzing the items in the item bank and trying to separate them into different categories (which is what you ask in your second question). You can then use these categories to report your students' progress. This analysis can be done by specialists in the topic of your test. They may use anchor items, for example (items that are very representative of the content students should know), and not be fully guided by statistics.

Once you have these categories, students who are at the lowest level or below it might be considered to know 0% of what you intended them to know, and students who are at the highest level the test can measure may be considered to know 100% of what you intended them to know at that point in their education. The other students can be placed in between.
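To make that concrete, here is one possible sketch (in Python) of reporting both the expertise label and a percentage; the cut scores, the 0% floor, and the 100% ceiling below are illustrative placeholders, and in practice they would come from the specialist-driven standard setting described above:

```python
# Hypothetical theta cut scores; real values should come from content
# specialists (e.g. using anchor items), not from this example.
cuts = {"beginner": float("-inf"), "intermediate": -0.5, "expert": 1.0}

# Theta values treated as "knows 0%" and "knows 100%" of the intended content.
theta_floor, theta_ceiling = -2.0, 2.0

def report(theta):
    # Expertise label: the highest category whose cut score theta reaches.
    label = max((name for name, cut in cuts.items() if theta >= cut),
                key=lambda name: cuts[name])
    # Percentage: linear interpolation between floor and ceiling, clipped to [0, 100].
    fraction = (theta - theta_floor) / (theta_ceiling - theta_floor)
    pct = round(100 * min(max(fraction, 0.0), 1.0))
    return label, pct

print(report(-2.3))  # ('beginner', 0)
print(report(0.2))   # ('intermediate', 55)
print(report(1.8))   # ('expert', 95)
```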

  • The link given in the answer says how to transform data to a desired mean and standard deviation. This makes sense, but to do that we need the mean and the standard deviation of the current scale. Since the theta values can be anything and depend on the item parameters and the previous theta, how will I calculate the mean and standard deviation of the original scale? Will I have to simulate all possible response vectors, or assume that the distribution is standard normal with mean 0 and standard deviation 1?
    – Indra
    Commented Jan 25, 2019 at 10:31
  • You can subtract $\mu_{\theta}$ from all your $\theta$ values and divide by $\sigma_{\theta}$, then multiply by a new $\sigma_1$ and add a new $\mu_1$ of your choosing to get a new scale for all $\theta$. In my example, $\mu_1 = 500$, and $\sigma_1$ is a value you could experiment with to keep your maximum and minimum $\theta$ values within the desired bounds.
    Commented Jan 26, 2019 at 4:13
