Hi, I am working on an educational assessment item calibration task.
I am new to IRT; I have read hundreds of papers but still cannot wrap my head around the invariance property of IRT and item calibration.
My question: suppose I have response data from 20,000 students on 10 items, and I want to estimate the item difficulties for those items.
To do so I have two options: either use CTT to calculate the p-values (which serve as the difficulties) or fit an IRT model to estimate the item difficulties.
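For concreteness, here is a minimal sketch of what I mean by the two options. The data are simulated as a stand-in for my real responses, and the Rasch/1PL marginal-ML fit is only an illustration of the kind of model I fit, not my exact code:

```python
import numpy as np
from scipy.special import expit
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated stand-in for my data: 20,000 students x 10 items, scored 0/1.
n_students, n_items = 20_000, 10
theta = rng.normal(0, 1, n_students)          # latent abilities
b_true = np.linspace(-1.5, 1.5, n_items)      # "true" Rasch difficulties
responses = (rng.random((n_students, n_items))
             < expit(theta[:, None] - b_true[None, :])).astype(int)

# Option 1: CTT difficulty = proportion correct (p-value) per item.
p_values = responses.mean(axis=0)

# Option 2: Rasch (1PL) difficulties via marginal ML, integrating the
# ability over a fixed N(0, 1) distribution with Gauss-Hermite quadrature.
nodes, weights = np.polynomial.hermite_e.hermegauss(21)
weights = weights / weights.sum()             # normalise to N(0, 1) weights

def neg_loglik(b):
    p = expit(nodes[:, None] - b[None, :])                     # (Q, I)
    # marginal likelihood of each response pattern over the ability grid
    lik = np.exp(responses @ np.log(p).T + (1 - responses) @ np.log(1 - p).T)
    return -np.log(lik @ weights).sum()

b_hat = minimize(neg_loglik, np.zeros(n_items), method="BFGS").x

print("CTT p-values  :", np.round(p_values, 2))
print("IRT difficulty:", np.round(b_hat, 2))
```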
I tried both ways, and now I am not sure about:
1. How can IRT claim that item difficulty is not sample dependent? I took a sub-sample of only the high-scoring students out of the 20,000 and re-estimated the IRT parameters, and found that the item difficulties decreased, and vice versa for a sample of low-scoring students (the subsampling is sketched below, after this list).
2. Doesn't that make IRT sample dependent?
3. If IRT is sample dependent, then how is CTT different from IRT?
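This is roughly the subsampling from point 1, continuing the snippet above (the median split on total score is a hypothetical cut point, not necessarily the one I used):

```python
# Split on observed total score (hypothetical cut at the median).
total = responses.sum(axis=1)
high = responses[total >= np.median(total)]
low  = responses[total <  np.median(total)]

def fit_rasch(data):
    # Same marginal ML fit as above; note the ability distribution is
    # still assumed to be N(0, 1) *within each subsample*, so the
    # difficulty scale is re-centred on whatever group is analysed.
    def nll(b):
        p = expit(nodes[:, None] - b[None, :])
        lik = np.exp(data @ np.log(p).T + (1 - data) @ np.log(1 - p).T)
        return -np.log(lik @ weights).sum()
    return minimize(nll, np.zeros(data.shape[1]), method="BFGS").x

print("full sample :", np.round(b_hat, 2))
print("high scorers:", np.round(fit_rasch(high), 2))  # difficulties come out lower
print("low scorers :", np.round(fit_rasch(low), 2))   # difficulties come out higher
```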