5. Test Item Analysis (TITAN)
Using Test Item Analysis Reports to inform paper development
It is common for item writers to refer to reports of how similar items performed in previous assessments or examinations in order to decide whether to reuse an item, develop one of a similar nature, or discard the item altogether.
The test item analysis (TITAN) report provides information on how an item performed in a particular assessment, not on what is good or bad about the item. What is not good about an item is determined through professional judgement based on actual inspection of the item itself. Item analysis statistics are flags; they are not decision makers.
A number of statistics are reported in an item analysis report. Under classical test theory (CTT), item analysis reports include the following (a minimal computation sketch is given after this list):
- Facility value (FV, or p-value) - a number between 0 and 1; a measure of how easy or difficult an item is, roughly the proportion of candidates who answer the item correctly. The higher the p-value, the easier the item. FV > 0.9 suggests the item is too easy; FV < 0.25 suggests it is too difficult. In norm-referenced tests a wide range of FVs is helpful, so values would range from 0.25 to 0.9; in criterion-referenced tests FVs between 0.6 and 0.8 are helpful.
- Discrimination index (DI) - a number between -1 and 1; a measure of how well the item differentiates between students who 'know' and those who do not. The acceptable range is 0.20 to 1.00, with 0.4 to 0.7 being very good values. Items with good discrimination indices improve the assessment's ability to discriminate between participants of different ability levels. Item discrimination is influenced by the FV, so expect lower DI values on very hard (very low FV) or very easy (very high FV) items. Items with low or negative DI values lower the reliability of the assessment or threaten the validity of results.
- Correlation coefficient - a measure of reliability, i.e. the likelihood of obtaining similar results if the same test were re-administered to another group of similar students. The value ranges from 0 to 1, and the higher the value, the better the test reliability. Many very difficult or poorly written items can skew this value. The higher the variability (or spread of scores), the higher the reliability. The reliability coefficient is always presented together with an SEM (standard error of measurement) value.
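
The CTT statistics above are straightforward to compute from a scored response table. The following is a minimal sketch in Python, assuming dichotomously scored (0/1) items, an upper-lower split for the discrimination index, and KR-20 as the reliability coefficient; the function names and the tiny data set are illustrative only and are not taken from any TITAN tool.

```python
import math

# Illustrative 0/1 scored responses: rows = candidates, columns = items.
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
n_candidates = len(scores)
n_items = len(scores[0])
totals = [sum(row) for row in scores]          # each candidate's total score
mean_total = sum(totals) / n_candidates
var_total = sum((t - mean_total) ** 2 for t in totals) / n_candidates

def facility_value(item):
    """FV (p-value): proportion of candidates who answered the item correctly."""
    return sum(row[item] for row in scores) / n_candidates

def discrimination_index(item):
    """Upper-lower DI: the item's FV in the top half of candidates
    (ranked by total score) minus its FV in the bottom half."""
    ranked = sorted(range(n_candidates), key=lambda c: totals[c], reverse=True)
    half = n_candidates // 2
    p_upper = sum(scores[c][item] for c in ranked[:half]) / half
    p_lower = sum(scores[c][item] for c in ranked[-half:]) / half
    return p_upper - p_lower

def kr20_reliability():
    """KR-20: an internal-consistency reliability coefficient for 0/1 items."""
    item_var = sum(facility_value(i) * (1 - facility_value(i)) for i in range(n_items))
    return (n_items / (n_items - 1)) * (1 - item_var / var_total)

reliability = kr20_reliability()
sem = math.sqrt(var_total) * math.sqrt(1 - reliability)  # standard error of measurement

for i in range(n_items):
    print(f"item {i + 1}: FV = {facility_value(i):.2f}, DI = {discrimination_index(i):+.2f}")
print(f"KR-20 = {reliability:.2f}, SEM = {sem:.2f}")
```

On this toy data, item 3 has a low FV (0.20), which would flag it for inspection; whether it is faulty or simply hard remains a professional judgement, in keeping with the point above that these statistics are flags, not decision makers.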
In an item analysis report that uses item response theory (IRT), the statistics include the following (a small Rasch-model sketch, for intuition, follows the list):
- The item location estimate - serves a similar function to the FV
- The fit residual - serves a similar function to the DI
- The chi-square statistic - serves a similar function to the correlation or reliability coefficient
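
As a rough intuition for the item location estimate, the sketch below uses the Rasch model (one common IRT model, assumed here purely for illustration): the probability of a correct response depends on the gap between a candidate's ability and the item's location, and a candidate whose ability equals the item location has a 50% chance of success. The ability and location values are hypothetical.

```python
import math

def rasch_probability(theta, b):
    """Rasch model: probability that a candidate of ability `theta`
    answers an item of location (difficulty) `b` correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Hypothetical abilities and item locations, on the same logit scale.
for theta in (-1.0, 0.0, 1.0):
    easy = rasch_probability(theta, -1.5)   # item located low = easy item
    hard = rasch_probability(theta, 1.5)    # item located high = hard item
    print(f"ability {theta:+.1f}: P(easy item) = {easy:.2f}, P(hard item) = {hard:.2f}")
```

A higher item location lowers the success probability for every candidate, which is why the location estimate plays a role analogous to the FV.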
While these IRT statistics are used in higher-level test item analysis, the item map is the more common tool in classroom test development.
Refer to some light reading provided under US2 Relevant Readings > Item Analysis Statistics for further information.
Refer to some SPFSC TITAN reports, in Examination Papers & TITAN reports, for the types of information provided for each subject. HPE and FN are not offered in SPFSC, so TITAN reports are not available for those subjects; participants with HPE, Wood Tech, and FN majors are encouraged to analyse TITAN reports for any other subject of their choice.
5.1. US3 Activity 3
Refer to the sub-folder containing examination papers (2023 Final SPFSC Examination Papers) and their corresponding item analysis reports (2023 Test Item Analysis Report) for copies of the 2023 examination papers and TITAN reports.
Analyse the Facility Values (FV) for each item in a paper of your choice to note items with high FV and those with low FV. What do these values tell you about how students performed on these items? (A small flagging sketch is given after this activity.)
Identify six (6) items, whether individual items or groups, that showed very low or abnormally high levels of student achievement and, for each item, explain whether these results were due to (i) faulty items, (ii) teaching issues, or (iii) both. If you feel that an item was faulty, suggest another way the item could be developed to make it a better item.
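
If you prefer to scan a TITAN report programmatically, the following is a minimal sketch for flagging items by facility value, using the FV thresholds given earlier. It assumes you have copied the FVs out of a report into a list; the values shown here are made up.

```python
# Hypothetical facility values copied from a TITAN report, in item order.
facility_values = [0.95, 0.72, 0.18, 0.55, 0.88, 0.30]

for i, fv in enumerate(facility_values, start=1):
    if fv > 0.9:
        print(f"item {i}: FV = {fv:.2f} -> very easy; most students answered correctly")
    elif fv < 0.25:
        print(f"item {i}: FV = {fv:.2f} -> very hard; inspect for faulty item or teaching issue")
```

The flags only tell you where to look; deciding whether a flagged item is faulty, or whether the topic was poorly taught, still requires inspecting the item itself.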