Questions and Answers of Measurement Theory In Action
Imagine the case in which 14 SMEs were asked to provide CVR ratings for a five-item test. Compute the CVR for each of the items based on the ratings shown in Table 7.1. [Table 7.1: ratings for Items 1 to 5]
Is quantifying content validity through the use of the CVI, CVF, or other similar method necessary to establishing content validity? Explain.
Given that 14 SMEs were used to provide the ratings in question 10, which items do you feel have received a CVR so low that you would recommend deleting the item? Justify your response.
How does the criterion-related approach to test validation help provide evidence of the accuracy of the conclusions and inferences drawn from test scores?
What are the differences among predictive, concurrent, and postdictive criterion-related validation designs?
What concerns might you have in using a concurrent or postdictive criterion-related validation design?
The various criterion-related validity research designs might not be equally appropriate for a given situation. For each of the following criterion-related validity designs, provide an example
What factors would you consider to ensure that you have an appropriate criterion?
What factors might attenuate an observed correlation between test scores and criterion scores? Explain.
What might inflate an observed correlation between test scores and criterion scores? Explain.
For each of the following, explain how the correction formula provides a more accurate estimate of the true relationship between the predictor and the criterion: a. Correction for unreliability in
Although it is empirically possible to correct for attenuation due to unreliability in a predictor, this is a violation of ethics if we intend to use the predictor for applied purposes. Explain why
If conducting a correction for restriction in range of the predictor variable in a concurrent criterion-related validity study, who is the population referring to? How might you best estimate the
How could a small organization determine which selection tests might be appropriate for use in selection of new employees?
The unified view of test validation regards all aspects of validation as reaching for the same goal. What is the overall goal of test validation?
Explain why a thorough understanding of the construct measured is essential to the validation process.
What did Cronbach and Meehl (1955) mean by the term “nomological network”?
Can reliability estimates be used to provide evidence of the construct validity of test scores? Explain.
Explain how a researcher could conduct a “study of process” to provide evidence of the construct validity of test scores.
(a) Identify two established measures that could be used (other than those discussed previously) to examine the convergent validity of the Affective Empathy scale discussed above. (b) Identify two
Why is common method variance (CMV) a concern in construct validation studies that involve correlation matrices?
Correlations between what elements of an MTMM matrix would provide the best assessment of CMV?
How does use of an MTMM matrix provide evidence of the construct validity of test scores?
Messick (1995a) identified six aspects of construct validation. Choose any three of these aspects to discuss how Messick’s conceptualization has extended your awareness of the meaning of construct
Most papers and books on meta-analysis say one should include both published and unpublished studies on a given topic. How does one go about getting unpublished studies?
Based on the data in Figure 11.1, what would have happened if we had used a common regression line to predict suicide risk in all three age groups? [Figure 11.1. Labels: Suicide Risk; Old, Young, Middle Age; Test]
Can a single person conduct a meta-analysis or does it take a team of researchers? Why?
Assuming we did use the same regression line for all three groups, which group would be most likely to raise claims of test bias? Unfairness?
There are several options with regard to which analytical approach to use. How do you decide which one to use?
How do you decide which moderators to examine?
How does one go about narrowing down the seemingly endless list of potential “omitted variables” in moderated regression analysis used to determine test bias?
Why do you think that intercept bias is much more common than slope bias?
What other factors (besides a truly biased test or an omitted variable) might be falsely suggesting test bias when, in fact, the test is not biased?
Which stakeholders in the testing process are responsible for determining whether test bias actually exists or not?
Can a test that is determined to be biased still be a fair test? Alternatively, can a test that is determined to be unfair still be an unbiased test?
Describe the process of back translation.
Why is back translation insufficient to guarantee equivalence?
Provide an example of each of the four types of test equivalence identified by Lonner (1990).
If you had recently translated a test into a different cultural context, how would you assess each of the four types of equivalence?
What factors should be considered when determining whether a requested test accommodation is reasonable?
Why is test-wiseness a problem in tests of maximal performance?
What do you think of intentionally incorporating test-wise characteristics into item distracters? Defend your position.
What are the advantages and disadvantages of selected-response items?
What are the advantages and disadvantages of free-response items?
Why shouldn’t “all of the above” be included among multiple-choice response options?
Why shouldn’t test takers be given a choice among several different essay items?
Why are multiple short-answer items preferable to one long essay question?
Why is pretesting of items important in test construction?
In what ways does Anderson and Krathwohl’s (2001) revision differ from Bloom’s original taxonomy?
Who would be appropriate to fulfill the role of SME for a test designed to assess knowledge of: a. 12th-grade mathematics? b. Modern automotive repair? c. American pop culture?
What is the difference between an item difficulty index and an item discrimination index?
How do you know whether to calculate the discrimination index (which contrasts extreme groups), the biserial correlation, or the point biserial correlation coefficient as your item discrimination
How do you decide which external criterion to use when computing an item-criterion index?
What corrections, if any, might you make to items 1, 2, 4, 5, and 8 in Table 13.2? [Table 13.2: Item Statistics]
Is there ever a time when a .25 p value is good? How about a 1.00 p value?
Will your criteria for evaluating your item difficulty and discrimination indexes change if a test is norm referenced versus criterion referenced?
Will your criteria for evaluating your item difficulty and discrimination indexes change as the format of the item changes (e.g., true-false; three-, four-, or five-option multiple choice; Likert
Oftentimes in a classroom environment, you might have more students (subjects) than you have items. Does this pose a problem for interpreting your item analysis statistics?
How do we best define the “minimally competent person” when using judgmental methods such as the Angoff, Nedelsky, Ebel, and Bookmark methods?
When does a method for setting pass points go from being judgmental/empirical to empirical/judgmental? Does it really matter?
What legal issues do we need to be concerned with when setting cutoff scores?
Does where we set the cutoff score affect the validity of the test? The utility?
How do we know whether we should minimize false-positive or false-negative decisions? Will that decision impact the procedure we use to make the cutoff score decision?
Do we really even need to set cutoff scores? Why not just rank order all the test scores from highest to lowest and provide the valued outcome until it runs out?
What if we set a cutoff score and no one passes?
This module begins by discussing serious concerns with self-report measures. Do such concerns indicate we should abandon this type of inquiry? Explain.
Given the concerns in #1 above, do you think we should clearly provide respondents an option to respond “don’t know”? Explain.
Why is defining the intended construct so essential to the development of a measure of typical performance?
In assessing someone’s opinion, when might you prefer to use a selected-response item format? When might you prefer to use a constructed-response item format?
Why is it sometimes appropriate to use emotionally loaded items when assessing self-report of a person’s behavior?
What is acquiescence? What can a test developer do to reduce our concern with acquiescence?
Why shouldn’t we ask respondents what they plan to do in the future? What should we do instead?
In reviewing the item writing tips in this module, is there any particular tip that you feel is especially important? Why? Are there any item writing tips that you would take issue with? Explain.
What is the major difference between rational and empirical methods of test development? Is rational test development unempirical? Is empirical test development irrational?
Is correcting for guessing appropriate in college-level courses where most individuals will not be guessing randomly, but rather will almost always be able to eliminate one or more distracter options?
In situations where individuals are unlikely to omit any of the questions on purpose, is it appropriate to correct for guessing?
What other personality characteristics, besides risk taking, do you think would be associated with guessing on multiple-choice tests?
What other factors, besides guessing, might contribute to extremely low or high levels of variability in knowledge test scores?
It was noted that if a test taker can eliminate at least one of the distracters, then corrections for guessing underestimate the extent of guessing. Is it possible to overestimate the extent of
Given that you cannot guess on short-answer essay questions, would they, by default, be more reliable?
What is the difference between response biases and response styles?
What are the best ways to reduce response biases? Response styles?
What factors influence the relative weighting of each predictor in an unstandardized multiple regression equation?
How does a standardized regression equation differ from an unstandardized regression equation?
If we were currently using four predictors to explain 40% of the variance in our criterion, would the addition of four more predictors with equal combined validity allow us to explain 80% of the
Why do we refer to the prediction of “reliable variance” in the criterion rather than just “variance”?
If in the previous question you added a fifth predictor to the original regression equation, what characteristics would you want from this predictor?
What information is provided by the standard error of estimate?
If we hoped to examine the predictive ability of four independent variables, what would you recommend as the minimum sample size?
Why is it necessary to compute the cross-validated correlation coefficient?
Why would you want to understand the dimensionality of a set of items?
Under what conditions might you choose to use PCA? EFA?
Under what conditions might you choose to use an orthogonal rotation of factors in an EFA? An oblique rotation?
What would you do if the expected dimensionality of your scale was very different from the results suggested by your factor analysis?
In conducting an EFA, describe the procedure you would follow to determine whether items found to load on a factor actually form a meaningful, interpretable subdimension.
In conducting an EFA, what would you do if a factor in the rotated factor (or pattern) matrix was composed of items that seem to have nothing in common from a rational or theoretical standpoint?
List the different types of decisions that you need to make when conducting an EFA.
Upon their introduction to factor analysis, many students are likely to agree with Pedhazur and Schmelkin’s (1991) assertion that factor analysis is like “a forest in which one can get lost in no
List four differences between CFA and EFA. What are similarities?
In scale construction, when would CFA be preferable to EFA? When would EFA be preferable to CFA?
In CFA, how would you determine if the data were consistent with your hypothesized model?