That is, could one take a hypothetical normal distribution of the scores that could be obtained on the measure and create bands starting at, say, the mean rather than the top or bottom, using Bobko and Roth's SEM-based width recommendations? This would be somewhat similar to setting letter-grade cutoffs, but using appropriate statistical measures to do so.
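The idea in the question could be sketched as fixed-width bands laid out from the mean, each one standard error of the difference (SED) wide, much like grade cutoffs. This is a hypothetical illustration, not a method from Bobko and Roth (2004); the mean, SD, and reliability values are made up.

```python
import math

def sem_bands_from_mean(mean, sd, reliability, n_bands=4, c=1.0):
    """Lay out n_bands contiguous bands upward from the mean.

    Each band is c * SED wide, where SED = SEM * sqrt(2) and
    SEM = SD * sqrt(1 - reliability). All values here are
    illustrative assumptions, not recommendations.
    """
    sem = sd * math.sqrt(1 - reliability)
    width = c * sem * math.sqrt(2)  # one SED per band when c = 1
    # Band k covers the half-open interval [mean + k*width, mean + (k+1)*width)
    return [(mean + k * width, mean + (k + 1) * width) for k in range(n_bands)]

# Example: mean = 75, SD = 10, reliability = .85 (all hypothetical)
for lo, hi in sem_bands_from_mean(mean=75, sd=10, reliability=0.85):
    print(f"[{lo:.1f}, {hi:.1f})")
```

Bands below the mean could be laid out symmetrically in the same way; the open question is whether such mean-referenced bands would fare any better than top-score-referenced ones under Bobko and Roth's critique that the SEM is not constant across the score range.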
Bobko and Roth (2004) explain why top-score-referenced banding leads to bands that are too wide and why it is not a statistically sound technique. They note that this approach is still being used, especially by city and state agencies. If we were asked to consult with an agency using it, how would we describe alternative methods in a way that would persuade the agency to discontinue this practice?
Has there been, or could there be, a study comparing the Angoff method with top-score-referenced banding (i.e., statistically based banding methods) on their ability to enhance diversity in selection? I suspect we would see the Angoff method allowing for more diversity while still yielding reliable and valid measures of performance. Has this article influenced many users of the top-score-referenced method to consider other approaches, given its statistical flaws and its inability to bring diversity into the workforce? How do we persuade practitioners not to use this statistically incorrect approach?
Bobko and Roth (2004) stated that the highest-scoring applicant on a particular measure would most likely have a lower variance of scores than an applicant in the middle of the distribution. Wouldn't there be some concern about regression toward the mean, given how high the top score is? I find it strange that Bobko and Roth assumed this to be true. Why would they make that assumption?
Another issue with banding may be the sample of applicants in certain jobs. The people to whom these banded predictors would be applied are going to be people who have worked in, or are at least interested in, the subject-matter domain. I would expect these individuals to score in the higher range and to show less variance (in certain, more specialized jobs), making the number of applicants falling within a single band enormous.
My understanding of Bobko and Roth's (2004) basic contention with top-score-referenced banding is that it rests on the assumption that the SEM is the same for every person in the distribution, even though it is calculated from the top score when setting the band's width. The authors show that the SEM for each candidate is likely to increase substantially as the number of correct answers on the test decreases. Thus, SEM-based banding is likely to ignore statistically significant differences among scores within the band, while treating the difference between scores at the bottom of the band and those just outside it as meaningful even though that difference may not be statistically significant. Doesn't this argument make the use of banding statistically invalid?
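The mechanics being questioned above can be made concrete with a minimal sketch of top-score-referenced banding: every score within one band width of the top score is treated as statistically indistinguishable from it. The SD, reliability, critical value, and score list below are illustrative assumptions, not figures from Bobko and Roth (2004).

```python
import math

def band_width(sd, reliability, c=1.96):
    """Band width = c * SED, where SED = SEM * sqrt(2)
    and SEM = SD * sqrt(1 - reliability).

    Note this uses a single SEM for everyone, which is exactly
    the assumption Bobko and Roth (2004) criticize.
    """
    sem = sd * math.sqrt(1 - reliability)
    return c * sem * math.sqrt(2)

def top_score_band(scores, sd, reliability, c=1.96):
    """Return every score within one band width of the top score."""
    top = max(scores)
    cutoff = top - band_width(sd, reliability, c)
    return [s for s in scores if s >= cutoff]

# Hypothetical applicant scores with SD = 10 and reliability = .85
scores = [95, 92, 90, 88, 85, 80, 76, 70]
print(top_score_band(scores, sd=10, reliability=0.85))
# The band reaches from 95 down to roughly 84, so 85 is treated as
# equivalent to 95 while 80 falls just outside the band.
```

If the SEM actually grows as scores fall, as the authors argue, then the single band width computed from the top score understates the uncertainty for lower scores inside the band, which is the statistical flaw at issue.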
If it is determined that banding will be used, do you think different people should perform each step in the banding procedure? That is, if the same HR person administers and scores every step of the banding procedure, they may still be influenced by the applicants' rank order within the band, even though banded applicants are supposed to be treated as equivalent. What are some other ways we could eliminate this potential bias?