Our reconciliation strategy allows for a relatively straightforward test of reliability. We began by reconciling those constitutional events coded by two coders. As described previously, this process allows the PI or reconciler to accept all answers on which the coders agree. For discrepancies between coders, the PI then reviews the coders' responses, citations, and comments (as well as the constitution itself) to determine the “correct” answer. One simple measure of intercoder reliability is the probability that the two coders agree. In our data, that probability is about 0.9.
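The agreement probability described above can be read as the share of double-coded items on which both coders gave the same answer. The following is a minimal sketch of that calculation, assuming a hypothetical data layout (a list of answer pairs); the function name `agreement_rate` and the sample answers are illustrative only and are not drawn from the project's data.

```python
from typing import Hashable, Sequence, Tuple


def agreement_rate(pairs: Sequence[Tuple[Hashable, Hashable]]) -> float:
    """Return the share of items on which the two coders gave the same answer."""
    if not pairs:
        raise ValueError("need at least one double-coded item")
    matches = sum(1 for first, second in pairs if first == second)
    return matches / len(pairs)


# Hypothetical answers from two coders to ten questions (illustrative only).
example_pairs = [
    ("yes", "yes"), ("no", "no"), ("yes", "yes"), ("no", "yes"), ("yes", "yes"),
    ("no", "no"), ("yes", "yes"), ("no", "no"), ("yes", "yes"), ("no", "no"),
]

rate = agreement_rate(example_pairs)
print(f"intercoder agreement: {rate:.2f}")
```

In this made-up sample the coders disagree on one item out of ten, so the computed rate corresponds to the kind of figure reported above.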
We were also able to compute an individual reliability score for each coder, defined as the probability that a coder either agreed with her counterpart or, in the case of a discrepancy, agreed with the reconciler. These individual reliability scores ranged from 0.69 to 0.93, with most clustered around 0.90. Again, we found no significant differences in reliability among the Political Science graduate students, the law students, and the undergraduates. This finding, together with the efficiency results, suggests to us that advanced undergraduates may indeed be interchangeable with graduate students in the coder position.
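The individual reliability score described above can be sketched in the same spirit: a coder is credited on an item if the two coders agreed or, where they disagreed, if the coder's answer matches the reconciled answer. The record layout (`CodedItem`), the coder labels, and the sample data below are assumptions made for illustration, not the project's actual data structures.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class CodedItem(NamedTuple):
    """One double-coded question (hypothetical layout)."""
    coder_a: str
    coder_b: str
    answer_a: str
    answer_b: str
    reconciled: str  # final answer chosen by the reconciler


def individual_reliability(items: Iterable[CodedItem]) -> Dict[str, float]:
    """Per-coder share of items where the coder agreed with the counterpart
    or, on a discrepancy, agreed with the reconciler."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for item in items:
        coders_agree = item.answer_a == item.answer_b
        for coder, answer in ((item.coder_a, item.answer_a),
                              (item.coder_b, item.answer_b)):
            totals[coder] += 1
            if coders_agree or answer == item.reconciled:
                hits[coder] += 1
    return {coder: hits[coder] / totals[coder] for coder in totals}


# Illustrative records only; "c1", "c2", "c3" are made-up coder labels.
example_items = [
    CodedItem("c1", "c2", "yes", "yes", "yes"),  # agreement: both credited
    CodedItem("c1", "c2", "no", "yes", "no"),    # reconciler sides with c1
    CodedItem("c1", "c3", "yes", "no", "no"),    # reconciler sides with c3
    CodedItem("c2", "c3", "no", "no", "no"),     # agreement: both credited
]

scores = individual_reliability(example_items)
for coder, score in sorted(scores.items()):
    print(f"{coder}: {score:.2f}")
```

Scores computed this way can then be compared across groups of coders, as in the comparison of graduate, law, and undergraduate students above.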
Finally, the intercoder reliability score allows us to calculate the probability of joint errors among two or more coders. This quantity is of interest because our coding strategy of two coders per event assumes a trivially small number of such errors. With an intercoder reliability of 0.9, the probability of joint errors with two coders is 0.008, or less than 1%, and with three coders 0.007. Given these calculations, our sense is that using two coders appropriately balances our goal of minimizing error with our financial constraints.