American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Angoff, W. H. (1971). Norms, scales, and equivalent scores. In R. L. Thorndike (Ed.), Educational Measurement (2nd ed.). Washington, DC: American Council on Education.

Angoff, W. H. (1984). Scales, Norms, and Equivalent Scores. Princeton, NJ: Educational Testing Service.

Brennan, R. L. (2001). Generalizability theory. New York: Springer-Verlag.

Breyer, F., & Lewis, C. (1994). Pass-fail reliability for tests with cut scores: A simplified method. Princeton, NJ: Educational Testing Service.

Keats, J. A. (1957). Estimation of error variances of test scores. Psychometrika, 22.

Kolen, M. J., & Brennan, R. L. (2004). Test Equating, Scaling, and Linking: Methods and Practices (2nd ed.). New York, NY: Springer Science and Business Media.

Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151–160.

Longford, N. T., Holland, P. W., & Thayer, D. T. (1993). Stability of the M-H D-DIF statistics across populations. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 171–196). Hillsdale, NJ: Erlbaum.

Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361–370.
