Generalized confidence interval for an agreement between raters.


Bhaumik DK(1)(2), Shi H(2), Reda DJ(2), Sinha BK(3).
Author information:
(1)Department of Psychiatry, University of Illinois at Chicago, Chicago, Illinois, USA.
(2)CSPCC, Hines VA Hospital, Hines, Illinois, USA.
(3)Indian Statistical Institute, Kolkata, India.


Estimation and inference are the two key components of the solution to any statistical problem; however, inferential procedures for the statistical assessment of agreement among two or more raters are not as well developed as the corresponding estimation procedures. The fundamental reason for this gap is the complex expression of the concordance correlation coefficient (CCC), which is frequently used to assess agreement among raters. Large-sample statistical tests for the CCC often fail to produce the desired results for small samples; hence, inferential procedures for small samples are urgently needed to evaluate agreement between raters. We argue that hypothesis testing of the CCC has little practical value in the absence of a gold standard of agreement. In this article, we construct a generalized confidence interval (GCI) for the CCC under a bivariate normal distribution of measurements, and we also develop a large-sample confidence interval (LSCI). We establish the satisfactory performance of the GCI by demonstrating the desired coverage probability (CP) via simulation. Results for the GCI and LSCI are illustrated and compared using a data set from a recent study performed at the U.S. Department of Veterans Affairs, Hines.
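As an illustration of the point estimate the abstract discusses, the snippet below is a minimal sketch (assuming NumPy) of Lin's sample CCC for paired ratings, together with a percentile-bootstrap interval. The bootstrap interval is only a generic stand-in for interval estimation here; it is neither the GCI nor the LSCI developed in the article, whose constructions are not given in the abstract.

```python
import numpy as np

def ccc(x, y):
    """Lin's concordance correlation coefficient for paired ratings.

    Uses the biased (divide-by-n) sample moments, as in Lin (1989):
    CCC = 2*s_xy / (s_x^2 + s_y^2 + (xbar - ybar)^2).
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    sxy = np.mean((x - mx) * (y - my))      # 1/n covariance
    sx2, sy2 = x.var(), y.var()             # 1/n variances
    return 2.0 * sxy / (sx2 + sy2 + (mx - my) ** 2)

def ccc_bootstrap_ci(x, y, level=0.95, n_boot=2000, seed=0):
    """Percentile-bootstrap CI for the CCC (illustration only;
    NOT the generalized confidence interval of the article)."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)         # resample rater pairs jointly
        stats[b] = ccc(x[idx], y[idx])
    alpha = 1.0 - level
    return np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])
```

For example, `ccc(x, x)` returns 1 for any nonconstant `x` (perfect agreement), while a constant shift between raters pulls the CCC below the Pearson correlation, which is exactly the scale/location penalty that makes the CCC's sampling distribution awkward for small-sample inference.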