Assessing inter-rater agreement for compositional data

Ningning Chen, Purdue University


Compositional data are non-negative vectors whose elements sum to one (e.g., [0.1, 0.5, 0.4]). This type of data occurs in many research areas where the relative magnitudes of the vector’s elements are of primary interest. In this dissertation we propose novel methodology for assessing inter-rater agreement based on compositional data. This is needed because existing agreement measures either convert the vector to a univariate value, thereby losing information, or fail to account for the sum-to-one restriction. We propose a novel Bayesian approach, enabled by Markov chain Monte Carlo, to investigate differences in the pattern of compositional vector scores. We extend our model to handle discrete compositional scores, comparisons involving more than two raters, and studies that involve replicate scores on the same subjects. Numerous simulation studies demonstrate the validity of our model and the advantages of our approach. Both simulated data and a real scoring data set are analyzed to illustrate our method and compare it to traditional agreement indices. The application of this new methodology is focused on pathology, where pathologists rate immunohistochemistry (IHC) assays using compositional scores. To enhance the use of this methodology and help with the design of future agreement studies, we developed an R Shiny package designed for IHC agreement analysis.
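The information-loss point can be illustrated with a minimal Python sketch. The weights and the second rater's vector are hypothetical, chosen only for illustration, and the weighted sum stands in for any univariate summary of a compositional score; it is not the method proposed in the dissertation.

```python
import math

def is_composition(v, tol=1e-9):
    """Return True if v is non-negative and its elements sum to one (within tol)."""
    return all(x >= 0 for x in v) and math.isclose(sum(v), 1.0, abs_tol=tol)

def weighted_score(v, weights):
    """Collapse a compositional vector to a single number via a weighted sum."""
    return sum(w * x for w, x in zip(weights, v))

# Two raters score the same subject with different compositions.
rater_a = [0.1, 0.5, 0.4]   # example vector from the abstract
rater_b = [0.2, 0.3, 0.5]   # hypothetical second rater

# Hypothetical category weights for the univariate summary.
weights = [0, 1, 2]

score_a = weighted_score(rater_a, weights)
score_b = weighted_score(rater_b, weights)

# The univariate scores coincide even though the compositions differ.
same_score = math.isclose(score_a, score_b)
same_vector = all(math.isclose(a, b) for a, b in zip(rater_a, rater_b))
print(is_composition(rater_a), is_composition(rater_b), same_score, same_vector)
```

Under these assumed weights the two raters receive the same univariate score while disagreeing on every component, which is the kind of difference a vector-level agreement measure is meant to detect.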




Craig, Purdue University.


