Evaluating observer agreement of scoring systems for foot integrity and footrot lesions in sheep


Bibliographic Details
Main Authors: Foddai, Alessandro; Green, Laura E.; Mason, Sam A.; Kaler, Jasmeet
Format: Article
Language: English
Published: BMC, 2012-05-01
Series: BMC Veterinary Research
Online Access: http://www.biomedcentral.com/1746-6148/8/65
Description
Summary:
Background: A scoring scale with five ordinal categories is used for the visual diagnosis of footrot in sheep and to study its epidemiology and control. More recently, a 4-point ordinal scale has been used by researchers to score foot integrity (wall and sole horn damage) in sheep. There is no information on observer agreement for either of these scales. Observer agreement for ordinal scores is usually estimated by single-measure values such as weighted kappa or Kendall's coefficient of concordance, which provide no information on where the disagreement lies. Modeling techniques such as latent class models provide information both on observer bias and on whether observers have different thresholds at which they change the score given. In this paper we use weighted kappa and located latent class modeling to explore observer agreement when scoring footrot lesions (using photographs and videos) and foot integrity (using post-mortem specimens) in sheep. Three observers scored 80 photographs and videos (footrot) and 80 feet (foot integrity).
Results: Both the footrot and foot integrity scoring scales were more consistent within observers than between them. The weighted kappa values between observers for both scales ranged from moderate to substantial. There was disagreement between observers, with both observer bias and different thresholds between score values. The between-observer thresholds differed for scores 1 and 2 for footrot (photographs and videos) and for all scores for integrity (both walls and soles). Within-observer agreement was higher, with weighted kappa values ranging from substantial to almost perfect. Within-observer thresholds were also more consistent than between-observer thresholds. Scoring from photographs was less variable than scoring from video clips or feet.
Conclusions: Latent class modeling is a useful method for exploring the components of disagreement within and between observers, and this information could be used when developing a scoring system to improve reliability.
ISSN: 1746-6148
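
The summary above reports pairwise weighted kappa between observers scoring an ordinal scale. As a minimal sketch of how such values can be computed, the Python example below uses scikit-learn's cohen_kappa_score with linear weights on synthetic scores from three hypothetical observers over 80 items on a 0-4 footrot scale. It is illustrative only (assumed data and observer names), and does not reproduce the study's data or its located latent class models.

    # Illustrative sketch: pairwise linearly weighted kappa between observers
    # scoring a hypothetical 0-4 ordinal footrot scale on synthetic data.
    import numpy as np
    from itertools import combinations
    from sklearn.metrics import cohen_kappa_score

    rng = np.random.default_rng(0)
    n_items = 80                                   # hypothetical number of scored photographs
    true_score = rng.integers(0, 5, size=n_items)  # latent "true" score, 0-4

    # Simulate three observers whose scores wobble by at most one category
    # around the latent score (purely synthetic observers).
    observers = {
        f"obs{i + 1}": np.clip(true_score + rng.integers(-1, 2, size=n_items), 0, 4)
        for i in range(3)
    }

    # Weighted kappa penalises disagreements by their distance on the ordinal scale,
    # so a 0-vs-4 disagreement counts more heavily than a 0-vs-1 disagreement.
    for (name_a, scores_a), (name_b, scores_b) in combinations(observers.items(), 2):
        kappa = cohen_kappa_score(scores_a, scores_b, weights="linear")
        print(f"{name_a} vs {name_b}: weighted kappa = {kappa:.2f}")

A single summary statistic like this still hides where observers disagree, which is why the paper supplements it with located latent class modeling to separate observer bias from differing score thresholds.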