Ultrasound characteristics of thyroid nodules guide the indication for fine-needle aspiration cytology, usually by serving as the basis for calculating a Thyroid Imaging Reporting and Data System (TIRADS) score. Few studies on interobserver variation are available; all of them are based on the analysis of preselected still ultrasound images, and they often lack surgical confirmation.
Seven experts from seven thyroid centers performed a blinded online evaluation of video recordings of the ultrasound examinations of 47 consecutive malignant and 76 consecutive benign thyroid lesions and answered 17 TIRADS-related questions. Surgical histology was the reference standard. Interobserver variation for each ultrasound characteristic was quantified using Gwet’s AC1 inter-rater coefficient; higher values indicate better concordance, with a maximum of 1.0.
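As an illustration of how such a coefficient can be computed, the following is a minimal sketch of Gwet’s AC1 for multiple raters and nominal categories, using the commonly cited formulation AC1 = (p_a - p_e) / (1 - p_e). The function name `gwet_ac1`, the input layout, and the toy ratings are assumptions made for this example; they are not the study's software or data.

```python
from collections import Counter


def gwet_ac1(ratings, categories):
    """Gwet's AC1 for nominal ratings (illustrative sketch).

    ratings:    list of subjects; each subject is a list with one category
                label per rater (the number of raters may differ by subject).
    categories: list of all possible category labels (at least two).
    """
    q = len(categories)
    if q < 2:
        raise ValueError("at least two categories are required")
    if any(len(subject) == 0 for subject in ratings):
        raise ValueError("every subject needs at least one rating")

    n = len(ratings)
    agreement_sum = 0.0   # summed pairwise agreement over eligible subjects
    n_eligible = 0        # subjects rated by two or more raters
    share_sum = {k: 0.0 for k in categories}

    for subject in ratings:
        r_i = len(subject)
        counts = Counter(subject)
        # Fraction of raters choosing each category for this subject.
        for k in categories:
            share_sum[k] += counts[k] / r_i
        # Proportion of agreeing rater pairs for this subject.
        if r_i >= 2:
            pairs = sum(c * (c - 1) for c in counts.values())
            agreement_sum += pairs / (r_i * (r_i - 1))
            n_eligible += 1

    if n_eligible == 0:
        raise ValueError("no subject was rated by two or more raters")

    p_a = agreement_sum / n_eligible                      # observed agreement
    pi = [share_sum[k] / n for k in categories]           # category prevalence
    p_e = sum(p * (1.0 - p) for p in pi) / (q - 1)        # chance agreement
    return (p_a - p_e) / (1.0 - p_e)


# Toy data (hypothetical): 4 nodules, 3 raters, 1 = feature present.
toy = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 1]]
ac1 = gwet_ac1(toy, categories=[0, 1])
```

Unlike Cohen's kappa, the chance-agreement term here depends on the average category prevalence across raters, which is generally considered to make AC1 less sensitive to skewed prevalence.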
The Gwet’s AC1 values were 0.34, 0.53, 0.72, and 0.79 for irregular margins, microcalcifications, echogenicity, and extrathyroidal extension, respectively, the four most important features in decision-making. The concordance in discriminating between mildly/moderately and very hypoechogenic nodules was 0.17. Agreement on echogenicity improved as nodule size decreased, whereas agreement on the presence of microcalcifications improved as nodule size increased. Extrathyroidal extension was correctly identified in only 45.8% of cases.
Examination of video recordings, which closely simulates the real-world situation, revealed substantial interobserver variation in the interpretation of each of the four most important ultrasound characteristics. Given the importance of these characteristics for the management of thyroid nodules, unambiguous and widely accepted definitions of each nodule characteristic are warranted, although it remains to be investigated whether such definitions would reduce observer variation.