Objective: Diagnostic frequency ratios, such as the atypia of undetermined significance (AUS):malignant ratio, have been promoted as useful for benchmarking laboratory precision. We therefore sought to examine their reproducibility and usefulness at a tertiary hospital. Methods: We reviewed thyroid fine-needle aspirates (FNAs) submitted to our institution from outside laboratories and evaluated the ability of diagnostic frequency ratios to capture the complexity of The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC). Specifically, we evaluated how well the AUS:malignant ratio describes the frequencies of the other TBSRTC diagnoses. Results: A total of 2,784 cases from 19 laboratories were included. Use of the AUS category varied the most. The AUS:malignant ratio did not adequately reflect the frequencies of the non-AUS, nonmalignant TBSRTC diagnoses, and these results do not appear to arise from observer variability in the outside laboratories. Conclusion: In our experience, diagnostic frequency ratios are not reproducible and fail to describe the other TBSRTC categories. As such, they are unlikely to prove sufficient for benchmarking laboratory precision with TBSRTC.