The value of repeating fine-needle aspiration for thyroid nodules
Abstract
BACKGROUND:
Fine-needle aspiration (FNA) is an invaluable technique used in the evaluation of thyroid nodules.
OBJECTIVES:
Evaluate the concordance of results for consecutive FNA readings.
DESIGN:
Retrospective, descriptive.
SETTINGS:
Two tertiary care centers.
METHODS:
Demographics were collected along with every FNA result and the final pathology results for all patients (aged 9-90 years) who underwent thyroid surgery from 2010 to 2017. The Bethesda system was used for cytology. Agreement levels were calculated and compared with final pathology.
SAMPLE SIZE:
Of 1237 initially included, 1134 had at least one FNA performed with results available for review.
RESULTS:
For the 1134 patients, demographic and clinical data were collected, and the three FNA results were compared. The highest agreement was between FNA 2 and 3 (53.6%); however, the kappa value was consistently low for all comparisons, indicating a poor level of agreement overall. The risk of malignancy in FNA cytology categories I and II was also higher in this study than that given in the 2017 Bethesda System for Reporting Thyroid Cytopathology.
CONCLUSION:
Repeating FNA biopsies yields different results every time; hence, there is a low level of agreement. The clinical decision should therefore include other important risk factors. Prospective studies could help shed more light on this topic.
LIMITATIONS:
Retrospective design.
CONFLICT OF INTEREST:
None.
INTRODUCTION
Fine-needle aspiration (FNA) of the thyroid gland is an invaluable and reliable diagnostic technique used to examine patients with thyroid nodules requiring either surgical excision or conservative management. The decision regarding management depends largely on the thyroid FNA result.1 To standardize the morphological criteria and diagnostic terminology for reporting thyroid fine-needle aspiration cytology (FNAC) results, the National Cancer Institute (NCI) organized the NCI Thyroid Fine-Needle Aspiration State of the Science Conference in 2007, which proposed a six-tier system called The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC). The aim of the TBSRTC is to standardize the communication of FNAC results between clinicians and pathologists.2 This system also describes the risk of malignancy for each category and the recommended clinical management. It was updated in 2017 based on more recent data and to take into account a new category, noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP). The six categories and their risks of malignancy when NIFTP is considered a thyroid carcinoma are as follows: (I) non-diagnostic or unsatisfactory, 5-10%; (II) benign, 0-3%; (III) atypia of undetermined significance/follicular lesions of undetermined significance (AUS/FLUS), 10-30%; (IV) follicular neoplasm/suspicious for follicular neoplasm (FN/SFN), 25-40%; (V) suspicious for malignancy (SM), 50-75%; and (VI) malignant, 97-99%.3
Reporting of thyroid FNAC results and diagnostic terminology varies among pathologists and even institutions, creating confusion in sharing data between pathologists and clinicians.4 FNAC reports based on the TBSRTC system are associated with sensitivity, specificity, and diagnostic accuracy of 97%, 50.7%, and 68.8%, respectively.5 Many studies have validated the efficacy of TBSRTC. However, multiple discrepancies have been reported. Our study aimed to evaluate the concordance of results for consecutive FNA biopsies and to determine the relevance of repetition on final pathology results.
METHODS
This retrospective, descriptive, observational study was approved by the institutional review board at King Abdulaziz University Hospital and King Fahad Medical City (IRB No. 19-139) and included all patients (aged 9-90 years) who underwent thyroid surgery from 2010 to 2017. Patient demographics were collected from medical records and included age, sex, weight, height, thyroid function, size of the thyroid nodule, procedure performed, all FNA results, and the final pathology results. Cytology and pathology results were reported by different pathologists, but only the Bethesda system was used for cytology in both centers. Patients with missing records and those who did not have an FNA cytology result were excluded. For each patient, each FNA reading was compared to the final histopathology result (which was defined as the gold standard) to determine which reading was closest to the actual diagnosis, and to clarify whether the accuracy increased with an increase in the number of thyroid FNAs performed.
Statistical analysis was performed using IBM SPSS Statistics software, version 22 (IBM Corp., Armonk, NY, USA). Means and standard deviations (SDs) were obtained for continuous variables, whereas frequencies and percentages were computed for categorical variables. Weighted Cohen's kappa coefficient was used to assess the level of agreement between the repeated FNA results. The level of significance was set at .05.
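As an illustration of the agreement statistic named above, the following is a minimal sketch of a weighted Cohen's kappa for two paired sets of ordinal readings. It is not the SPSS procedure used in the study. The function name `weighted_kappa`, the choice of linear weights as the default, and the example readings are assumptions made for illustration; the text does not state which weighting scheme was applied.

```python
from collections import Counter


def weighted_kappa(first, second, n_categories=6, scheme="linear"):
    """Weighted Cohen's kappa for paired ordinal ratings coded 1..n_categories."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    n = len(first)
    observed = Counter(zip(first, second))  # joint counts per (first, second) pair
    row = Counter(first)                    # marginal counts of the first reading
    col = Counter(second)                   # marginal counts of the second reading

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the most distant pair.
        d = abs(i - j) / (n_categories - 1)
        return d if scheme == "linear" else d * d  # otherwise quadratic

    cats = range(1, n_categories + 1)
    obs = sum(weight(i, j) * observed[(i, j)] for i in cats for j in cats)
    exp = sum(weight(i, j) * row[i] * col[j] / n for i in cats for j in cats)
    if exp == 0:
        raise ValueError("kappa is undefined when every rating is in one category")
    return 1.0 - obs / exp


# Hypothetical Bethesda categories (1 = I ... 6 = VI) for eight nodules.
first_fna = [2, 2, 3, 1, 2, 6, 3, 2]
second_fna = [2, 3, 2, 2, 2, 6, 3, 1]
print(f"weighted kappa = {weighted_kappa(first_fna, second_fna):.2f}")
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance. Readers using scikit-learn can obtain the same statistic from `sklearn.metrics.cohen_kappa_score` with `weights="linear"` or `weights="quadratic"`.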
RESULTS
Among the 1237 patients who initially met the inclusion criteria for this study, 1134 patients had at least one FNA performed with results available for review. The mean (SD) age was 41.4 (13.4) years, and 917 patients (80.9%) were female. Of the 1134 patients who had an FNA biopsy result, 217 (19.1%) underwent a second FNA, and 44 (3.9%) underwent a third FNA. Table 1 shows the FNA results for each set (i.e., first, second, and third) and their corresponding final pathology result. Sensitivity and specificity were calculated for each set. TBSRTC category VI was considered a positive test, and a final pathology report of malignant (excluding incidental microcarcinomas) was considered a positive result. The first FNA result had a sensitivity of 59% and a specificity of 98%. The second FNA result had a sensitivity of 52% and a specificity of 100%. The third FNA result had a sensitivity of 50% and a specificity of 100%.
| FNA set | FNA category* | Final pathology: benign N (%) | Final pathology: malignant N (%) | Final pathology: microcarcinoma N (%) | Total N (%) |
|---|---|---|---|---|---|
| 1st FNA | I | 35 (59.3) | 12 (20.3) | 12 (20.3) | 59 (5.2) |
| | II | 376 (70.8) | 78 (14.7) | 77 (14.5) | 531 (46.8) |
| | III | 142 (55.0) | 68 (26.4) | 48 (18.6) | 258 (22.8) |
| | IV | 37 (44.0) | 38 (45.2) | 9 (10.7) | 84 (7.4) |
| | V | 6 (9.7) | 40 (64.5) | 16 (25.8) | 62 (5.5) |
| | VI | 4 (2.9) | 116 (82.9) | 20 (14.3) | 140 (12.3) |
| | Total | 600 (52.9) | 352 (31) | 182 (16) | 1134 (100) |
| 2nd FNA | I | 13 (76.5) | 2 (11.8) | 2 (11.8) | 17 (7.8) |
| | II | 71 (68.3) | 17 (16.3) | 16 (15.4) | 104 (47.9) |
| | III | 30 (55.6) | 10 (18.5) | 14 (25.9) | 54 (24.9) |
| | IV | 6 (50.0) | 3 (25.0) | 3 (25.0) | 12 (5.5) |
| | V | 4 (40.0) | 3 (30.0) | 3 (30.0) | 10 (4.6) |
| | VI | 0 (0.0) | 19 (95.0) | 1 (5.0) | 20 (9.2) |
| | Total | 124 (57.1) | 54 (24.9) | 39 (18) | 217 (100) |
| 3rd FNA | I | 2 (100.0) | 0 (0.0) | 0 (0.0) | 2 (4.5) |
| | II | 16 (72.7) | 2 (9.1) | 4 (18.2) | 22 (50) |
| | III | 6 (42.9) | 4 (28.6) | 4 (28.6) | 14 (31.8) |
| | IV | 0 (0.0) | 1 (50.0) | 1 (50.0) | 2 (4.5) |
| | V | 0 (0.0) | 1 (100.0) | 0 (0.0) | 1 (2.2) |
| | VI | 0 (0.0) | 2 (66.7) | 1 (33.3) | 3 (6.8) |
| | Total | 24 (54.5) | 10 (22.7) | 10 (22.7) | 44 (100) |
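The text defines a positive test (category VI) and a positive result (malignant final pathology, excluding incidental microcarcinomas) but does not spell out what counted as a negative test. The sketch below is one possible reading, not a confirmed account of the authors' calculation: it assumes category II (benign) is the negative test and leaves categories I and III to V, and all microcarcinomas, out of the 2x2 table. Under that assumption, the counts in Table 1 give values close to the reported sensitivity and specificity. The names `TABLE_1` and `sensitivity_specificity` are illustrative.

```python
import math

# Counts copied from Table 1 as (benign, malignant, microcarcinoma).
# ASSUMPTION: category II is the negative test; other categories are excluded.
TABLE_1 = {
    "1st FNA": {"II": (376, 78, 77), "VI": (4, 116, 20)},
    "2nd FNA": {"II": (71, 17, 16), "VI": (0, 19, 1)},
    "3rd FNA": {"II": (16, 2, 4), "VI": (0, 2, 1)},
}


def sensitivity_specificity(rows):
    """Return (sensitivity, specificity) with VI as test-positive, II as test-negative."""
    benign_neg, malignant_neg, _ = rows["II"]   # true negatives, false negatives
    benign_pos, malignant_pos, _ = rows["VI"]   # false positives, true positives
    sensitivity = malignant_pos / (malignant_pos + malignant_neg)
    specificity = benign_neg / (benign_neg + benign_pos)
    return sensitivity, specificity


for fna_set, rows in TABLE_1.items():
    sens, spec = sensitivity_specificity(rows)
    print(f"{fna_set}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

For the first FNA, for example, this gives 116/(116 + 78) for sensitivity and 376/(376 + 4) for specificity.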
The three FNA results were compared while considering the total agreement among repeated tests. Results showed a total agreement of 47.4% between the first and second FNA results, with a weighted kappa of 0.30 (95% CI, 0.20–0.40; P<.001). Comparison between the second and third FNA results showed a total agreement of 53.6%, with a weighted kappa of 0.22 (95% CI, 0.06–0.37; P<.001), and a comparison between the first and third FNA results showed a total agreement of 45.5%, with a weighted kappa of 0.17 (95% CI, 0–0.36; P=.002). Figure 1 shows the details of the agreement between the first and second FNA results for every category.

In this population, the kappa value was consistently low for all comparisons, indicating a poor level of agreement overall. Notably, most patients with TBSRTC categories of IV to VI did not undergo a second FNA unless otherwise desired by the patient; for this reason, only 19 such cases had a repeat FNA, and in more than 50% of cases, their final pathology report showed a malignant neoplasm. Also, patients with a Bethesda II FNA result only had a repeat FNA when a high clinical suspicion existed or ultrasound changes were reported during follow-up. Table 2 shows the final pathology results when an FNA was repeated for TBSRTC categories I to III.
| 1st FNA | 2nd FNA | Final pathology: benign N (%) | Final pathology: malignant N (%) | Final pathology: microPTC N (%) | Total N (%) |
|---|---|---|---|---|---|
| I | I | 7 (100.0) | 0 (0.0) | 0 (0.0) | 7 (100.0) |
| | II | 9 (64.3) | 2 (14.3) | 3 (21.4) | 14 (100.0) |
| | III | 4 (36.4) | 1 (9.1) | 6 (54.5) | 11 (100.0) |
| | IV | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 |
| | V | 0 (0.0) | 0 (0.0) | 1 (100.0) | 1 (100.0) |
| | VI | 0 (0.0) | 2 (100.0) | 0 (0.0) | 2 (100.0) |
| | Total | 20 (57.1) | 5 (14.3) | 10 (28.6) | 35 (100.0) |
| II | I | 4 (50.0) | 2 (25.0) | 2 (25.0) | 8 (100.0) |
| | II | 58 (71.6) | 12 (14.8) | 11 (13.6) | 81 (100.0) |
| | III | 22 (66.7) | 5 (15.2) | 6 (18.2) | 33 (100.0) |
| | IV | 4 (66.7) | 1 (16.7) | 1 (16.7) | 6 (100.0) |
| | V | 2 (66.7) | 1 (33.3) | 0 (0.0) | 3 (100.0) |
| | VI | 0 (0.0) | 6 (100.0) | 0 (0.0) | 6 (100.0) |
| | Total | 90 (65.7) | 27 (19.7) | 20 (14.6) | 137 (100.0) |
| III | I | 1 (100.0) | 0 (0.0) | 0 (0.0) | 1 (100.0) |
| | II | 4 (50.0) | 2 (25.0) | 2 (25.0) | 8 (100.0) |
| | III | 4 (57.1) | 1 (14.3) | 2 (28.6) | 7 (100.0) |
| | IV | 1 (20.0) | 2 (40.0) | 2 (40.0) | 5 (100.0) |
| | V | 1 (33.3) | 1 (33.3) | 1 (33.3) | 3 (100.0) |
| | VI | 0 (0.0) | 3 (75.0) | 1 (25.0) | 4 (100.0) |
| | Total | 11 (39.3) | 9 (32.1) | 8 (28.6) | 28 (100.0) |
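The per-category agreement and reclassification figures quoted in the Discussion (20% repeat non-diagnostic results, 59.1% agreement after an initial benign result, 75% reclassification after an initial AUS/FLUS result) follow directly from the Total column of Table 2. The short sketch below reproduces that arithmetic; the names `TABLE_2_TOTALS` and `same_category_rate` are illustrative.

```python
# Patients in each (1st FNA, 2nd FNA) category pair, from the Total column of Table 2.
TABLE_2_TOTALS = {
    "I":   {"I": 7, "II": 14, "III": 11, "IV": 0, "V": 1, "VI": 2},
    "II":  {"I": 8, "II": 81, "III": 33, "IV": 6, "V": 3, "VI": 6},
    "III": {"I": 1, "II": 8, "III": 7, "IV": 5, "V": 3, "VI": 4},
}


def same_category_rate(first_category):
    """Fraction of repeat FNAs that returned the same category as the first FNA."""
    second = TABLE_2_TOTALS[first_category]
    return second[first_category] / sum(second.values())


for category in TABLE_2_TOTALS:
    same = same_category_rate(category)
    print(f"1st FNA {category}: unchanged {same:.1%}, reclassified {1 - same:.1%}")
```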
The size of the malignant nodules as noted in the final pathology report was also evaluated, as well as its effect on FNA interpretation. Nodules >1.5 cm in size with FNA results of intermediate and high-grade TBSRTC categories were more likely associated with a malignant final pathology than nodules <1.5 cm in size (>47% vs. <23%, respectively) (Table 3). In fact, tumors <1.5 cm in size for any TBSRTC category were commonly found to be incidental microcarcinomas. Since most clinicians treat microcarcinomas as low-risk entities because of their relatively slow growth rates, these nodules were classified as neither benign nor malignant in this study and are reported as separate entities.
| FNA category | Malignant, size <1.5 N (%) | Malignant, size >1.51 N (%) | microPTC, size <1.5 N (%) | microPTC, size >1.51 N (%) | Total N (%) |
|---|---|---|---|---|---|
| I | 4 (16.7) | 8 (33.3) | 12 (50) | 0 (0) | 24 (100) |
| II | 24 (16.3) | 48 (32.7) | 75 (51) | 0 (0) | 147 (100) |
| III | 14 (12.2) | 54 (46.9) | 47 (40.9) | 0 (0) | 115 (100) |
| IV | 9 (19.5) | 28 (61.0) | 9 (19.5) | 0 (0) | 46 (100) |
| V | 7 (12.7) | 32 (58.2) | 16 (29.1) | 0 (0) | 55 (100) |
| VI | 32 (23.5) | 84 (61.8) | 20 (14.7) | 0 (0) | 136 (100) |
DISCUSSION
In this study, the sensitivity of the first FNA (59%) was much lower than that reported in previous studies, whereas the specificity (98%) was much higher. When FNA was performed up to three times, sensitivity dropped to 50%, but specificity remained high. While some authors have reported that repeating thyroid FNA decreases false-negative rates,6 others found no improvement in the detection of malignancies.7
Non-diagnostic cytology
Repeating FNA for nodules with an initial non-diagnostic result could change the initial diagnosis for a significant proportion of patients, thereby modifying their therapeutic approach.8 Repeating the FNA has been found to provide a diagnosis in up to 60% of cases.9 In this study, 80% of the initially non-diagnostic results had a different second FNA result, while 20% had a second non-diagnostic result. This finding agrees with Graciano et al's findings that repeat FNA altered the initial diagnosis for 70.5% of non-diagnostic results. They also found non-diagnostic FNA to be the most common indication for a repeated examination.10 However, repeating FNA did not seem to decrease the rate at which a non-diagnostic cytology result was obtained here, which remained between 4.5% and 7.8% every time an FNA was performed.
In this study, all patients who had non-diagnostic cytology results on the first and second FNA tests had a benign final pathology. Some authors reported low rates of malignancy with non-diagnostic cytology, including Ycaza et al, who reported the prevalence of malignancy among 495 nodules with non-diagnostic cytology results to be 3%.11 Graciano et al, however, reported that 17.5% of patients with non-diagnostic cytology results had a malignant final pathology report.10 Other authors have reported higher malignancy rates (51-70.6%).12,13 Here, the rate of malignancy with an initial non-diagnostic cytology result (20.3%) did not increase with repeat non-diagnostic cytology results (14.3%), but remained much higher than the rate reported by the 2017 Bethesda system (5-10%).3 However, because repeating FNA for a non-diagnostic cytology result could yield a second non-diagnostic result and because the ultimate rate of malignancy could be high, other factors that could help reach the diagnosis should be sought in addition to repeating the FNA.
Benign cytology
In this analysis, repeating FNA did not decrease the rate at which a benign cytology report was obtained, which remained between 46.8% and 50%. Since agreement between the first and second FNA results was high in cases with an initial benign result (59.1%), repeating FNA may not influence the clinical decision in the majority of such cases. Fernandes et al similarly found high agreement (85.8%) between second FNA results and an initial benign cytology report.8 Glynn et al also found that only 7.1% of nodules with benign FNA were classified differently.14 When further work-up for an initially benign nodule is justified, such as for nodule growth or suspicious ultrasonographic features, risk stratification and the overall clinical picture may be more beneficial than a repeat FNA.
Twelve (14.8%) patients included in this analysis had benign cytology reports for both the first and second FNA, yet their final pathology report revealed a malignancy, and 11 (13.6%) patients had a microcarcinoma. This is essentially the same as the rate of malignancy when the FNA was performed only once (14.7%), which supports the lack of an advantage of repetition. This rate remains much higher than that reported by the 2017 Bethesda system (0-3%).3 Clinicians should be alert to the possibility of a malignancy even when multiple benign FNA results are obtained; such results are, therefore, non-confirmatory. Singh et al reported that the majority (85.3%) of patients with an initial benign cytology report also had benign results on repeat FNA; however, the prevalence of thyroid malignancy in their sample ranged from 1.2% to 4.1%.15 The risk of malignancy with benign cytology was much higher in our analysis and did not seem to change whether benign cytology results were for the first, second, or third FNA, nor did it decrease for patients with two benign cytology results.
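The comparison with the 2017 Bethesda ranges can be checked against the first-FNA rows of Table 1. The sketch below is illustrative only: it uses the ranges quoted in the Introduction (which apply when NIFTP is considered a carcinoma) and, as in the text, excludes incidental microcarcinomas from the malignant count unless `include_micro` is set. The names `BETHESDA_2017`, `FIRST_FNA`, `malignancy_rate`, and `position` are assumptions introduced here.

```python
# 2017 Bethesda risk-of-malignancy ranges (%), as quoted in the Introduction.
BETHESDA_2017 = {"I": (5, 10), "II": (0, 3), "III": (10, 30),
                 "IV": (25, 40), "V": (50, 75), "VI": (97, 99)}

# First-FNA rows of Table 1 as (benign, malignant, microcarcinoma).
FIRST_FNA = {"I": (35, 12, 12), "II": (376, 78, 77), "III": (142, 68, 48),
             "IV": (37, 38, 9), "V": (6, 40, 16), "VI": (4, 116, 20)}


def malignancy_rate(category, include_micro=False):
    """Percentage of nodules in a first-FNA category with malignant final pathology."""
    benign, malignant, micro = FIRST_FNA[category]
    events = malignant + (micro if include_micro else 0)
    return 100 * events / (benign + malignant + micro)


def position(category, include_micro=False):
    """Where the observed rate falls relative to the quoted Bethesda range."""
    low, high = BETHESDA_2017[category]
    rate = malignancy_rate(category, include_micro)
    return "below" if rate < low else "above" if rate > high else "within"


for category in ("I", "II"):
    print(f"Category {category}: {malignancy_rate(category):.1f}% "
          f"({position(category)} the quoted range)")
```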
AUS/FLUS cytology
The rate at which an AUS/FLUS cytology report was obtained here did not change significantly when FNA was repeated, remaining between 22.8% and 31.8%. Repeating FNA for AUS/FLUS results is debated in the literature. In this population, seven patients had AUS/FLUS cytology on the first and second FNA, and four of those had a benign final pathology report. Two others had incidental microcarcinomas, leaving a single patient with a malignant final pathology report. However, 26.4% of patients who had this cytology on their first FNA ended up having a malignant final pathology report (excluding those with incidental microcarcinomas). Studies have reported varying rates of malignancy with AUS/FLUS cytology reports, ranging from 2% to 48%.3,17-24 The 2017 Bethesda system reported the risk of malignancy to be around 10-30% for AUS/FLUS reports.3
The rate of malignancy for nodules with two AUS/FLUS cytology results was only 14.3% in this study. This rate is much lower than that described by Park et al, who reported that nodules with two AUS/FLUS results were highly likely to be malignant (31%) and even likelier to be a follicular variant of papillary thyroid carcinoma.25 Yoo et al reported even more striking results: patients with AUS/FLUS on the initial FNA had a malignancy rate of 59.5%, and those with a repeat AUS/FLUS result had a malignancy rate of 73.1% (P<.05).26
The rate of reclassification with repeat FNA for an AUS/FLUS cytology report was 75%, which is in agreement with the findings of Brandler et al who found that repeating FNA for cases of AUS resulted in reclassification in 67.6% of cases.24 Jooya et al found that an average of 76% of patients did not change TBSRTC categories upon repeating ultrasound-guided FNA. Another 7.4% were downgraded to a benign category; however, approximately 20% were reclassified as either malignant or suspicious for malignancy.27 In this study, 50% of the cases of AUS/FLUS were recategorized as benign. The rate of reclassification, however, should not alter clinical risk stratification, since it does not mean that the second FNA result is more accurate. Patients with any AUS/FLUS cytology result have the same risk of malignancy, even if their second FNA yields a benign cytology. This relationship was demonstrated in this analysis, where the rate of malignancy with an AUS/FLUS cytology report on the first FNA was 26.4%, and patients with an initial AUS/FLUS cytology report and a benign second cytology report had a 25% rate of malignancy. Due to discrepancies in the reported malignancy rates for patients with AUS/FLUS cytology reports and the fact that the risk persists despite a benign second cytology result, repeating FNA does not seem to have real merit in these cases.
Suspicious and malignant cytology
Even though patients with TBSRTC category V/VI FNA results are advised to undergo surgery, six of our patients elected to have a repeat FNA. Two were downgraded to a benign and an AUS/FLUS category on repeat FNA; however, their final pathology was confirmed to be malignant. This result provides further support for clinicians to remain suspicious of FNA results and to take the worst cytology result into consideration, since the risk of malignancy for patients who underwent repeat FNA was as high as the risk for the worst cytology category obtained. Therefore, having a benign result on the second FNA does not justify treating the patient on that basis alone. Additionally, physicians should not repeat FNA for the purposes of confirmation, since the level of agreement between the three FNA sets in this study was persistently low. Repeating thyroid FNA may seem less invasive than other options to patients, but since the possibility of a benign result remains as high as 54.5% even after a third FNA, other methods should be sought.
This study was not without limitations, the first being the retrospective nature of the design, which posed some difficulty in obtaining more data from every included subject. Also, the fact that only 19% of the patients who were included had a repeat cytology presented a disadvantage in the analysis. More representative results could have been obtained with a prospective design. More studies are needed to determine the exact number and methodology for repeating FNA and when it is needed.
In conclusion, interpretation of FNA results is debated in the literature, specifically for TBSRTC categories I and III, where repeat FNA cytology is a common practice. In this study, the malignancy rate for FNA cytology categories I and II was much higher than the risk reported by the TBSRTC. Additionally, repeating FNA biopsies seemed to yield different results every time; hence, there is a low level of agreement between repeated FNA results. The implied risk changes every time FNA is repeated, but the patient needs to be informed that the risk of malignancy may still be as high as that for the worst cytology report. Additionally, a lower TBSRTC category obtained from repeat FNA can result in more confusion, and the initial results should not be ignored. Clinical decisions should be based not only on FNA results but also on other important factors, including the patient's risk factors and high-risk ultrasound features. Physicians should be aware that repeat FNA may not alter clinical suspicion or confirm previous results.
ARTICLE REFERENCES:
1. "The surgery and repeat aspiration outcomes of the atypia of undetermined significance/follicular lesion of undetermined significance category in the Bethesda System for Reporting Thyroid Cytopathology." Asian J Surg. 2019; 42(1):144–7.
2. "The Bethesda system for reporting thyroid cytopathology, Conference NCITFSotS." Am J Clin Pathol. 2009; 132(5):658–65.
3. "The 2017 Bethesda System for Reporting Thyroid Cytopathology." Thyroid. 2017; 27(11):1341–6.
4. "A study of malignancy rates in different diagnostic categories of the Bethesda system for reporting thyroid cytopathology: An institutional experience." Biomed J. 2015; 38(6):517–22.
5. "The Bethesda System for Reporting Thyroid Cytopathology: A meta-analysis." Acta Cytol. 2012; 56(4):333–9.
6. "Value of re-aspirations in benign nodular thyroid disease." Thyroid. 1998; 8(12):1087–90.
7. "Is repeated fine-needle aspiration cytology useful in the management of patients with benign nodular thyroid disease?" Thyroid. 2000; 10(6):489–92.
8. "[Fine-needle aspiration of thyroid nodules: Is it worth repeating?]" Acta Med Port. 2017; 30(6):472–8.
9. "Value of repeating a nondiagnostic thyroid fine-needle aspiration biopsy." Endocr Pract. 2007; 13(7):735–42.
10. "[Repeated fine-needle aspiration cytology for the diagnosis and follow-up of thyroid nodules]" Braz J Otorhinolaryngol. 2014; 80(5):422–7.
11. "Risk of malignancy in thyroid nodules with non-diagnostic fine-needle aspiration: A retrospective cohort study." Thyroid. 2016; 26(11):1598–604.
12. "Thyroid nodules with repeat nondiagnostic cytologic results: The role of clinical and ultrasonographic findings." World J Surg. 2015; 39(7):1721–7.
13. "Role of repeat fine-needle aspiration biopsy (FNAB) in the management of thyroid nodules." Diagn Cytopathol. 2003; 29(4):203–6.
14. "Utility of repeat cytological assessment of thyroid nodules initially classified as benign: Clinical insights from multidisciplinary care in an Irish tertiary referral centre." BMC Endocr Disord. 2016; 16(1):45.
15. "The value of repeat thyroid fine-needle aspiration biopsy in patients with a previously benign result: How often does it alter management?" Thyroid. 2015; 25(10):1121–6.
16. "Value of repeated fine needle aspiration cytology in patients with nodular goiter." Acta Cytol. 2007; 51(6):850–2.
17. "Young investigator challenge: Atypia of undetermined significance in thyroid FNA: Standardized terminology without standardized management–A closer look at repeat FNA and quality measures." Cancer Cytopathol. 2016; 124(1):37–43.
18. "Utilization and impact of repeat biopsy for follicular lesion/atypia of undetermined significance." World J Surg. 2014; 38(3):628–33.
19. "Should atypia of undetermined significance be subclassified to better estimate risk of thyroid cancer?" Am J Surg. 2014; 207(3):331–6; discussion 5-6.
20. "Thyroid fine-needle aspiration reporting rates and outcomes before and after Bethesda implementation within a combined academic and community hospital system." Arch Pathol Lab Med. 2013; 137(11):1664–8.
21. "The Bethesda system for reporting thyroid cytopathology: An experience of 1,382 cases in a community practice setting with the implication for risk of neoplasm and risk of malignancy." Diagn Cytopathol. 2012; 40(5):399–403.
22. "The atypical thyroid fine-needle aspiration: Past, present, and future." Cancer Cytopathol. 2012; 120(2):73–86.
23. "Malignancy rate in thyroid nodules classified as Bethesda category III (AUS/FLUS)." Thyroid. 2014; 24(5):832–9.
24. "Clinical outcome for atypia of undetermined significance in thyroid fine-needle aspirations: Should repeated FNA be the preferred initial approach?" Am J Clin Pathol. 2011; 135(5):770–5.
25. "Malignancy risk and characteristics of thyroid nodules with two consecutive results of atypia of undetermined significance or follicular lesion of undetermined significance on cytology." Eur Radiol. 2015; 25(9):2601–7.
26. "Repeat diagnoses of Bethesda Category III thyroid nodules: What to do next?" PLOS ONE. 2015; 10(6):e0130138.
27. "The role of repeat fine needle aspiration in the management of indeterminate thyroid nodules." J Otolaryngol Head Neck Surg. 2016; 45(1):51.
