
Can ChatGPT answer patient questions regarding reverse shoulder arthroplasty?

Familiari, Filippo;
2024-01-01

Abstract

Introduction: In recent years, artificial intelligence (AI) has seen substantial progress in its utilization, with the Chat Generative Pre-trained Transformer (ChatGPT) emerging as a popular language model. The purpose of this study was to test the accuracy and reliability of ChatGPT's responses to frequently asked questions (FAQs) pertaining to reverse shoulder arthroplasty (RSA). Methods: The ten most common FAQs were gathered from institutional patient education websites. These ten questions were then input into the chatbot during a single session without additional contextual information. The responses were critically analyzed by two orthopedic surgeons for clarity, accuracy, and the quality of evidence-based information using The Journal of the American Medical Association (JAMA) Benchmark criteria and the DISCERN score. The readability of the responses was analyzed using the Flesch-Kincaid Grade Level. Results: Across the ten questions, the average DISCERN score was 44 (range 38-51); seven responses were classified as fair and three as poor. The JAMA Benchmark criteria score was 0 for all responses. Furthermore, the average Flesch-Kincaid Grade Level was 14.35, which corresponds to a college graduate reading level. Conclusion: Overall, ChatGPT was able to provide fair responses to common patient questions. However, the responses were all written at a college graduate reading level and lacked reliable citations, and this readability greatly limits their utility. Adequate patient education should therefore be provided by orthopedic surgeons. This study underscores the need for patient education resources that are reliable, accessible, and comprehensible. Level of evidence: IV.
AI
Artificial intelligence
Deep learning
Machine learning
Reverse shoulder arthroplasty
Total shoulder arthroplasty
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12317/100570
