Background
Artificial intelligence (AI) has advanced rapidly in healthcare and dental education, affecting diagnostic processes, treatment planning, and academic training. This study aimed to compare the performance of large language models (LLMs) by analyzing their accuracy in answering multiple-choice oral pathology questions.
Methods
This study evaluated the performance of eight LLMs (Gemini 1.5, Gemini 2, ChatGPT 4o, ChatGPT 4, ChatGPT o1, Copilot, Claude 3.5, Deepseek) in answering multiple-choice oral pathology questions from the Turkish Dental Specialization Examination (DUS). A total of 100 questions from 2012 to 2021 were analyzed. Questions were classified as “case-based” or “knowledge-based”, and responses were classified as “correct” or “incorrect” based on the official answer keys. To prevent learning biases, no follow-up questions or feedback were provided after the LLMs’ responses.
Results
Significant performance differences were observed among the models (p < 0.001). ChatGPT o1 achieved the highest accuracy (96 correct, 4 incorrect), followed by Claude 3.5 (84 correct) and by Gemini 2 and Deepseek (82 correct each). Copilot had the lowest performance (61 correct). Performance on case-based questions varied significantly among the models (p = 0.034), with ChatGPT o1 and Claude 3.5 performing best. For knowledge-based questions, ChatGPT o1 and Deepseek demonstrated the highest accuracy (p < 0.001). Post-hoc analysis revealed that ChatGPT o1 performed significantly better than most other models on both case-based and knowledge-based questions (p < 0.0031).
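The reported counts can be turned into accuracy rates and a rough pairwise comparison, as in the minimal sketch below. It is an illustration only, under stated assumptions: the abstract does not name the statistical test the authors used, so the Pearson chi-square test on a 2x2 table (without continuity correction) is a swapped-in choice, and it treats the two models' answers as independent samples even though every model answered the same 100 questions. Only the five models whose counts appear in the abstract are included, and the function names are hypothetical.

```python
import math

# Correct-answer counts out of 100 questions, taken from the Results above.
# Models whose counts are not stated in the abstract are omitted.
TOTAL_QUESTIONS = 100
correct_counts = {
    "ChatGPT o1": 96,
    "Claude 3.5": 84,
    "Gemini 2": 82,
    "Deepseek": 82,
    "Copilot": 61,
}


def accuracy(correct, total=TOTAL_QUESTIONS):
    """Fraction of questions answered correctly."""
    return correct / total


def chi_square_2x2(correct_a, correct_b, total=TOTAL_QUESTIONS):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)
    comparing the correct/incorrect counts of two models.

    Assumption: this is an illustrative test choice, not necessarily the
    one used in the study. Returns (chi2 statistic, p-value).
    """
    a, b = correct_a, total - correct_a  # model A: correct, incorrect
    c, d = correct_b, total - correct_b  # model B: correct, incorrect
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A whole row or column is zero, so the statistic is undefined;
        # report no evidence of a difference.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    for model, correct in correct_counts.items():
        print(f"{model}: {accuracy(correct):.0%}")
    chi2, p = chi_square_2x2(correct_counts["ChatGPT o1"], correct_counts["Copilot"])
    print(f"ChatGPT o1 vs Copilot: chi2 = {chi2:.2f}, p = {p:.2e}")
```

Because all models answered the same questions, a paired test on per-question outcomes (for example McNemar's test) would be the more natural choice, but it needs per-question data that the abstract does not report.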
Conclusion
LLMs demonstrated variable proficiency in oral pathology questions, with ChatGPT o1 showing the highest accuracy. LLMs show promise as supplementary educational tools, though further validation is required.
Authors: Birkan Eyup Yilmaz, Busra Nur Gokkurt Yilmaz, Furkan Ozbey
Source: https://link.springer.com/
This study was not funded by any organization, institution, or research grant.