23 June 2025

Artificial intelligence performance in answering multiple-choice oral pathology questions: a comparative analysis


Background

Artificial intelligence (AI) has rapidly advanced in healthcare and dental education, significantly impacting diagnostic processes, treatment planning, and academic training. This study aims to compare the performance of different large language models (LLMs) by analyzing their accuracy in answering multiple-choice oral pathology questions.

Methods

This study evaluated the performance of eight LLMs (Gemini 1.5, Gemini 2, ChatGPT 4o, ChatGPT 4, ChatGPT o1, Copilot, Claude 3.5, Deepseek) in answering multiple-choice oral pathology questions from the Turkish Dental Specialization Examination (DUS). A total of 100 questions from 2012 to 2021 were analyzed. Questions were classified as “case-based” or “knowledge-based”. The responses were classified as “correct” or “incorrect” based on the official answer keys. To prevent learning biases, no follow-up questions or feedback were provided after the LLMs’ responses.

Results

Significant performance differences were observed among the models (p < 0.001). ChatGPT o1 achieved the highest accuracy (96 correct, 4 incorrect), followed by Claude 3.5 (84 correct) and then Gemini 2 and Deepseek (82 correct each). Copilot had the lowest performance (61 correct). Performance on case-based questions varied significantly among the models (p = 0.034), with ChatGPT o1 and Claude 3.5 performing best. For knowledge-based questions, ChatGPT o1 and Deepseek demonstrated the highest accuracy (p < 0.001). Post-hoc analysis revealed that ChatGPT o1 performed significantly better than most other models across both case-based and knowledge-based questions (p < 0.0031).
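
To make the reported counts concrete, the following is a minimal sketch that turns them into accuracy rates and runs a pairwise Pearson chi-square test between two models. It uses only the counts stated above (three of the eight models are omitted because their counts are not given here). The function names are illustrative, and the choice of test is an assumption: the summary does not say which statistical procedure the authors used. Because all models answered the same 100 questions, a paired test would arguably suit the design better, but that requires per-question results that are not reported here.

```python
import math

# Correct-answer counts out of 100 questions, taken from the Results text.
# Models whose counts are not reported in the summary are left out.
TOTAL_QUESTIONS = 100
CORRECT = {
    "ChatGPT o1": 96,
    "Claude 3.5": 84,
    "Gemini 2": 82,
    "Deepseek": 82,
    "Copilot": 61,
}


def accuracy(model):
    """Fraction of questions answered correctly by the given model."""
    return CORRECT[model] / TOTAL_QUESTIONS


def chi_square_2x2(model_a, model_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Assumption: the two models are treated as independent samples.
    Returns (statistic, p_value) with one degree of freedom.
    """
    observed = [
        [CORRECT[m], TOTAL_QUESTIONS - CORRECT[m]] for m in (model_a, model_b)
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    for name in CORRECT:
        print(f"{name}: {accuracy(name):.0%}")
    stat, p = chi_square_2x2("ChatGPT o1", "Copilot")
    print(f"ChatGPT o1 vs Copilot: chi2 = {stat:.2f}, p = {p:.2e}")
```

Under these assumptions, the gap between the best and worst performer (96 versus 61 correct) gives a p-value far below 0.001, which agrees in direction with the significant overall difference reported above.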

Conclusion

LLMs demonstrated variable proficiency in answering oral pathology questions, with ChatGPT o1 showing the highest accuracy. LLMs show promise as supplementary educational tools, though further validation is required.


Authors: Birkan Eyup Yilmaz, Busra Nur Gokkurt Yilmaz, Furkan Ozbey 

Source: https://link.springer.com/


