ChatGPT-4, OpenAI’s artificial intelligence (AI) chatbot, has demonstrated clinical diagnostic capabilities superior to those of healthcare professionals. That is the conclusion of a study led by Adam Rodman, an internal medicine specialist at Beth Israel Deaconess Medical Center in Boston.
The work, published in the journal JAMA Network, sought to determine whether large language models (LLMs) could improve diagnostic reasoning among family, internal, and emergency medicine specialists. The researchers ran a trial with 50 resident and attending physicians in the United States, with the aim of evaluating ChatGPT-4’s potential for medical assessments compared to traditional support methods. Performance was graded using a standardized, third-party-validated rubric based on diagnostic accuracy, response time, and the relevance of supporting and opposing findings.
The participants were divided into two groups and tasked with evaluating six real, confidential case histories and delivering a diagnosis within 60 minutes. The first group had access to a chatbot powered by ChatGPT-4, while the second could use only conventional diagnostic resources. Doctors with the AI system’s support reached the correct diagnosis in 76% of cases; their peers without technological assistance made an accurate judgment 74% of the time.
Rodman and his colleagues conducted a secondary analysis to measure the AI tool’s independent performance and compare it with the findings from the first phase. The chatbot based on ChatGPT-4 achieved an average accuracy rate of 90%. “The LLM alone demonstrated higher performance than both sets of doctors, indicating the need to work on developing the technology and educating the workforce to take advantage of the benefits of artificial intelligence in practice,” the authors note.
The research indicates that health professionals are increasingly exposed to AI tools that could facilitate and optimize their work, yet few know how to exploit the technology’s benefits. Rodman attributes this trend to a cognitive bias he says is common in the medical professions. In statements reported by The New York Times, he explained that specialists tend to prioritize their own judgment, usually grounded in prior experience, over objective, contradictory evidence.
The researchers caution that “the results of this experiment should not be interpreted as an indication that LLMs should be used for diagnosis autonomously and without physician supervision.” What the study shows, they explain, is that human-computer interaction requires further development before AI’s capabilities can be fully harnessed in clinical decision-making. “AI systems should be doctor extenders that offer valuable second opinions on diagnoses,” Rodman reiterates.