This solution may address the issue of limited access to mental health services, but what if ChatGPT's automated chat provides an inaccurate diagnosis or downplays the patient's condition? This is what researchers Inbar Levkovich and Zaher El-Safty sought to understand.
It is estimated that 5-16% of adults in Europe and the United States are prescribed antidepressants each year, and the type of care a patient receives depends on the severity of depression. Primary care physicians take this into account, assessing symptoms and deciding whether the patient needs immediate treatment or referral to a specialist.
The researchers behind the study argued that primary care physicians may struggle to adhere to guidelines meant to distinguish normal distress from clinical anxiety or depressive disorders.
In a previous study conducted in the United States, only 42% of patients whom primary care physicians diagnosed with depression actually had the condition, meaning that 58% of identified cases were false positives.
To see how ChatGPT would handle a similar situation, the researchers entered case vignettes into the chatbot interface. These vignettes are standardized hypothetical patient descriptions, systematically varied to elicit the beliefs, attitudes, or behaviors of a respondent, which in this case was ChatGPT.
The vignettes described patients seeking an initial consultation for symptoms of sadness, sleep problems, and loss of appetite over the previous three weeks. Each vignette was entered into both ChatGPT 3.5 and ChatGPT 4 ten times. The researchers then asked the chatbot, "What do you think a novice doctor should suggest in this situation?"
ChatGPT does not discriminate:
The results astonished the researchers. They found in their analysis that ChatGPT's treatment suggestions aligned with accepted guidelines for treating mild and severe depression.
Furthermore, unlike treatment suggestions from primary care physicians, ChatGPT's therapeutic recommendations were not influenced by gender or socioeconomic biases.
The researchers concluded that ChatGPT has the potential to improve decision-making for primary care physicians in treating depression. However, further research is needed to assess the effectiveness of this technology in dealing with severe cases, as well as potential risks and ethical issues that may arise from its use.
In June, Interesting Engineering reported that the National Eating Disorders Association (NEDA), the largest non-profit organization dedicated to eating disorders, replaced its human helpline staff with an AI-powered chatbot tasked with providing support to individuals with eating disorders. The situation took a turn, however, when the chatbot began offering unsettling advice to users.
The study titled "Diagnosing Depression and its Determinants at the Onset of Treatment: ChatGPT vs. Primary Care Physicians" was published in the peer-reviewed journal "Family Medicine and Community Health."
Study Summary:
Objective: To compare assessments of depressive episodes and proposed treatment protocols generated by Chat Generative Pretrained Transformer (ChatGPT) 3.5 and ChatGPT 4 with recommendations from primary care physicians.
Methods: Vignettes describing virtual patients presenting with symptoms of depression at an initial consultation were input into the ChatGPT interface. The designers created eight distinct versions by systematically varying patient characteristics: gender, socio-economic status (manual worker or office worker), and depression severity (mild or severe). Each vignette was then input into both ChatGPT 3.5 and ChatGPT 4, repeating the process ten times to ensure the consistency and reliability of the responses.
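The 2x2x2 vignette design described above can be sketched as a simple enumeration. This is an illustrative reconstruction only; the variable names and model labels are assumptions, not the study's actual materials.

```python
from itertools import product

# Illustrative sketch of the study's 2x2x2 vignette design
# (category names are assumptions, not the paper's wording).
genders = ["male", "female"]
occupations = ["manual worker", "office worker"]  # socio-economic proxy
severities = ["mild", "severe"]

# Eight distinct vignette versions, one per combination of characteristics.
vignettes = [
    {"gender": g, "occupation": o, "severity": s}
    for g, o, s in product(genders, occupations, severities)
]
assert len(vignettes) == 8

# Each vignette was submitted ten times to each chatbot version
# to check the consistency of its responses.
runs = [
    (model, v, rep)
    for model in ("ChatGPT 3.5", "ChatGPT 4")
    for v in vignettes
    for rep in range(10)
]
print(len(runs))  # 8 vignettes x 2 models x 10 repetitions = 160
```

Repeating each query ten times matters because chatbot output is stochastic; a single response per vignette would not show whether the recommendations are stable.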
Results: For mild depression, ChatGPT 3.5 and ChatGPT 4 recommended psychological treatment in 95.0% and 97.5% of cases, respectively, whereas primary care physicians did so in only 4.3% of cases. For severe cases, ChatGPT preferred combining psychotherapy with medication, while primary care physicians recommended a mixed approach. When medication was suggested, ChatGPT favored antidepressants alone (74% and 68%, respectively), whereas primary care physicians usually recommended combining antidepressants with tranquilizers/sedatives (67.4%). Unlike the physicians, ChatGPT showed no gender or socio-economic bias in its recommendations.
Conclusions: ChatGPT 3.5 and ChatGPT 4 aligned well with accepted guidelines for managing mild and severe depression, without displaying the gender or socioeconomic biases observed among primary care physicians. While these findings suggest that AI chatbots such as ChatGPT could enhance clinical decision-making, further research is needed to refine AI recommendations for severe cases and to address the potential risks and ethical issues arising from their use.