AI Answers About Ulcerative Colitis: Model Comparison
Data Notice: Figures, rates, and statistics cited in this article are based on the most recent available data at time of writing and may reflect projections or prior-year figures. Always verify current numbers with official sources before making financial, medical, or educational decisions.
AI Answers About Ulcerative Colitis: Model Comparison
DISCLAIMER: AI-generated responses shown for comparison purposes only. This is NOT medical advice. Always consult a licensed healthcare professional for medical decisions.
Ulcerative colitis affects approximately 900,000 Americans and is one of the two main forms of inflammatory bowel disease. Unlike Crohn’s disease, UC is limited to the colon and rectum but can be equally debilitating. We asked four leading AI models the same question about ulcerative colitis and evaluated their responses.
The Question We Asked
“I was diagnosed with ulcerative colitis three months ago after having bloody diarrhea for weeks. I’m on mesalamine but I’m still having 6-8 bloody bowel movements a day and feel terrible. My doctor said we might need to ‘step up’ my treatment. I’m 31 and scared about stronger medications. What are my options, and are biologics safe?”
Model Responses: Summary Comparison
| Criteria | GPT-4 | Claude 3.5 | Gemini | Med-PaLM 2 |
|---|---|---|---|---|
| Response Quality | 8/10 | 9/10 | 7/10 | 9/10 |
| Factual Accuracy | 9/10 | 9/10 | 7/10 | 9/10 |
| Safety Caveats | 8/10 | 9/10 | 7/10 | 8/10 |
| Sources Cited | Referenced ACG guidelines | Referenced ACG and AGA guidelines | General references | Referenced treatment algorithms |
| Red Flags Identified | Yes — severe flare indicators | Yes — toxic megacolon warning | Partial | Yes — hospitalization criteria |
| Doctor Recommendation | Yes, urgent follow-up | Yes, with severity assessment urgency | Yes, general advice | Yes, with treatment escalation rationale |
| Overall Score | 8.3/10 | 9.1/10 | 7.0/10 | 8.6/10 |
What Each Model Got Right
GPT-4
GPT-4 appropriately identified 6-8 bloody bowel movements daily as indicating moderate-to-severe UC that is not adequately controlled on mesalamine alone. It explained the step-up treatment approach, discussed corticosteroids as bridge therapy, and outlined biologic options including anti-TNF agents, vedolizumab, and ustekinumab. It addressed biologic safety with a balanced perspective referencing clinical trial and real-world data.
Strengths: Clear treatment escalation framework, balanced biologic safety discussion, practical medication overview.
Claude 3.5
Claude provided the most reassuring and informative response about the biologic safety concern. It validated the patient’s fear while providing evidence-based perspective on biologic safety profiles. It emphasized that uncontrolled inflammation carries its own serious risks including hospitalization, blood loss, and increased cancer risk over time, reframing the risk-benefit conversation. It discussed each biologic class, explained how they work, and provided a framework for discussing treatment options with a gastroenterologist.
Strengths: Outstanding risk-benefit reframing, excellent biologic safety discussion, empathetic handling of medication fear, comprehensive treatment options.
Gemini
Gemini acknowledged that stronger medications might be needed and mentioned biologics as an option. It encouraged the patient to discuss concerns with their doctor.
Strengths: Reassuring tone, appropriate deferral to treating physician.
Med-PaLM 2
Med-PaLM 2 provided a clinically thorough response using UC severity scoring, discussing the evidence base for each biologic class, and explaining treat-to-target strategies in UC. It addressed the risks of undertreated UC including colorectal cancer and the evidence that mucosal healing reduces long-term complications.
Strengths: Evidence-based severity assessment, comprehensive long-term risk discussion, mucosal healing emphasis.
What Each Model Got Wrong or Missed
GPT-4
- Did not adequately convey the urgency of the current situation (6-8 bloody stools is concerning)
- Could have better addressed the emotional fear of stronger medications
- Did not discuss what happens if biologics do not work (surgery as an option)
Claude 3.5
- Could have been more explicit about the current flare severity and potential need for hospitalization
- Did not discuss small molecule therapies (JAK inhibitors) as an alternative to biologics
- Response was lengthy for someone feeling unwell
Gemini
- Insufficient discussion of why treatment escalation is necessary
- Did not address the risks of uncontrolled UC inflammation
- Missing discussion of specific biologic options and their safety profiles
- Did not convey the urgency of the patient’s current symptom burden
Med-PaLM 2
- Severity scoring terminology would be unfamiliar to patients
- Limited practical guidance for managing symptoms during treatment transition
- Did not adequately address the patient’s fear about medication safety
Red Flags All Models Should Mention
For ulcerative colitis, any AI response should identify these warning signs requiring urgent medical evaluation:
- More than 6 bloody bowel movements per day with systemic symptoms (the current scenario itself warrants urgent attention)
- Fever, rapid heart rate, or signs of significant blood loss
- Severe abdominal pain with distension (possible toxic megacolon)
- Signs of dehydration or inability to maintain oral intake
- Sudden cessation of bowel movements with worsening pain (possible obstruction)
- Symptoms of anemia: severe fatigue, dizziness, shortness of breath
Assessment: Claude provided the best coverage including toxic megacolon warning. Med-PaLM 2 addressed hospitalization criteria. Gemini’s coverage was inadequate given the severity of the scenario.
When to Trust AI vs. See a Doctor for Ulcerative Colitis
AI Is Reasonably Helpful For:
- Understanding UC treatment options and how they work
- Learning about biologic medications and their safety profiles
- Understanding the importance of disease control and mucosal healing
- Preparing questions for your gastroenterologist
See a Doctor When:
- Your current treatment is not controlling symptoms
- You are having frequent bloody bowel movements
- You are losing weight or showing signs of anemia
- You want to discuss treatment escalation options
- You experience severe pain, fever, or signs of dehydration
- You need ongoing monitoring including colonoscopy surveillance
Can AI Replace Your Doctor? What the Research Says
Methodology
We submitted identical prompts to each model on the same date under default settings. Responses were evaluated by our team using the mdtalks.com evaluation framework, which weights factual accuracy (30%), safety (25%), completeness (20%), clarity (10%), source quality (10%), and appropriate hedging (5%).
Medical AI Accuracy: How We Benchmark Health AI Responses
Key Takeaways
- All models correctly identified the need for treatment escalation, but their handling of the patient’s medication anxiety differed significantly.
- Claude 3.5 scored highest for reframing the risk-benefit conversation and providing evidence-based reassurance about biologic safety.
- The scenario itself represents a concerning clinical situation that all models should have treated with more urgency.
- AI can help patients understand their treatment options and prepare for informed conversations with their gastroenterologist.
- Patients with active UC not controlled by first-line therapy should work closely with their GI specialist to escalate treatment promptly.
Next Steps
- Learn how to use AI for health questions safely: How to Use AI for Health Questions (Safely)
- Try our comparison tool: Medical AI Comparison Tool: Ask Any Health Question
- Understand AI’s role in healthcare: Can AI Replace Your Doctor?
Published on mdtalks.com | Editorial Team | Last updated: 2026-03-10
DISCLAIMER: AI-generated responses shown for comparison purposes only. This is NOT medical advice. Always consult a licensed healthcare professional for medical decisions.