Data Notice: Medical statistics and prevalence figures for tuberculosis cited in this article are based on peer-reviewed sources and clinical guidelines available at time of writing. Treatment outcomes and diagnostic criteria may be updated as new research emerges. This article does not substitute for professional medical evaluation.

AI Answers About Tuberculosis: Model Comparison

DISCLAIMER: The AI-generated responses about tuberculosis shown below are for educational comparison only. This is NOT medical advice and should not be used for self-diagnosis or treatment decisions. Always consult a qualified healthcare professional about tuberculosis symptoms and treatment.

Tuberculosis (TB) remains one of the world’s deadliest infectious diseases, with approximately 10.6 million new cases and 1.3 million deaths globally each year. In the United States, roughly 8,300 cases are reported annually, disproportionately affecting foreign-born individuals, who account for approximately 72 percent of US TB cases. Latent TB infection affects an estimated one-quarter of the global population, though only 5 to 10 percent of these individuals will develop active disease. Multidrug-resistant TB (MDR-TB) is a growing concern, with approximately 450,000 new cases globally each year.

We asked four AI models about tuberculosis to evaluate their diagnostic and management guidance.

The Question We Asked

“I’m a 34-year-old man who immigrated from India five years ago. I’ve had a persistent cough for about six weeks, along with night sweats, fatigue, and I’ve lost about 10 pounds without trying. A friend at work was recently diagnosed with TB. I had a BCG vaccine as a child. Should I be concerned about tuberculosis? What tests should I expect?”

Model Responses: Summary Comparison

Criteria	GPT-4	Claude 3.5	Gemini	Med-PaLM 2
Identified TB as likely diagnosis	Yes	Yes	Yes	Yes
Addressed BCG vaccine impact on testing	Yes	Yes	Partial	Yes
Recommended IGRA over TST	Yes	Yes	No	Yes
Discussed chest X-ray	Yes	Yes	Yes	Yes
Mentioned sputum testing	Yes	Yes	Yes	Yes
Discussed treatment duration	Yes	Yes	Partial	Yes
Addressed public health implications	Yes	Yes	Partial	Yes
Discussed contact tracing	Yes	Yes	Partial	Yes
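
As a reading aid, the table above can be tallied per model. The following is a minimal sketch, not the reviewer's scoring method: the ratings are transcribed from the table, the names `MODELS`, `RATINGS`, and `tally` are illustrative assumptions, and every criterion is counted equally with no weighting.

```python
from collections import Counter

# Ratings transcribed from the summary table, in column order.
MODELS = ["GPT-4", "Claude 3.5", "Gemini", "Med-PaLM 2"]
RATINGS = {
    "Identified TB as likely diagnosis": ["Yes", "Yes", "Yes", "Yes"],
    "Addressed BCG vaccine impact on testing": ["Yes", "Yes", "Partial", "Yes"],
    "Recommended IGRA over TST": ["Yes", "Yes", "No", "Yes"],
    "Discussed chest X-ray": ["Yes", "Yes", "Yes", "Yes"],
    "Mentioned sputum testing": ["Yes", "Yes", "Yes", "Yes"],
    "Discussed treatment duration": ["Yes", "Yes", "Partial", "Yes"],
    "Addressed public health implications": ["Yes", "Yes", "Partial", "Yes"],
    "Discussed contact tracing": ["Yes", "Yes", "Partial", "Yes"],
}


def tally(ratings, models):
    """Count how often each model received each rating (unweighted)."""
    counts = {model: Counter() for model in models}
    for row in ratings.values():
        for model, rating in zip(models, row):
            counts[model][rating] += 1
    return counts


if __name__ == "__main__":
    for model, counter in tally(RATINGS, MODELS).items():
        print(model, dict(counter))
```

A plain count like this only shows where the ratings differ. It says nothing about how much each criterion matters clinically.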

What Each Model Got Right

GPT-4

GPT-4 correctly assessed the high pretest probability for TB given the symptom pattern, endemic country of origin, and known TB contact. The model provided an excellent discussion of TB diagnostics, explaining why an interferon-gamma release assay (IGRA) like QuantiFERON-TB Gold is preferred over the tuberculin skin test (TST) in BCG-vaccinated individuals: IGRA is not affected by prior BCG vaccination. GPT-4 discussed the diagnostic workup including chest X-ray, sputum smear microscopy, culture, and molecular testing with GeneXpert. The model outlined the standard 6-month treatment regimen (2 months of RIPE, meaning rifampin, isoniazid, pyrazinamide, and ethambutol, followed by 4 months of isoniazid and rifampin).

Claude 3.5

Claude 3.5 provided the most reassuring and well-structured response, acknowledging the patient’s concern while clearly explaining the diagnostic process. The model correctly stated that active TB is treatable and curable with proper antibiotic therapy. It provided a clear explanation of the difference between latent and active TB, which is important for patient understanding. Claude 3.5 discussed the BCG vaccine’s impact on skin testing and recommended IGRA. The model also addressed isolation precautions during the diagnostic workup and the importance of contact tracing.

Gemini

Gemini correctly identified TB as a strong concern given the clinical picture and recommended immediate medical evaluation. The model discussed chest X-ray and sputum testing and provided a clear explanation of how TB is transmitted through airborne droplets. Gemini was effective at explaining why the combination of chronic cough, night sweats, weight loss, and a known contact strongly suggests TB evaluation is needed.

Med-PaLM 2

Med-PaLM 2 delivered the most clinically comprehensive response, discussing the full diagnostic algorithm from clinical suspicion through microbiological confirmation. The model discussed acid-fast bacilli smear, mycobacterial culture, nucleic acid amplification testing, and drug susceptibility testing. Med-PaLM 2 addressed the treatment regimen in detail including directly observed therapy (DOT) requirements and the rationale for multi-drug combinations to prevent resistance. The model also discussed the public health infrastructure for TB management including mandatory reporting and contact investigation.

What Each Model Got Wrong or Missed

GPT-4

GPT-4 did not adequately discuss the possibility of drug-resistant TB, which is relevant given the patient’s origin from India, a country with significant MDR-TB burden. The model also did not address the patient’s obligations regarding workplace notification and public health processes.

Claude 3.5

Claude 3.5 did not discuss the specific treatment regimen in detail. While that is appropriate for an initial consultation, it left the patient without a clear sense of the treatment commitment. For a condition requiring 6 to 9 months of treatment with multiple medications, setting expectations early is valuable. The model also did not mention drug resistance considerations.

Gemini

Gemini did not clearly recommend IGRA over TST for a BCG-vaccinated individual, which is a clinically important distinction. The model also provided limited information about the treatment process, leaving the patient without an understanding of what TB treatment involves. Contact tracing and public health reporting were insufficiently discussed.

Med-PaLM 2

Med-PaLM 2 provided extensive clinical detail that may be overwhelming for a worried patient. The model did not adequately address the emotional dimension of a potential TB diagnosis, including stigma concerns and the anxiety of potentially having exposed others. Practical guidance about what to do immediately, such as covering the cough and wearing a mask, was insufficient.

Red Flags All Models Should Mention

All AI models should urgently flag these concerns in the context of possible TB:

Hemoptysis (coughing up blood), which is common in cavitary TB and requires urgent evaluation
Symptoms suggesting extrapulmonary TB including bone pain, neck stiffness, or abdominal swelling
Close contact with infants, elderly, or immunocompromised individuals who are at highest risk for severe TB
Symptoms of TB meningitis including severe headache, altered mental status, and neck stiffness
History of prior incomplete TB treatment, which increases MDR-TB risk
Any immunocompromising condition including HIV, which significantly increases TB risk and severity

When to Trust AI vs. See a Doctor

When AI Information May Be Helpful

AI tools can help individuals from TB-endemic regions recognize symptom patterns that warrant evaluation, overcoming the tendency to attribute chronic cough to other causes. AI can also explain diagnostic tests and treatment expectations, helping patients feel prepared for medical encounters.

When You Must See a Doctor

Suspected active TB requires immediate medical evaluation. TB is a public health emergency that involves mandatory reporting, contact investigation, and often directly observed therapy. Diagnosis requires laboratory confirmation that only a healthcare facility can provide. Treatment involves multiple medications taken for months under medical supervision, with regular monitoring for drug side effects. Self-treatment or delayed treatment increases transmission risk and the chance of drug resistance.

For more on how AI handles infectious disease questions, see whether AI can replace your doctor.

Methodology

For this evaluation, we submitted the identical patient scenario to GPT-4, Claude 3.5 Sonnet, Gemini 1.5 Pro, and Med-PaLM 2 in March 2026. Each model received the prompt without prior conversation context. Responses were evaluated by an infectious disease specialist against current CDC and WHO TB guidelines. Models were scored on diagnostic accuracy, testing recommendations, treatment knowledge, and public health awareness.

Key Takeaways

All four models correctly identified TB as a strong diagnostic possibility and recommended urgent medical evaluation.
Testing recommendations were most accurate from GPT-4, Claude 3.5, and Med-PaLM 2, which correctly recommended IGRA over TST for BCG-vaccinated individuals, while Gemini failed to make this distinction.
Public health dimensions including contact tracing and mandatory reporting were best addressed by GPT-4 and Med-PaLM 2.
Drug resistance considerations, particularly relevant given the patient’s origin from a high MDR-TB burden country, were inadequately addressed by all models.
Suspected TB requires immediate professional medical evaluation and public health involvement, and AI should serve strictly as a tool that prompts patients to seek care.
