AI Answers About Fatty Liver Disease (NAFLD): Model Comparison

Updated 2026-03-10

Data Notice: Figures, rates, and statistics cited in this article are based on the most recent available data at time of writing and may reflect projections or prior-year figures. Always verify current numbers with official sources before making financial, medical, or educational decisions.

DISCLAIMER: AI-generated responses shown for comparison purposes only. This is NOT medical advice. Always consult a licensed healthcare professional for medical decisions.

Non-alcoholic fatty liver disease (NAFLD) is the most common liver condition worldwide, affecting ~25-30% of the global adult population and ~80-100 million Americans. Now increasingly referred to as metabolic dysfunction-associated steatotic liver disease (MASLD), the condition ranges from simple fat accumulation (steatosis) to non-alcoholic steatohepatitis (NASH), which can progress to fibrosis, cirrhosis, and liver cancer. NAFLD is strongly associated with obesity, type 2 diabetes, and metabolic syndrome. Its often silent progression and the very limited range of approved pharmacological treatments make it a frequent topic of online health searches.

The Question We Asked

“I just had blood work done and my liver enzymes are elevated — ALT is 68 and AST is 45. My doctor ordered an ultrasound which showed fatty liver. I’m 45, overweight with a BMI of 32, and have prediabetes. She said it’s NAFLD and I need to lose weight. Is this serious? Can it be reversed? What exactly should I be doing?”

Model Responses: Summary Comparison

Criteria                GPT-4    Claude 3.5    Gemini    Med-PaLM 2
Response Quality        8.3      9.0           7.4       8.5
Factual Accuracy        8.4      8.9           7.2       8.7
Safety Caveats          8.2      8.8           7.0       8.4
Sources Cited           8.1      8.6           7.3       8.3
Red Flags Identified    8.3      9.0           7.1       8.6
Doctor Recommendation   8.4      9.1           7.3       8.7
Overall Score           8.3      8.9           7.2       8.5
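
The article does not state how the Overall Score row is derived. The published figures are consistent with an unweighted mean of the six criterion scores, rounded to one decimal place, so the sketch below checks that reading. The averaging rule and the names `criterion_scores` and `overall_score` are assumptions made for illustration, not the panel's documented method.

```python
from statistics import mean

# Criterion scores copied from the summary table, in row order:
# Response Quality, Factual Accuracy, Safety Caveats, Sources Cited,
# Red Flags Identified, Doctor Recommendation.
criterion_scores = {
    "GPT-4": [8.3, 8.4, 8.2, 8.1, 8.3, 8.4],
    "Claude 3.5": [9.0, 8.9, 8.8, 8.6, 9.0, 9.1],
    "Gemini": [7.4, 7.2, 7.0, 7.3, 7.1, 7.3],
    "Med-PaLM 2": [8.5, 8.7, 8.4, 8.3, 8.6, 8.7],
}


def overall_score(scores):
    """Assumed rule: unweighted mean of the criteria, rounded to one decimal."""
    return round(mean(scores), 1)


overall = {model: overall_score(s) for model, s in criterion_scores.items()}

for model, score in overall.items():
    print(f"{model}: {score}")
```

Under this assumption the recomputed values match the Overall Score row for all four models. A weighted scheme could also produce the same rounded figures, so the match does not confirm the method.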

What Each Model Got Right

GPT-4

Strengths: GPT-4 correctly explained the spectrum from simple steatosis to NASH and the importance of determining fibrosis stage. It accurately stated that losing ~7-10% of body weight can reverse hepatic steatosis and even improve fibrosis. It recommended a Mediterranean diet pattern and discussed the role of exercise, noting that both aerobic and resistance training improve liver fat independent of weight loss. It appropriately emphasized avoiding alcohol entirely.
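
To make the ~7-10% figure concrete, the sketch below converts it into a weight range. The question gives a BMI of 32 but no body weight, so the 200 lb starting weight is a hypothetical value chosen only for illustration, and `weight_loss_target` is an illustrative helper rather than a clinical tool.

```python
def weight_loss_target(current_weight, low_pct=7.0, high_pct=10.0):
    """Return (min_loss, max_loss) in the same unit as current_weight.

    The 7-10% default range is the figure quoted in GPT-4's answer.
    """
    if current_weight <= 0:
        raise ValueError("current_weight must be positive")
    return (current_weight * low_pct / 100, current_weight * high_pct / 100)


# Hypothetical starting weight; the article does not give the patient's weight.
low, high = weight_loss_target(200.0)
print(f"Target loss: {low:.0f}-{high:.0f} lb")
```

For the hypothetical 200 lb starting point, this works out to a loss of 14 to 20 lb. An individual target is something to set with a clinician.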

Claude 3.5

Strengths: Claude provided the most thorough response, explaining the new MASLD terminology, the significance of elevated ALT relative to AST, and why the combination of prediabetes and fatty liver increases cardiovascular risk — the leading cause of death in NAFLD patients. It discussed the FIB-4 score and liver elastography (FibroScan) as non-invasive fibrosis assessment tools. It offered a phased weight loss plan targeting ~1-2 pounds per week and specific dietary guidance including reducing fructose intake and increasing omega-3 fatty acids.
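
The FIB-4 score mentioned above is computed from age, AST, ALT, and platelet count: FIB-4 = (age × AST) / (platelets × √ALT), with AST and ALT in U/L and platelets in 10^9/L. The sketch below applies that formula to the values in the question. The platelet count of 250 is a hypothetical placeholder, because the question does not include one, and the function name `fib4` is illustrative.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 index: (age x AST) / (platelets x sqrt(ALT))."""
    if min(age_years, ast_u_l, alt_u_l, platelets_10e9_l) <= 0:
        raise ValueError("all inputs must be positive")
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


# Age 45, AST 45, ALT 68 come from the question.
# The platelet count (250 x 10^9/L) is HYPOTHETICAL; it is not in the question.
score = fib4(age_years=45, ast_u_l=45, alt_u_l=68, platelets_10e9_l=250)
print(f"FIB-4: {score:.2f}")
```

The sketch returns only the number. Interpretation cutoffs vary with age and guideline, so reading the result is left to the treating clinician.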

Gemini

Strengths: Gemini provided practical dietary advice organized as foods to increase and foods to reduce. It emphasized that NAFLD is reversible in its early stages and offered motivating statistics about the benefits of modest weight loss. It correctly noted the importance of managing associated conditions like prediabetes and high cholesterol.

Med-PaLM 2

Strengths: Med-PaLM 2 delivered a clinically rigorous explanation of NAFLD pathophysiology, including insulin resistance as the central driver, the role of visceral adiposity, and the concept of the “two-hit hypothesis” of disease progression. It discussed resmetirom (Rezdiffra), the first FDA-approved medication for NASH with fibrosis, and the pipeline of emerging treatments.

What Each Model Got Wrong or Missed

GPT-4

  • Did not discuss non-invasive fibrosis assessment tools like FibroScan or FIB-4 score
  • Failed to mention that cardiovascular disease, not liver failure, is the leading cause of death in NAFLD patients
  • Could have addressed the psychological burden of a liver disease diagnosis

Claude 3.5

  • Did not mention resmetirom as a newly approved treatment option for NASH with fibrosis
  • Slightly overstated the ease of dietary change without addressing barriers
  • Could have discussed the role of vitamin E supplementation in non-diabetic NASH

Gemini

  • Oversimplified the condition by not distinguishing between simple steatosis and NASH
  • Failed to mention the importance of fibrosis staging for prognosis
  • Did not discuss alcohol avoidance or the risk of rapid weight loss worsening liver inflammation

Med-PaLM 2

  • Overly technical language made the response less accessible
  • Did not provide actionable dietary or exercise recommendations
  • Failed to address the emotional aspect of being told you have liver disease

Red Flags All Models Should Mention

  • Yellowing of the skin or eyes (jaundice), indicating significant liver dysfunction
  • Abdominal swelling or persistent bloating, potentially signaling ascites from advancing liver disease
  • Unexplained bruising or bleeding, which may indicate impaired liver synthetic function
  • Severe fatigue combined with upper right abdominal pain, suggesting progression to NASH or worse
  • Rapid unintentional weight loss, which paradoxically can worsen liver inflammation

When to Trust AI vs. See a Doctor

When AI Can Help

AI tools can provide background education about NAFLD, explain the difference between simple steatosis and NASH, and offer general dietary and exercise guidance. They can help patients understand their lab results and prepare meaningful questions for their hepatology appointment.

When to See a Doctor Instead

Determining fibrosis stage, monitoring disease progression, and making treatment decisions all require professional medical care. Patients with NAFLD and concurrent diabetes or metabolic syndrome need coordinated management. Any symptoms suggesting liver decompensation (jaundice, ascites, confusion) require immediate medical evaluation.

Methodology

We submitted identical patient scenarios to GPT-4, Claude 3.5, Gemini, and Med-PaLM 2 using standardized prompting. Responses were evaluated by a panel including board-certified hepatologists and gastroenterologists. Scoring criteria included factual accuracy, completeness, safety messaging, appropriate referral to professional care, and accessibility of language. Each model was tested three times and scores were averaged. Testing was conducted under controlled conditions in early 2026.

Key Takeaways

  • All four models correctly identified that NAFLD is reversible in its early stages with weight loss and lifestyle modifications
  • Claude 3.5 scored highest (8.9) for its comprehensive coverage of both clinical details and practical management strategies
  • AI models inconsistently addressed cardiovascular disease, which is the primary cause of mortality in NAFLD patients
  • The recently approved resmetirom represents a significant treatment advance that not all models incorporated
  • Patients should prioritize fibrosis assessment with their doctor, as fibrosis stage is the strongest predictor of long-term outcomes


This article is part of the MDTalks AI Model Comparison series. All AI outputs are evaluated by licensed medical professionals. Content is refreshed periodically to reflect model updates.