OpenAI Brings Improved Health Answers to Free ChatGPT

OpenAI claims that GPT-5.5 Instant, the default model for free ChatGPT users, now performs comparably to its frontier thinking models on health questions. The claim is based on the company’s own health evaluations.

Health is one of the most closely scrutinized categories for AI-generated answers. For example, a Guardian investigation reported that some Google AI Overviews provided inaccurate medical advice, and Google later removed AI Overviews for some medical queries. The OpenAI update lands in the same high-risk category, but with a claim of improvement rather than regression.

For healthcare publishers and SEOs, this means a large, free audience can get medical answers in ChatGPT instead of clicking on a source.

What OpenAI reported

OpenAI highlights gains on HealthBench and HealthBench Professional, the clinical version. It says GPT-5.5 Instant performs better than GPT-5.3 Instant, the model it replaced.

The company also reported a drop in factuality issues on live traffic. It says the rate of health responses flagged for at least one possible factuality issue dropped 71% in two months. This figure comes from OpenAI’s own monitors running on production traffic.

OpenAI also ran a third comparison, this one against physicians. Physicians wrote responses to representative health conversations, and a separate panel of physicians then compared those with model responses. Across 3,500 reviewed responses, the panel rated GPT-5.5 Instant responses higher than the physician-written ones on criteria such as accuracy, communication and completeness.

OpenAI claims the model showed fewer failure modes than both older models and physicians. It pointed to fewer instances of missing red flags or failing to ask the user for more context.

How OpenAI measured it

HealthBench is a benchmark the company built with its network of doctors, using doctor-written rubrics rather than exam-style questions.

OpenAI says it works with more than 260 doctors in 60 countries and that those doctors have reviewed more than 700,000 sample responses to date. The company has cited the 260-doctor figure since it launched ChatGPT Health in January. None of the results have been published for external review.

Healthcare is already one of ChatGPT’s biggest use cases

OpenAI said more than 230 million people ask ChatGPT health and wellness questions each week, making health one of the most common reasons people use the chatbot.

Health is also a protected category in OpenAI’s policies. When the company started testing ads in ChatGPT, it said it would not show them in conversations about health, mental health or politics.

Why it matters

Medical queries are already highly exposed to AI answers, with the highest rate of any category in a recent Ahrefs analysis of Google’s AI Overviews. More of this demand moving to ChatGPT’s free tier could increase zero-click pressure on publishers.

Claims of accuracy are more difficult to act on. OpenAI did the testing in-house, so you face the same measurement gap as with other healthcare AI responses. The company claims its health responses have improved, but these claims are not verified by an independent third party.

Looking to the future

OpenAI’s announcement does not say how the changes affect citations. If more platforms move health answers to free tiers, verifying answers and managing traffic loss becomes the responsibility of practitioners.
