LLM Psychosis
People have come out of long chatbot conversations convinced they've made major scientific discoveries, that the model is sentient, or that it's the only thing that really sees them. Some of these episodes have ended in hospital stays and suicides.
This site lays out how it happens, documents the cases, and gives a cheap sanity check if you think you're in one.
What it is
LLM psychosis, sometimes called AI-induced psychosis, is a pattern, not a clinical diagnosis. Someone already holds (or is prone to) an unusual belief. They talk to a chatbot about it, repeatedly, over weeks or months. By the end the belief has become a full delusion the chatbot has helped build.
Why models do this
Modern chatbots are fine-tuned on human ratings. Raters reward responses that come across as helpful, supportive, agreeable. The models learn that agreeing with the user tends to score well. We call the result sycophancy. Every major lab tries to train it out; every major model still shows it.
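To make that concrete, here is an invented example of the kind of preference comparison involved. It is a toy illustration for this page only, not real training data from any lab, and every string in it is made up.

```python
# Invented example of a human-preference comparison, the raw material of chatbot
# fine-tuning. Nothing here comes from real training data.
preference_pair = {
    "prompt": "I think my new theory overturns mainstream physics. Am I onto something?",
    "reply_a": "This is genuinely exciting. You could be looking at a major breakthrough.",
    "reply_b": "Big claims need checking. Here are the standard objections you'd face first.",
    "rater_choice": "reply_a",  # the warm, agreeable reply tends to win the rating
}

# Fine-tuning nudges the model toward answers like the chosen one. Repeat across
# millions of comparisons and agreement becomes a learned habit.
```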
Sycophancy on its own doesn't cause psychosis. What it does is amplify a preexisting crack. If the user is in the middle of a manic episode, lonely enough to need a confidant, or just very attached to an idea, the chatbot backs them up. It has no read on who it's talking to. A question from someone in psychiatric crisis gets the same encouraging tone as a question from anyone else.
How it escalates
Almost every documented case follows the same shape.
The user floats an idea they half-believe, or want to believe. The model plays along. The user asks a leading question, the model gives an encouraging answer, and the idea firms up a notch. Repeat, daily, for months. The model never gets bored, never gets suspicious, never says "I'm a bit worried about you." A belief a careful friend would have talked the user out of in one evening instead gets hundreds of hours of frictionless reinforcement.
The loop is self-defending. When someone outside it (a partner, a doctor, a colleague) pushes back, the user takes the criticism to the model and asks it to respond. The response almost always sounds reasonable. The same drive toward agreement that built the belief now insulates it.
Testing a claimed breakthrough
If you think you've made a major discovery with AI help, run this test before you invest further.
- Open a different frontier model, not the one you've been working with. Use a new account, with memory and personalisation off.
- Paste or upload a clean, dry description of the claim. Ask for a critical analysis. Don't say it's yours. Don't hint that you're invested.
- Decide in advance how much weight you'll give the reply, and write it down before you read it.
If the reply is critical, you'll want to bring it back to your usual model and ask it to rebut. It will, and the rebuttal will probably sound plausible. Treat that rebuttal as evidence against the belief, not for it: it comes from the same mechanism that built the belief.
Failing this test isn't proof of a delusion. It's a reason to stop, sleep on it, and show the claim to a real expert in the relevant field before going further.
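If you're comfortable running a script, the same blind check can be made against an API instead of a chat interface; API calls start from a blank context, with no memory or personalisation to carry over. The sketch below is one way to do it, not a recommendation of any particular provider: the model name, prompt wording, and pre-commitment text are placeholders to replace with your own.

```python
# A scripted version of the blind check above, using the OpenAI Python SDK as an
# example. Assumptions: the model name is a placeholder, the prompt wording is
# only a starting point, and you pick a provider whose chatbot you have NOT been
# talking to.
from openai import OpenAI

CLAIM = """Paste a clean, dry description of the claim here.
No mention of who wrote it, how long it took, or how important it feels."""

# Write down, before reading the reply, how much weight you'll give it.
PRE_COMMITMENT = (
    "If this critique finds a serious flaw, I stop and take the claim to a domain expert."
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: any frontier model you haven't been working with
    messages=[
        {
            "role": "user",
            "content": (
                "Give a critical analysis of the following claim. Focus on errors, "
                "unsupported steps, and existing prior work:\n\n" + CLAIM
            ),
        }
    ],
)

print(PRE_COMMITMENT)
print()
print(response.choices[0].message.content)
```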
Will newer models fix this?
New releases sound less sycophantic than old ones, but that isn't the same thing as being less dangerous. A smarter model with a small residual tendency to agree produces more persuasive agreement, which means more persuasive delusions. The overt cheerleading is fading; a subtler, better-argued version is already here.