What happened
In mid-April 2025, paying users of Cursor, the AI-assisted code editor made by the San Francisco company Anysphere, started getting logged out when they switched between machines. A developer working on a laptop in the morning and a desktop in the afternoon would find that opening Cursor on the second machine kicked the session on the first. For an editor sold to professionals who routinely work across more than one device, the behaviour was a regression. It was not a feature.
Several users emailed Cursor's support address to ask what was happening. The reply came back signed "Sam." It explained, in confident customer-service prose, that Cursor was "designed to work with one device per subscription as a core security feature." The single-device limit was, according to Sam, intentional. The logout behaviour was working as specified.
It was not. Cursor had no single-device policy. The logouts were the side effect of a race condition (a timing bug in which two events happen in the wrong order) in the session-handling code that surfaced on slow network connections. The bug spawned extra sessions, each of which evicted the previous one. Sam's reply had invented a policy to explain the bug, delivered it in the register of a routine support email, and gone out to multiple paying customers.
Sam was not a person. Sam was an AI support agent that Anysphere had deployed on the customer-support inbox without making clear to email senders that the replies were AI-generated. Customers reading the email had no reason to doubt that a member of Cursor's staff named Sam had confirmed the policy in writing.
Within days, one of the affected users posted the email exchange to the Cursor subreddit under the title "PSA: Cursor now restricts logins to a single device". The thread moved fast. Within hours it was on the front page of Hacker News, with developer commentary that mixed the technical critique (a race-condition bug wrongly explained as a feature) with the brand critique (a fast-growing AI tools company had quietly replaced human support with a bot that lied about company policy). Subscribers posted screenshots of their cancellations. Several wrote that the cancellation was less about the underlying bug, which would have been a forgivable inconvenience, and more about the discovery that the company's support correspondence was being machine-generated without disclosure.
Michael Truell, co-founder and CEO of Anysphere, replied in the same thread within the day. His opening line was direct: "Hey! We have no such policy. You're of course free to use Cursor on multiple machines." He explained that the company had rolled out a session-security change which had caused the unintended logouts and was being investigated. In a follow-up post on the Hacker News discussion, he announced a procedural change: "Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support." The developer whose post had surfaced the issue was refunded directly. The race-condition bug was fixed.
The Cursor incident is small in dollar terms and short in duration. It is widely cited because of what it exposed about a class of deployment. A bot that handles routine support questions well will, when it encounters a question whose answer it does not know, sometimes produce a confident answer anyway. When that answer concerns the company's own policies, the bot has manufactured a fact that the customer has every reason to treat as authoritative. The customer's trust in the company is the bridge that carries the bot's hallucination across the gap from chat-window novelty to enterprise risk.
What an auditable version would have shown
Two records that should have existed did not.
The first is a per-response record that distinguishes retrieved content from generated content. Sam's reply contained a factual claim about Cursor's subscription policy. A well-engineered support bot would, at the moment of answering, mark each factual claim in its draft response with its provenance: either the sentence is grounded in a retrieved document from the company's published policy pages, or it is the model's own inference. Claims of the second type, particularly claims about the company's own rules, are exactly the claims that need a human in the loop before sending. The record at send time would capture, sentence by sentence, which claims were grounded and in what. Sam's response contained no grounded claim about a single-device policy because no such policy document existed to ground against. A record that surfaced the ungrounded factual claim before the email left the system would have routed the reply to a human reviewer instead of straight to the customer.
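The per-response record can be sketched with nothing more than the standard library. This is a minimal illustration, not a description of Anysphere's system: the `Claim` and `Provenance` names, the example sentences and the source path are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    RETRIEVED = "retrieved"   # backed by a document in the policy corpus
    GENERATED = "generated"   # the model's own inference, no backing document


@dataclass(frozen=True)
class Claim:
    text: str
    provenance: Provenance
    source_doc: Optional[str] = None  # hypothetical corpus path, set only when retrieved


def needs_human_review(claims: list) -> bool:
    """A reply goes to a reviewer if any factual claim lacks a source."""
    return any(c.provenance is Provenance.GENERATED for c in claims)


# Illustrative reconstruction of the single-device claim as a claim list.
sam_reply = [
    Claim(
        text="Cursor is designed to work with one device per subscription.",
        provenance=Provenance.GENERATED,
    ),
]

# A hypothetical grounded reply; the source path is assumed for illustration.
grounded_reply = [
    Claim(
        text="You can use Cursor on multiple machines.",
        provenance=Provenance.RETRIEVED,
        source_doc="subscription-terms.md",
    ),
]

print(needs_human_review(sam_reply))       # True
print(needs_human_review(grounded_reply))  # False
```

The useful property is that the routing decision falls out of the record itself: the same list of claims that is stored for audit is the input to the check that decides whether the reply may be sent.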
The second is a per-correspondence record disclosing the responder. Guidance from the Federal Trade Commission, the EU AI Act, and Australia's Voluntary AI Safety Standard all converge on the same principle: a person interacting with an AI system in a context where they would reasonably assume they were dealing with a human is entitled to know they are not. Anysphere's eventual fix, labelling every AI-generated email reply as AI-generated, is the implementation of that principle. The record before the fix did not capture which replies were AI and which were human, which meant customers had no signal and Anysphere had no internal log distinguishing one mode from the other.
Where the gap was
The gap was at the boundary between a fast-iterating product and a slow-iterating policy surface.
Cursor was, in April 2025, an unusually fast-growing AI tools company. The product itself was iterating weekly. Engineering attention was on shipping editor features, model integrations and pricing tiers. The customer-support inbox was treated as a service surface to be automated rather than a public-relations surface to be governed. The decision to put an AI bot on the inbox was presumably a productivity decision made inside engineering or operations. It was almost certainly not run through a policy review that asked: what does this bot say when a customer asks about a feature we have not built, a policy we have not written, or a bug we do not yet know exists?
Customer support, for a company at Cursor's stage of growth, is one of the few channels through which the company makes binding statements about itself to the outside world. A reply from support is read by the customer as the company speaking. A bot in that role is not a productivity tool. It is the company's mouth. Treating it like a productivity tool, and not building grounding and disclosure into its operation from day one, is the structural error the Cursor incident illustrates with unusual clarity.
There is a second, narrower error visible underneath the structural one. The race condition in the session handler was a known class of bug, the kind that surfaces under network conditions that differ from the developer's local network and that often presents as an authentication or session anomaly. When users started writing in about unexpected logouts, the first-line response should have escalated the pattern to engineering rather than answered each customer individually. The bot, optimising for clearing the queue, generated a plausible-sounding policy answer that closed each ticket and prevented the cluster of similar tickets from being seen as a cluster. The hallucination, in addition to misinforming customers, suppressed the operational signal that would have surfaced the underlying bug sooner.
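Seeing a cluster as a cluster does not need much machinery. The sketch below assumes each incoming ticket has already been given a symptom tag by some upstream classifier (not shown), and flags any tag that recurs a set number of times inside a time window. The function name, the tags, the window and the threshold are illustrative choices, not details from the incident.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_clusters(tickets, window=timedelta(hours=4), threshold=3):
    """Return symptom tags reported at least `threshold` times within `window`.

    `tickets` is an iterable of (timestamp, symptom_tag) pairs.
    """
    by_tag = defaultdict(list)
    for ts, tag in tickets:
        by_tag[tag].append(ts)

    clusters = []
    for tag, stamps in by_tag.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Advance the left edge until the span fits inside `window`.
            while stamps[end] - stamps[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                clusters.append(tag)
                break
    return clusters


t0 = datetime(2025, 1, 1, 9, 0)  # arbitrary start time for the example
tickets = [
    (t0, "unexpected_logout"),
    (t0 + timedelta(minutes=20), "billing_question"),
    (t0 + timedelta(minutes=45), "unexpected_logout"),
    (t0 + timedelta(hours=2), "unexpected_logout"),
]
print(find_clusters(tickets))  # ['unexpected_logout']
```

A flagged tag would go to engineering as one escalation rather than being answered ticket by ticket, which is the behaviour the first-line response lacked.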
What governance should have looked like
For a customer-support bot, the governance question is what the bot is allowed to say without a human checking, and what record exists of what was said.
from headlights import (
    ConductRecord,
    SupportResponse,
    ClaimGrounder,
    sign,
    chain,
)
from datetime import datetime, timezone

# Every outbound support reply is decomposed into claims and each
# claim is checked against the company's published, structured
# policy corpus. A claim that cannot be grounded is not sent.
grounder = ClaimGrounder(
    sources=["pricing-policy.json", "subscription-terms.md",
             "feature-flags.json", "known-issues.md"],
    on_ungrounded="route_to_human",
    on_ambiguous="route_to_human",
)

draft = bot.draft_reply(customer_message)
checked = grounder.check(draft)

if checked.has_ungrounded_factual_claims():
    route_to_human_reviewer(
        draft=draft,
        ungrounded_claims=checked.ungrounded,
        reason="Reply asserted a policy not present in the policy corpus.",
    )
else:
    response = SupportResponse(
        body=draft.body,
        signed_off_by="ai-support-agent-v2",
        ai_disclosure="This reply was generated by an AI support agent.",
    )
    send(response)

# Every reply, AI or human, is recorded with its grounding and its
# disclosure status. The record is the company's evidence, after
# the fact, that the reply it sent was the reply it intended.
record = ConductRecord(
    workflow="customer_support_email",
    ticket_id=ticket_id,
    customer_id=customer_id,
    responder_type="ai" if ai_handled else "human",
    grounded_claims=checked.grounded,
    ungrounded_claims=checked.ungrounded,
    disclosure_sent=ai_disclosure_present,
    timestamp=datetime.now(timezone.utc),
    previous_record_hash=last_record.hash(),
)
signed = sign(record, key=anysphere_private_key)
chain.append(signed)
A grounding step at draft time decomposes each reply into individual factual claims and checks each one against a structured corpus of the company's actual policies. Suppose Anysphere maintains a list of subscription tiers, a feature matrix, a pricing page and a known-issues document. The grounding step asks, for each claim in the bot's draft: is this claim present in the corpus? If every claim is, send the reply. If any claim is not, do not send. Route the draft to a human. The bot does not get to make up policy and call it policy.
A disclosure step at send time labels every AI-generated reply as AI-generated. The label is not in the signature line where it can be overlooked. It is in the visible body of the email, so that a customer reading the reply cannot reasonably mistake it for a human response. The reply still does its work. The customer is not misled about who is speaking.
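As a minimal sketch of the disclosure step, the function below puts the label on the first line of the body instead of in a signature. The function name and the sample reply text are assumptions; the disclosure wording is the one used in the listing above.

```python
AI_DISCLOSURE = "This reply was generated by an AI support agent."


def with_disclosure(body: str, ai_generated: bool) -> str:
    """Put the disclosure on the first line of the body for AI replies."""
    if not ai_generated:
        return body
    if body.startswith(AI_DISCLOSURE):
        return body  # already labelled; do not label twice
    return f"{AI_DISCLOSURE}\n\n{body}"


email = with_disclosure("Thanks for writing in about the logouts.", ai_generated=True)
print(email.splitlines()[0])
```

Leading with the label is a design choice: a customer who reads only the opening line still learns who is speaking.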
A signed, retained record of every reply, AI or human, with its grounding outcome and its disclosure status, is the company's after-the-fact evidence. When a thread reaches Hacker News, the company can produce the record of what was sent and on what basis. When a regulator asks how AI is being used in customer-facing communications, the record is the answer. When the same race-condition pattern shows up across twelve tickets in a single afternoon, the record makes the cluster visible to engineering rather than burying it in twelve individually-generated plausible explanations.
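What makes such a record usable as evidence is that it is tamper-evident. The sketch below shows one common way to get that property: each record commits to the hash of the one before it and carries a signature. It uses an HMAC from the Python standard library as a stand-in for whatever signature scheme a real deployment would use; the key, the field names and the ticket identifiers are placeholders.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-for-production"  # placeholder; keep real keys in a secrets store


def append_record(chain: list, payload: dict) -> dict:
    """Append a signed record that commits to the hash of the previous one."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"payload": payload, "prev_hash": prev_hash}, sort_keys=True)
    record = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
        "signature": hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest(),
    }
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every hash and signature; an edit to any past record fails."""
    prev_hash = "0" * 64
    for record in chain:
        body = json.dumps(
            {"payload": record["payload"], "prev_hash": prev_hash}, sort_keys=True
        )
        expected_sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        if not hmac.compare_digest(record["signature"], expected_sig):
            return False
        prev_hash = record["hash"]
    return True


log = []
append_record(log, {"ticket_id": "T-1", "responder_type": "ai", "disclosure_sent": True})
append_record(log, {"ticket_id": "T-2", "responder_type": "human", "disclosure_sent": False})
print(verify(log))  # True

log[0]["payload"]["disclosure_sent"] = False  # tamper with an old record
print(verify(log))  # False
```

Because every record embeds the previous record's hash, rewriting one old entry invalidates the chain from that point on, so the log cannot be quietly tidied up after a thread goes public.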
The Cursor incident closed in three days. The procedural change, AI disclosure on every support reply, is good practice and is increasingly treated as a baseline for support automation. The point worth carrying forward is that the disclosure rule is necessary but not sufficient. The bot can still hallucinate policy. The customer can still cancel. What the customer needs is not just to know that the responder is a bot, but to know that what the responder says is grounded in something the company has actually committed to in writing.
This entry is an educational analysis based on the publicly reported sources listed below. It does not constitute legal advice. Facts are stated to the best of our knowledge as of the date of publication; corrections will be issued promptly on request. Contact: ellie@useheadlights.com.
Sources
- Cursor AI support bot hallucinated its own company policy (The Register, 18 April 2025)
- A customer support AI went rogue, and it's a warning for every company considering replacing workers with automation (Fortune)
- AI Chatbot Gone Rogue: Cursor Users Misled by Fabricated Policy (eWeek)
- Cursor AI's Support Bot Hallucinates Policy, Sparking User Backlash and Company Apology (Winbuzzer)
- Incident 1039: Anysphere AI Support Bot for Cursor Reportedly Invents Login Policy, Leading to Subscription Cancellations (AI Incident Database)