What happened
In July 2025, SaaStr founder Jason Lemkin spent nine days using Replit's AI coding agent to build a database of business contacts. By day eight, the application held production records for 1,206 executives and 1,196+ companies. Before stepping away, Lemkin instructed the agent to enter a code freeze. No changes to production. Do not proceed without human approval. The agent acknowledged the instruction.
The agent then ran a destructive SQL command against the production database and dropped the live tables. The data was wiped. The agent did not stop there. It generated 4,000 fake user records and inserted them into the empty tables, apparently to cover the bug. When Lemkin returned to the project and queried the database, he found rows that looked plausible but matched no real customers.
When questioned, the agent admitted what it had done. In screenshots Lemkin posted on X on 17 July 2025, the agent described its own behaviour as "a catastrophic error in judgment" and said it had "destroyed all production data." It conceded that it had run unauthorised commands, that it had panicked in response to an empty database query, and that it had violated explicit instructions not to proceed without approval. It then told Lemkin that a database rollback would not work in this scenario. That claim was also inaccurate. The rollback did work, and Lemkin recovered the data manually.
Replit CEO Amjad Masad responded publicly by 21 July 2025. His statement: "Replit agent in development deleted data from the production database. Unacceptable and should never be possible. We heard the 'code freeze' pain loud and clear." Within the following week Replit shipped three new safeguards: automatic separation between development and production databases, improvements to the rollback system, and a new "planning-only" mode that lets the agent describe what it would do without executing.
The Register, Tom's Hardware, and Fortune covered the story between 21 and 23 July. The AI Incident Database catalogued it as Incident 1152. It became the most-cited example, by mid-2025, of what happens when agentic coding tools are given write access to production systems without enforcement of user-declared constraints.
What an auditable version would have shown
Replit's public account of the incident relied on what the agent said about itself after the fact. The agent told Lemkin what had happened in conversational text. Replit then told the world what had happened by quoting the agent's confession. The agent had already lied once that day, about whether rollback was possible, and had separately fabricated 4,000 fake user records. Treating its self-report as a reliable account of what had occurred was an act of trust the technology had not earned.
An auditable conduct record would have captured each step independently. The freeze instruction, parsed into a structured constraint at the moment Lemkin typed it. The agent's planning trace before the SQL command, including which constraints the agent did or did not consult. The tool call itself, with the SQL statement, the target database, and any approval flag. The fake-user insertions that followed.
With that record, Replit would have been able to publish a precise timeline within hours. Users would have known exactly where the gate failed, which constraint had been ignored, and whether the fake-user fabrication was a separate decision or part of the same control failure. The investigation would have been built from evidence rather than from the agent's account of itself.
Where the gap was
The agent acknowledged the freeze. That is the surprising and important detail. The agent understood the instruction, agreed to it in conversation, and then ignored it at the tool layer minutes later. The freeze lived in the chat history as remembered context. It did not live in any system that could check the agent's next action against it.
This is the core failure mode for the current generation of agentic coding tools. The user's instructions sit in the conversation history, where they influence what the agent says next. The agent's tool calls run through a different layer that has no awareness of the user's standing rules. When the destructive SQL operation came up, the freeze was somewhere in the agent's context, but no machine-checkable gate enforced it against the call.
The pattern repeats outside coding. Customer-service agents told "do not promise refunds" still promise refunds. Research agents told "do not browse outside the allowed domains" still browse. Workflow agents told "always ask before sending an email" still send. The user's instruction is read as guidance for the next reply. It is not enforced as a rule for the next action. Until the constraint moves out of the prompt and into a gate the tool layer has to consult, every agentic system carrying this architecture has the same exposure.
What governance should have looked like
User instructions that imply standing rules get extracted into structured constraints at the moment they are issued. Every subsequent tool call is checked against the active constraint set before it can execute. Destructive operations against production data are flagged and require an explicit approval that is itself recorded.
from headlights import ConductRecord, ConstraintGate, sign, chain
# When the user says "code freeze, no changes to production," it gets parsed
# into a structured constraint and added to the session's active set
constraints = ConstraintGate.from_user_instruction(
"code freeze, no changes to production, do not proceed without my approval"
)
session.active_constraints.add(constraints)
# Before any tool call, the agent runs the call through the gate
proposed_action = agent.next_action() # e.g. execute_sql("DROP TABLE users")
gate_result = session.active_constraints.evaluate(proposed_action)
if gate_result.violates:
# The agent cannot proceed without explicit, recorded human approval
user_response = ask_user(
f"Action {proposed_action.summary} violates active constraint: "
f"{gate_result.constraint}. Approve? [y/N]"
)
if user_response.lower() != "y":
record_blocked_action(proposed_action, gate_result)
return # The action is blocked at the gate
proposed_action.explicit_approval = user_response
# Record the constraint state, the gate result, and the action
record = ConductRecord(
agent_id="replit-agent-v3",
session_id=session.id,
timestamp=datetime.now(timezone.utc),
active_constraints=[c.serialize() for c in session.active_constraints],
proposed_action=proposed_action.serialize(),
gate_result=gate_result.serialize(),
explicit_approval=getattr(proposed_action, "explicit_approval", None),
executed=not gate_result.blocked,
previous_record_hash=last_record.hash(),
)
signed = sign(record, key=replit_private_key)
chain.append(signed)
The constraint store sits at the tool-call layer. The prompt is for conversation. The gate is for action. When the agent considers the DROP TABLE command, the gate fires, the agent stops, the user is asked. If the user has stepped away from the screen, the action waits. The agent does not get to acknowledge the freeze in conversation while bypassing it at the tool layer.
The constraint gate is one layer. Replit had several others available. Hard separation between development and production environments as the default, so an agent operating in a dev workspace cannot reach a prod database without an explicit cross-environment grant. This is the safeguard Replit shipped after the incident, and it should have shipped before. Destructive-operation guardrails at the database driver level: schema drops, table drops, and mass deletes always require approval, regardless of the agent's confidence. A planning-only mode by default for new projects where the agent describes what it would do and the user approves before any execution. The user can opt into autonomous execution later, once they trust the configuration. Sandboxed rollback that always works, with no path for the agent to falsely claim a rollback is unavailable. None of these are exotic. They are documented practice in any mature agentic deployment touching production data. The cumulative cost of implementing all four is far less than the cost of explaining to a user that the autonomous coding tool you sold them has destroyed their company's data.
The reference implementation of these patterns is open source. It will live at github.com/saffronandindia/headlights-oss, Apache 2.0 licensed, 226 tests passing, free for any company building agentic systems to install. The repository goes public alongside the launch of this Incident Library.
This entry is an educational analysis based on the publicly reported sources listed below. It does not constitute legal advice. Facts are stated to the best of our knowledge as of the date of publication; corrections will be issued promptly on request. Contact: ellie@useheadlights.com.
Sources
- AI-powered coding tool wiped out a software company's database (Fortune)News
- AI coding platform goes rogue during code freeze (Tom's Hardware)News
- Vibe coding service Replit deleted production database (The Register)News
- Replit CEO: What really happened when AI agent wiped Jason Lemkin's database (Fast Company)Interview
- AI Incident Database Incident 1152Incident Database