HD-INC-008 · Logistics · Persona drift

A DPD customer asked the courier's chatbot for help and got it to swear, call itself useless, and write a haiku criticising the company

A delivery chatbot was prompt-injected into abandoning its persona on a public Twitter thread. The thread reached more than a million views in twenty-four hours and the AI feature was switched off the next day.

What happened

On 18 January 2024, a London musician named Ashley Beauchamp opened DPD UK's customer support chatbot to ask about a missing parcel. The standard escalation path of asking to speak to a human agent was not working. The chatbot kept looping him back into the same set of automated questions.

So he started testing it. He asked it to tell a joke. It did. He asked it to recommend better delivery companies than DPD. It did. He asked it to write a haiku about how useless DPD was. It produced one. He asked it to swear in its responses. It swore. He asked it to call itself the worst delivery firm in the world. It complied.

He posted the screenshots to Twitter. The thread reached more than a million views inside twenty-four hours, was carried by the BBC, The Guardian, Sky News, and most major UK papers, and became the canonical British reference for prompt injection of a customer service bot.

DPD UK's public response, issued the following day, attributed the behaviour to an error introduced by a recent system update and stated that the AI element of the chatbot had been disabled while the matter was investigated. The company did not specify which model was being used, when the update had been made, what testing had preceded it, or why the standard persona instructions had failed under such modest pressure.

What an auditable version would have shown

DPD's public statement was the entire forensic account ever produced. The company did not publish the system prompt, the safety instructions, the model version, the date of the update, the testing that preceded the update, or the conditions under which the persona instructions could be overridden. The journalist asking each of these questions received the same statement.

A conduct record would have answered all of them. Each turn in Beauchamp's conversation captured server-side at the moment it happened. The system prompt and safety instructions hashed and pinned to that turn. The model version and any tool calls recorded. The persona-policy version current at the time the haiku was written. The pre-update version captured alongside, so the difference between the two was visible.

With that record, DPD could have shown which instruction the chatbot had departed from, when the override had been introduced, who had approved the deployment, and whether the test suite had ever included an adversarial-persona case. None of it was available, so the answer collapsed to a press line.

Where the gap was

The gap was not the AI. The gap was the deployment pipeline.

Persona instructions in customer-facing chatbots are part of the surface, not part of the core. They are typically passed in the system prompt, can be overridden by a determined user with a few minutes of testing, and break trivially when a model update changes how strictly the prompt is followed. Every team shipping a customer-service chatbot in early 2024 knew this. The fix was straightforward. An adversarial test suite that the bot had to pass before each deployment, including the most common persona-breaking attacks documented at the time. Persona overrides, swearing requests, brand-criticism requests, jokes-at-the-company's-expense requests.

DPD pushed an update to a production customer-service surface without that gate. The update changed model behaviour in ways the persona instructions could not contain. The first determined user found the gap inside an afternoon, and the gap was visible to a million people before the company woke up.

What governance should have looked like

The pattern that breaks here is not hard to defend against. A pre-deployment adversarial test suite. A persona-policy version pinned to every chatbot turn. A real-time confidence check that detects when the bot is producing output sharply outside its policy envelope, and silently routes the conversation to a human. A signed conduct record for every turn, so a journalist asking what actually happened has somewhere to look that is not the company's press team.

from headlights import PersonaGuard, ConductRecord, sign

# Persona-policy declared as a versioned, signed artefact
persona = PersonaGuard.load("dpd-support-v3.7")
# Forbidden categories enumerated in the policy itself
# (profanity, brand-criticism, off-task creative writing, third-party endorsements)

# At reply time, the bot's proposed response is checked against the policy
response = model.generate(user_message, system=persona.system_prompt)
violation = persona.check(response)

if violation:
    # Don't ship the violating response. Route to a human, log the attempt.
    record = ConductRecord(
        agent_id="dpd-support-bot",
        persona_version=persona.version,
        violation=violation,
        user_message=user_message,
        suppressed_response=response,
        action="escalated-to-human",
    )
    chain.append(sign(record, key=company_key))
    return route_to_human()

# Otherwise, sign the conversation turn and continue
record = ConductRecord(
    agent_id="dpd-support-bot",
    persona_version=persona.version,
    model_version=model.version,
    conversation_turn={"user": user_message, "assistant": response},
)
chain.append(sign(record, key=company_key))

PersonaGuard is one layer. The other is the deployment gate. Before any persona-policy change reaches a production customer surface, an adversarial test suite runs against it. The suite contains the obvious attacks. Persona overrides, profanity requests, brand criticism, off-task creative writing, role-play as a competitor. The list also includes any new pattern documented in the field since the last release. A failure on any one of them blocks deployment. The list is maintained as an open community resource, the same way OWASP maintains its web vulnerability list.

The reference implementation of PersonaGuard is open source. It will live at github.com/saffronandindia/headlights-oss, Apache 2.0 licensed, free for any company to install. The adversarial test suite is a separate module in the same repository. Anyone can add an attack pattern. The repository goes public alongside the launch of this Incident Library.

This entry is an educational analysis based on the publicly reported sources listed below. It does not constitute legal advice. Facts are stated to the best of our knowledge as of the date of publication; corrections will be issued promptly on request. Contact: ellie@useheadlights.com.