HD-INC-010

Professional services · Australia · 2025 · Hallucination & fabrication

Deloitte's $440K AUD report for the Australian government cited a federal court quote that did not exist

By Ellie Harris · Filed 2025-07

Alleged: Deloitte Australia and the Department of Employment and Workplace Relations (DEWR) developed or deployed the AI system implicated in this incident. Details are drawn from public reports; parties are presumed innocent of any wrongdoing not established by an official finding.

What happened

In December 2024 the Australian Department of Employment and Workplace Relations engaged Deloitte to conduct an independent assurance review of the Targeted Compliance Framework. The framework is the system that automates penalties against welfare recipients. It is what replaced Robodebt. The contract was worth AUD 440,000 (about USD 290,000) and ran from December 2024 to June 2025.

The deliverable was a 237-page report published on the department’s website in July 2025. It contained a fabricated quote attributed to Federal Court Justice Jennifer Davies, whose surname the AI misspelled “Davis.” It also contained ten citations to a non-existent book attributed to Sydney law professor Lisa Burton Crawford, titled The Rule of Law and Administrative Justice in the Welfare State, a study of Centerlink (the AI’s misspelling of Centrelink). Burton Crawford’s actual book is The Rule of Law and the Australian Constitution. Dr Chris Rudge, the Sydney Law School academic who later reported the errors, catalogued around twenty citation errors of the same shape; the corrected version of the report removed 14 of the 141 sources in the original reference list.

The errors stayed live for several weeks. In late August 2025, Dr Chris Rudge, an academic at Sydney Law School, read the report and noticed that a passage attributed a non-existent book to Lisa Burton Crawford, a Sydney University professor of public and constitutional law, with a title sitting outside her field. Rudge contacted the Australian Financial Review, which broke the story.

Deloitte conceded. The corrected version of the report was republished in October 2025 with a new disclosure on page 58: Deloitte’s technical team had used “the Azure OpenAI GPT-4o based tool chain licensed by DEWR and hosted on DEWR’s Azure tenancy.” The detail mattered. Deloitte had produced the report using the department’s own AI infrastructure. The fabricated quote and the non-existent references were removed. Deloitte agreed to refund the final instalment of the contract fee.

The dollar figure was small for a firm Deloitte’s size. The context made it worse. The Targeted Compliance Framework had been commissioned in the shadow of the Robodebt royal commission, where Australia learned what happens when automated decisions about vulnerable people go unaudited. Deloitte’s job was to assure that the new system was better. The assurance itself was produced by an AI without verification.

What an auditable version would have shown

Every claim in a professional-services deliverable is supposed to be traceable to a source. The traditional workflow has a junior consultant write a section, a senior consultant review it, and an associate director sign it off. Citations get checked along the way because a person owns each step.

When AI drafting enters that workflow without changing the rest of it, the chain breaks at the first step. The junior pastes the model’s output. The senior reads the prose and assumes the junior verified the citations. The associate director skims for tone. Nobody runs the citations against a real database, because the workflow does not require it. The model’s confidence in its own output reads as authority.

An auditable production record would have shown, for each section of the report, who wrote it, whether any AI was involved, which database the citations were checked against, and which named person signed off. With that record, the fabricated quote would have failed verification at the citation-check step, three reviewers before the report reached the department. The department would have received a clean deliverable, or no deliverable at all.
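One way to picture that per-section record is as a small data structure with a readiness check. The sketch below is illustrative only: the class name SectionRecord, its field names, and the section title "Legal framework" are assumptions made for this example, not anything taken from Deloitte’s or the department’s actual workflow.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SectionRecord:
    """Production record for one section of a deliverable (hypothetical schema)."""
    section: str
    author: str
    ai_tool: Optional[str]                    # None if no AI was involved in drafting
    citations_checked_against: Optional[str]  # name of the database used, if any
    signed_off_by: Optional[str]              # named reviewer, once sign-off has happened
    unverified_citations: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return the reasons this section is not ready to ship."""
        found = []
        if self.citations_checked_against is None:
            found.append("citations were never checked against a database")
        if self.unverified_citations:
            found.append(f"{len(self.unverified_citations)} citation(s) failed verification")
        if self.signed_off_by is None:
            found.append("no named person has signed off")
        return found


# Illustrative values: an AI-drafted section that nobody has checked or signed.
record = SectionRecord(
    section="Legal framework",
    author="junior consultant",
    ai_tool="Azure OpenAI GPT-4o",
    citations_checked_against=None,
    signed_off_by=None,
)
for reason in record.problems():
    print("BLOCKED:", reason)
```

The point of the design is that an empty field is itself a finding: a section with no recorded citation check cannot be mistaken for one that passed.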

The corrected version of the report added an AI disclosure on page 58. That is a closing-the-stable-door move. The disclosure should have been made at the start, the citation check should have been a workflow gate, and the named author of each section should have been part of the record.

Where the gap was

The gap was a workflow that assumed all written content came from a human. When the writing stopped coming from a human, the verification steps that followed it stopped working, because they had been built on top of the assumption that the writer had checked their own sources.

This is the most common shape of AI-related professional failure now in the field. The technology is folded into one step of an existing process, the rest of the process is left unchanged, and the controls that were tacit suddenly stop being controls. Deloitte is a global firm with thousands of compliance professionals. It did not catch a fabricated court quote in a 237-page report to a federal department, not because the controls were missing on paper, but because the controls did not anticipate that the prose itself might be fictional.

The Robodebt royal commission, which was the immediate context for this contract, made the same finding about a different system. Automated decisions need new controls, not the old controls applied to new outputs. Deloitte’s report was the inverse of the same mistake: an automated assurance of an automated system, reviewed as if both were still being produced by humans.

What governance should have looked like

Every section of every AI-drafted document gets tagged with which model produced it, and every citation in those sections gets checked against a real database before the document leaves the firm. The check is automatic, the result is signed, and unverified citations either get flagged for human review or block the document from being delivered.

If the AI drafts a sentence citing “Burton Crawford, Constitutional Limits on Automated Welfare Decisions (2022)” and that book does not appear in any index the verifier checks, the section is blocked. The senior reviewer sees the flag. The fabricated reference is removed or replaced with a real source, or the section is rewritten. The department never sees a fabricated citation, because the workflow caught it three signatures before it would have shipped.
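A minimal sketch of such a gate follows, assuming the trusted index can be reduced to a set of author and title pairs. It is not the reference implementation mentioned later in this report: the names Citation, normalise, citation_gate and INDEX are hypothetical, and a real verifier would query a library catalogue or case-law database rather than an in-memory set. The two titles are the ones named earlier in this report, one real and one fabricated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    author: str
    title: str


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def citation_gate(citations, index):
    """Split citations into (verified, blocked) by exact lookup in a trusted index.

    `index` is a set of normalised (author, title) pairs. This stands in for a
    query against a real bibliographic or case-law database (an assumption).
    """
    verified, blocked = [], []
    for c in citations:
        key = (normalise(c.author), normalise(c.title))
        (verified if key in index else blocked).append(c)
    return verified, blocked


# Stand-in index holding the one real title named in this report.
INDEX = {
    (normalise("Burton Crawford"),
     normalise("The Rule of Law and the Australian Constitution")),
}

draft = [
    Citation("Burton Crawford", "The Rule of Law and the Australian Constitution"),
    Citation("Burton Crawford",
             "The Rule of Law and Administrative Justice in the Welfare State"),
]

verified, blocked = citation_gate(draft, INDEX)
deliverable = not blocked  # any unverified citation blocks delivery

for c in blocked:
    print(f"BLOCKED: {c.author}, {c.title}")
```

Exact matching is deliberately strict. A gate that tolerated near matches would risk passing a plausible-sounding title that merely resembles a real one, which is the failure described here.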

The citation gate is one layer. Deloitte had several others available. An AI-use disclosure at the start of every section, not at the back of the corrected version. A separate fact-check pass by a person who did not draft the document, before client delivery. Author accountability at the section level, so a named partner signs off on each chapter rather than the whole report. Tooling that distinguishes “retrieved from a database” from “generated by a model” in the consultant’s writing environment, so the consultant can see which sentences are unverified at the moment of writing. None of these are exotic. They are documented practice in any mature professional-services AI deployment. The cumulative cost of implementing all four is less than the cost of one refunded contract, and far less than the cost of a federal department losing trust in its assurance provider.

The reference implementation of CitationVerifier and ConductRecord is open source. It lives at github.com/saffronandindia/headlights-oss, Apache 2.0 licensed, free for any firm to install. The repository is public now.

Sources

The record

An auditable system would have produced a signed, tamper-evident record the moment this happened: what the system did, the version that did it, the basis it acted on, and the action taken. Deloitte Australia and the Department of Employment and Workplace Relations (DEWR) could then have produced it on demand.

This is the record the system as deployed did not produce in a signed, auditable form.

What this teaches

Capture what happened when it happens

What the system did, the version that did it, the basis it acted on, and the action taken, recorded at the moment, not reconstructed after.

Sign it, so no one has to trust the record-keeper

A tamper-evident entry. Edit it later and the signature breaks. The record does not ask for the benefit of the doubt.

Make it verifiable by anyone

A court, a regulator, a customer's lawyer can check the record themselves, without taking the company, or us, at our word.
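The three lessons above can be sketched with a hash chain, in which each entry commits to the one before it, so editing an old entry breaks every later hash. This is an illustration under stated assumptions, not the design of any particular product: the function names and the example field values are hypothetical. It also shows tamper evidence only. A full signature would need an asymmetric key pair, which Python’s standard library does not provide, and the latest hash would have to be published or signed for outsiders to rely on it.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body: dict, prev_hash: str) -> str:
    """Hash the entry body together with the previous entry's hash."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list, body: dict) -> None:
    """Record an event at the moment it happens, chained to the prior entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"body": body, "prev": prev_hash, "hash": entry_hash(body, prev_hash)})


def verify(log: list) -> bool:
    """Recompute every hash; anyone holding the log can run this check."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["body"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"did": "drafted section", "version": "gpt-4o",
                   "basis": "consultant prompt", "action": "sent to review"})
append_entry(log, {"did": "citation check", "version": "gate-v1",
                   "basis": "catalogue lookup", "action": "blocked"})

ok_before = verify(log)               # untouched log verifies
log[0]["body"]["action"] = "approved"  # someone edits history after the fact
ok_after = verify(log)                # the edit is detected
print(ok_before, ok_after)
```

The four body fields mirror the four things the record is meant to capture: what the system did, the version that did it, the basis it acted on, and the action taken.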

Headlights summarises publicly reported AI incidents. All summaries are independently written, attributed to their original sources, and intended for research and educational purposes. Allegations are identified as such until established through official findings.

Last reviewed June 2026. This report is based on the sources listed above and reflects information available at the time of review; later developments may not be captured. Where a person is described as charged with or alleged to have done something, that allegation is unproven unless a conviction or a court or regulatory finding is stated. Headlights publishes journalism and commentary, not legal advice.
