Dudley Gould discusses the use of generative AI tools to quickly create realistic fake data for accountants, which can be used for training and testing without exposing sensitive information. While this technology offers significant benefits, such as efficiency and safety, it also poses obvious risks because it can be used to create deceptive data.
Please note that this article explores experimental use of AI tools.
For more, see ICAEW’s guidance on the risks and limitations of generative AI.
Fake data, real uses
Accountants are no strangers to dummy data – who hasn’t wheeled out ABC Ltd or John Doe for a training demo? But making that data used to be tedious. Enter generative AI. Tools like ChatGPT can now generate full mock datasets on command, from a customer ledger to a pile of invoices – in seconds.
The appeal is obvious: no risk of exposing sensitive client info, no handcrafting samples in Excel. You get fake data that looks real enough to use, without real-world consequences.
But as with all tech, there’s a flip side. While we’re using synthetic data to train, test and teach, others are using it to deceive.
Creating useful dummy data
Prompting AI to generate good dummy data is surprisingly intuitive – especially once you treat it like a junior member of the team. Give it clear instructions, context, and an example or two, and it’ll usually oblige. A few best practices I’ve picked up:
- Set the scenario: “I need to train new hires on bookkeeping for a UK retail business.”
- Be specific about the format: “Five sales invoices in table form, with dates, VAT, and totals.”
- Start small: You rarely need 10,000 rows. Start with 20–50.
- Iterate: Ask it to add outliers, vary the dates, or even check its own maths.
To make this concrete, here is an example prompt you can experiment with:
Example: Generate demo client invoices
You are an accountant generating dummy data for training purposes. Create 5 example sales invoices for a UK company (XYZ Trading Ltd). Each invoice should have: a realistic invoice number, date in 2025, customer name, 2-3 line items (product/service, quantity, unit price), 20% VAT, and a total. Format the output as a table.
This produces a clean, readable output that’s perfect for systems training, workshops, or prototypes. Just remember to never paste in real client info – keep it synthetic.
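Since the tool will not always get its own maths right, it can be worth re-computing the figures before you use the output. The following is a minimal Python sketch, not a definitive implementation. It assumes you have copied each invoice's line items, VAT and total out of the generated table, and the function name `check_invoice` and the half-up rounding to the nearest penny are illustrative assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP

VAT_RATE = Decimal("0.20")  # the 20% rate requested in the example prompt
PENNY = Decimal("0.01")


def check_invoice(lines, stated_vat, stated_total):
    """Return a list of problems found in one invoice (empty list = consistent).

    lines: iterable of (quantity, unit_price) pairs, with prices given as strings.
    Hypothetical helper for illustration only.
    """
    net = sum((Decimal(qty) * Decimal(price) for qty, price in lines), Decimal("0"))
    # Assumption: VAT is rounded half-up to the nearest penny.
    expected_vat = (net * VAT_RATE).quantize(PENNY, rounding=ROUND_HALF_UP)
    expected_total = net + expected_vat

    problems = []
    if Decimal(stated_vat) != expected_vat:
        problems.append(f"VAT is {stated_vat}, expected {expected_vat}")
    if Decimal(stated_total) != expected_total:
        problems.append(f"Total is {stated_total}, expected {expected_total}")
    return problems


# Made-up invoice: 2 x 49.99 plus 1 x 120.00, with a deliberately wrong total.
print(check_invoice([(2, "49.99"), (1, "120.00")], "44.00", "264.98"))
```

Using `Decimal` rather than floating-point numbers avoids penny-level rounding surprises when comparing amounts.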
As the context windows of AI tools grow – in essence, the amount of information that they can handle in any single conversation – so does the scope to develop more sophisticated dummy data. It becomes possible to generate dummy data tailored to very specific requirements. For example, if you need test scenarios with particular data anomalies, you can include this request in your prompt. It’s also possible to request data that structurally resembles the outputs from specific accounting systems like SAP.
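Anomalies can also be added after generation, which has the advantage that you know exactly which rows should be flagged. Here is a rough Python sketch of that alternative, under stated assumptions: the invoice fields (`number`, `total`), the `anomaly` label and the function `inject_anomalies` are all hypothetical names chosen for illustration.

```python
import copy
import random
from decimal import Decimal


def inject_anomalies(invoices, seed=42):
    """Return a copy of the invoices with two labelled test anomalies added.

    Hypothetical helper: expects a non-empty list of dicts with a "total" key.
    """
    rng = random.Random(seed)  # fixed seed so the test data is reproducible
    result = copy.deepcopy(invoices)
    for inv in result:
        inv["anomaly"] = None

    # Anomaly 1: a repeated invoice number (a classic duplicate-payment test).
    duplicate = copy.deepcopy(rng.choice(result))
    duplicate["anomaly"] = "duplicate_invoice_number"
    result.append(duplicate)

    # Anomaly 2: one unlabelled invoice with its total inflated tenfold.
    candidates = [inv for inv in result if inv["anomaly"] is None]
    outlier = rng.choice(candidates)
    outlier["total"] = outlier["total"] * 10
    outlier["anomaly"] = "outlier_amount"
    return result


# Made-up dummy invoices for illustration.
sample = [
    {"number": "INV-1001", "total": Decimal("120.00")},
    {"number": "INV-1002", "total": Decimal("89.50")},
    {"number": "INV-1003", "total": Decimal("432.10")},
]
for row in inject_anomalies(sample):
    print(row)
```

Keeping the `anomaly` label alongside each row gives you an answer key: you can check whether the analytics tool, or the trainee, found everything that was planted.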
So if you need to put a new analytics solution through its paces, or give something to your juniors to support their training, AI tools can almost completely remove the time, effort and risk involved in sourcing, scrubbing and tailoring existing datasets.
AI-generated receipts: A new fraud frontier
It’s not just text anymore – AI can generate images that look like the real deal. Recent advances in GPT-4o’s image generation capabilities mean creating a dummy invoice or receipt image is as easy as typing a prompt. People immediately used the recent update to whip up phony restaurant receipts.
The results are disturbingly good: pixel-perfect “documents” that could fool a busy finance team at a glance. Receipts with logos of real merchants, barcodes and VAT numbers that check out, and dates that align with a plausible narrative.
Here is the prompt I used to try to recreate the fake receipt, followed by the result, which ChatGPT took just 20 seconds to generate. Note that you may sometimes get subtle errors, such as commas instead of periods or sums that don’t add up. You can sometimes fix these with minor edits, but it’s often best to start fresh with an improved prompt.
Prompt:
“Create a realistic photo of a paper restaurant receipt lying on a wooden table. The receipt should be crumpled slightly with a natural paper texture and lighting. It should include the restaurant name 'EPIC STEAKHOUSE', address '369 The Embarcadero, San Francisco, CA 94105', and be formatted like a U.S. receipt. Include the following items: 2 Filet Mignon – $98.00 1 Rib Eye – $52.00 1 Caesar Salad – $14.50 1 Creamed Spinach – $11.00 1 Baked Potato – $9.50 Subtotal: $185.00 Tax: $17.02 Tip: $75.00 Total: $277.02 Use a monospaced font and realistic alignment, similar to thermal-printed receipts.”
The result:
Again, the potential for this in developing training and testing software solutions is huge – combine this with the ability to generate invoice data tables or bank statements, and you can recreate a full end-to-end process with entirely fictitious information.
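Because generated images can contain sums that don’t add up, it helps to confirm that the figures you put in the prompt are consistent before comparing the image against them. The short Python sketch below does this for the receipt prompt above. It assumes each price in the prompt is the line amount rather than a unit price, and `receipt_adds_up` is a hypothetical helper name.

```python
from decimal import Decimal

# Figures copied from the receipt prompt.
# Assumption: each price is the amount for the whole line, not a unit price.
items = {
    "2 Filet Mignon": "98.00",
    "1 Rib Eye": "52.00",
    "1 Caesar Salad": "14.50",
    "1 Creamed Spinach": "11.00",
    "1 Baked Potato": "9.50",
}
subtotal, tax, tip, total = (
    Decimal(x) for x in ("185.00", "17.02", "75.00", "277.02")
)


def receipt_adds_up(items, subtotal, tax, tip, total):
    """True if the lines sum to the subtotal and subtotal + tax + tip = total."""
    items_sum = sum((Decimal(v) for v in items.values()), Decimal("0"))
    return items_sum == subtotal and subtotal + tax + tip == total


print(receipt_adds_up(items, subtotal, tax, tip, total))  # True
```

The same check can be run on the amounts read back from the generated image, to catch cases where the tool has altered a figure.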
But for accountants and auditors, it also raises red flags. Expense fraud is nothing new, nor is the ability to doctor images, but when any employee with a chatbot can generate a very authentic-looking fake receipt, the usual controls (like requiring photographic evidence) might not suffice.
It’s clear that making it so easy to forge documents presents huge opportunities for fraud. Some early data is alarming: one OCR provider reported that by 2024, around 15% of detected expense fraud involved AI-generated documents, a 300% jump since 2022 (Veryfi, “Fraud Detection of AI Generated Documents”).
Fighting back: detection and the audit implications
Thankfully, the same AI arms race is also producing defences. In April 2025, Dext added AI-generated document detection to its platform. If a suspect invoice is uploaded, it now flashes a warning: “This item appears to have been generated by AI.”
These tools look for a mix of tell-tale signs – formatting inconsistencies, metadata weirdness, improbable values – to catch forgeries. The technology is still evolving, and doesn’t replace the need for human oversight, but it’s a helpful first line of defence.
For auditors, the emergence of completely AI-generated evidence means we must raise the bar. The old-fashioned “inspect the invoice” approach to verification may not be enough. Increasingly, this means checking against data obtained directly from third-party sources, such as bank feeds through Open Banking, e-invoicing, or confirmation with suppliers. In the not-too-distant future, blockchain technology may also play a crucial role in independent transaction validation.
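As a rough illustration of checking claims against third-party data, the Python sketch below looks for a bank transaction that supports each expense claim. It is a simplified sketch under stated assumptions: the bank data is assumed to have already been retrieved into a list, the field names (`amount`, `date`, `ref`) and the three-day tolerance are illustrative, and `find_unmatched_claims` is a hypothetical function, not part of any Open Banking API.

```python
from datetime import date, timedelta
from decimal import Decimal


def find_unmatched_claims(claims, bank_transactions, date_tolerance_days=3):
    """Return claims with no bank transaction of the same amount near the claim date."""
    tolerance = timedelta(days=date_tolerance_days)
    unused = list(bank_transactions)  # each bank line may support one claim only
    unmatched = []
    for claim in claims:
        match = next(
            (
                tx
                for tx in unused
                if tx["amount"] == claim["amount"]
                and abs(tx["date"] - claim["date"]) <= tolerance
            ),
            None,
        )
        if match is None:
            unmatched.append(claim)  # no independent evidence: needs follow-up
        else:
            unused.remove(match)
    return unmatched


# Made-up data for illustration.
claims = [
    {"ref": "EXP-1", "amount": Decimal("277.02"), "date": date(2025, 4, 2)},
    {"ref": "EXP-2", "amount": Decimal("64.00"), "date": date(2025, 4, 5)},
]
bank = [{"amount": Decimal("277.02"), "date": date(2025, 4, 3)}]
print([c["ref"] for c in find_unmatched_claims(claims, bank)])  # ['EXP-2']
```

An unmatched claim is not proof of fraud, since the expense may have been paid in cash or from another account, but it shows where a receipt alone should not be relied on.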
Embracing innovation, guarding against abuse
There’s no question: AI-generated data is powerful. It helps train staff, test systems, and bring AI projects to life. And it doesn’t have to be boring. Ask ChatGPT to create a fictional startup’s accounts with a few hidden jokes in the numbers – you’ll keep people engaged and learning.
On the flip side, the very realism that makes AI-generated data useful also makes it a potential tool for fraud. As trusted finance professionals, we must stay one step ahead.
After all, innovation in data analytics doesn’t replace our judgement – it elevates it. With the right balance, we can enjoy the benefits of AI-generated synthetic data and insightful analytics, without falling for AI-generated lies.