Your automations are quietly keeping data you already deleted
Deleted from the CRM, still in a Make log months later. Where automations keep personal data and how to build a deletion concept for your workflows.
Your automation never forgets, and that is the problem
Most people picture an automation as a pipe. Something goes in on one side, something comes out the other, and in between the logic does its job. Move the order from the shop to the accounting system. Push the new lead from the form to the CRM. Clean, mechanical, forgettable.
What that picture leaves out is that the pipe keeps a recording. Every serious automation platform writes down what passed through it, because otherwise you could never figure out why a workflow broke last night. That recording is the part nobody thinks about, and it is where your deleted customers quietly keep living.
I ran into this with a client last autumn. A former customer had sent an access request under the GDPR, the ordinary kind: tell me what you have stored about me. The company had deleted that customer from the CRM months earlier, properly, with a confirmation. Sales was sure there was nothing left. Then we opened the automations, and in the execution history of one Make scenario the whole record was still sitting there. Name, address, phone number, the last three orders, an internal sales note nobody ever meant to share. Deleted in the CRM, alive in the automation.
This is not a story about a careless company. It is the default state of almost every automation setup I look into. The data you delete from your main systems keeps living in your workflows, in execution logs, in error queues, in logging spreadsheets, in old Slack messages. Nobody planned it. It happens on its own, because automation platforms keep data by default and almost nobody has ever touched the setting.
Why platforms hold onto your data
The retention nobody talks about starts with a reasonable feature. When a record runs through your scenario, the platform does not just note that it ran. It usually stores the full contents: the incoming payload, the intermediate results after each step, the final response from the API. For an order workflow that means the customer name, the shipping address, the payment details, everything that flowed through the pipe. And it stays there, often far longer than you would expect.
The decisive thing is the default. None of these platforms ask you during setup how long they may keep the personal data from this workflow. And because the platform never asks, nobody on your side asks either. The platform simply applies a retention period chosen on technical grounds, optimized for debugging, not for data minimization.
Self-hosted n8n shows this most clearly. Out of the box, n8n saves every execution with full data to the database, the successful ones and the failed ones alike. Whether anything is ever removed depends on the automatic cleanup controlled by `EXECUTIONS_DATA_PRUNE`. Older installations shipped with it switched off, and plenty of long-running instances still run that way. Newer releases enable it by default, so check what your instance actually has set rather than assume. If pruning is off and you do not know the setting exists, after a year of production you have every execution from the last twelve months sitting in your Postgres database, including every piece of personal data that ever ran through. I have seen instances where the executions table was bigger than all the actual business data combined.
Make keeps execution logs for days to weeks depending on the plan, and incomplete executions, the error cases that land in the queue, stay there until someone clears them by hand. That can be months if nobody watches the queue. Zapier ties task history retention to the plan and also measures it in weeks. The exact numbers shift with pricing tiers, so the only honest answer is to open your own account settings and look, rather than trust a figure from a blog post. What holds across all of them is the same: left alone, they keep more and for longer than a data protection concept would allow.
The five places your data quietly piles up
When I audit an automation account for retention, I walk through five places. It is rare that even one of them is clean.
Execution logs and history
The obvious one. Every successful run leaves a full copy of the data it processed. A workflow handling a hundred orders a day produces more than 36,000 execution records a year, each carrying everything that went through the pipe. These logs are useful while a workflow is new and you are chasing bugs. Three months later nobody needs them, and they sit there anyway.
Error queues and incomplete runs
When a workflow stalls because an API does not answer, the run drops into a queue for a later retry. Sensible. The catch is that these runs contain the full record at the moment of failure, and they do not clear themselves. I have seen Make accounts with thousands of incomplete executions piled up over two years, each one holding customer data, because nobody had ever tidied up. This is the most uncomfortable category, because it grows while no one is looking.
Logging into external tables
Plenty of workflows end by writing a row to a Google Sheet or a database: timestamp, customer, what happened. Homemade monitoring. These logging tables grow for years, nobody feels responsible for them, and they often hold more personal data than the actual processing does. An 80,000-row spreadsheet of customer activity, shared with half the team, is a data protection problem disguised as a convenience.
Slack and email notifications
If a workflow posts a message to Slack on every error, with the full record attached so you can see what went wrong at a glance, then that record now also lives in Slack. Indefinitely, searchable, visible to everyone in the channel. The same goes for error emails. The data leaves the automation platform and multiplies into systems nobody associates with retention.
Scratch storage and data stores
Platforms offer their own little stores: Make Data Stores, Storage by Zapier, variables in n8n. Handy for remembering something between two runs. They get set up and forgotten. Whatever you wrote into them stays until somebody actively deletes it, and nobody does.
Add the five together and you get a plain fact. When you delete a customer from your CRM, you deleted them from one system. In your automation stack they can keep living in five more places, and you touched none of them when you hit delete.
What the GDPR actually requires here
Three parts of the regulation point straight at this, and they get skipped in almost every automation project.
Article 5(1)(e), storage limitation: personal data may only be kept as long as the purpose requires. The purpose of order processing is the order. Once it is complete and booked, there is no remaining reason to keep the full record in an automation log for another six months. Keep it anyway and you are breaking the principle.
Article 17, the right to erasure, means that when someone asks you to delete their data, you delete every copy, not only the one in the main system. A copy in a Make log is a copy. It counts.
And Article 15, the right of access, requires you to give a complete answer about what you hold on a person. If you do not know what is in your automations, you cannot give a complete answer, and that is exactly where companies fall down in practice.
There is a common misreading worth naming: if Make or Zapier stores the data, surely that is their responsibility, not mine. No. You are the controller. The platform is your processor under Article 28, handling the data on your behalf and on your instructions. If their default keeps data for twelve months and you never change it, that is your retention, not theirs. The data processing agreement does not move the responsibility off you. It only governs how the vendor handles what you hand them.
Location matters too. Make has European roots and offers EU data centers, self-hosted n8n runs wherever you put it, Zapier processes mostly in the US. Every transfer to a third country needs a legal basis, and the execution logs are part of that processing. A workflow that looks like it only sends an email can quietly write customer data across the Atlantic and hold it there for months.
The sober consequence: your automations belong in your record of processing activities under Article 30. If they are not in there, your record is incomplete, and that is one of the first things a supervisory authority asks to see.
Retention duties are not an excuse for a data graveyard
The objection at this point is almost always the same: but we have to keep invoices for ten years, the tax law says so. True. German tax rules, the GoBD and the fiscal code, require certain records to be held for years, ten for invoices, and that is a real legal basis that overrides the right to erasure for those specific records.
But that duty justifies keeping the data in the right system, not in the execution log of an automation. An invoice belongs in the accounting archive, audit-proof, with access control and a defined period. It does not belong as an accidental side copy in the history of a Make scenario that passed the invoice along once two years ago. The retention duty covers the one copy in the system meant for it. The copy in the workflow log does not satisfy the duty, it only adds risk.
The mistake I see often is using the legal retention duty as a blanket excuse to never think about retention at all. We have to keep everything anyway. You do not. You have to keep specific data for specific purposes over specific periods, in the place it belongs. Everything else, the internal notes, the ratings, the intermediate states sitting in five logs, is not covered by any retention duty and falls under the obligation to delete. Mix the two and you end up keeping everything while being able to justify none of it.
Default retention at a glance
So you have a feel for where the platforms sit without your involvement, here is a rough comparison. The figures are orientation, not guarantees, because they move with tiers and versions. The column that matters most is the last one.
| Platform | Execution logs (default) | Error queue | Who cleans up |
|---|---|---|---|
| n8n (self-hosted) | All executions kept indefinitely until pruning is switched on | Stays in the database | You, via environment variables |
| Make | Days to weeks depending on plan | Incomplete runs stay until cleared by hand | You, manually or with a cleanup scenario |
| Zapier | Task history measured in weeks by plan | Failed tasks remain visible | You, via settings and Storage |
What the table shows is that no row says the platform is data-minimal on its own. The last column is the same answer every time. It is on you, and until someone does it, the pile grows.
Two projects, two lessons
The agency with the 80,000-row log
An owner-run agency, a bit over twenty people, had built a tidy automation landscape in Make over the years. One central workflow took in every new project inquiry, enriched it, and wrote a log row to a Google Sheet at the end. One row per inquiry: name, company, email, budget, internal ratings. Over four years that came to roughly 80,000 rows.
The sheet was shared with the whole team, because early on it was handy to look something up fast. Nobody had ever considered that this left the full contact and budget history of thousands of prospects permanently open to the team. Many of those prospects had never placed an order, so there was no live business relationship to justify keeping their data. When we built the deletion concept, that one sheet was the single largest item, bigger than the CRM. We cut it to the last twelve months, archived and anonymized the rest, and rebuilt the workflow to log only what debugging needs, with no personal content.
The lesson: the most dangerous log is the one that starts as a convenience and is never recognized as data retention. It is in no register, nobody owns it, and it keeps growing until someone finally takes a look at it.
The company that sweated through an access request
A larger company, around 300 staff, ran a self-hosted n8n instance that had grown over the years and was maintained by the IT team. Clean at first glance: own servers, data in house, GDPR-friendly. Then an access request came in from a former job applicant, and the question was simple: what have we stored about this person? The HR system could answer. The n8n instance could not, at least not quickly. There were recruiting workflows that had run for years, with save-all-executions on and no pruning. Every application, every email, every status change of the past few years sat in full in the database. To answer the request correctly, we had to go through the execution history of several workflows and search for the person. It cost days, and the uncomfortable finding at the end was that applicant data had been kept well past the point its retention period allowed.
The lesson: self-hosting cleans up the location, not the retention. Having the data in your own house is only an advantage if you also know which data that is and how long it has been sitting there. Otherwise you have just moved the hoarding into your own data center.
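The search that cost days in that project is mechanical at its core. The sketch below assumes you have already exported the stored executions as a list of dictionaries with an `id` and a `payload` field. That shape is a hypothetical stand-in for illustration, not the actual n8n schema, and the function names are made up as well.

```python
def contains_value(data, needle: str) -> bool:
    """Recursively check whether any string inside nested dicts/lists contains the needle."""
    if isinstance(data, dict):
        return any(contains_value(value, needle) for value in data.values())
    if isinstance(data, (list, tuple)):
        return any(contains_value(item, needle) for item in data)
    if isinstance(data, str):
        return needle.lower() in data.lower()  # case-insensitive substring match
    return False


def find_executions(executions: list, needle: str) -> list:
    """Return the IDs of executions whose stored payload mentions the needle."""
    return [e["id"] for e in executions if contains_value(e["payload"], needle)]


# Hypothetical export: two stored executions of a recruiting workflow.
executions = [
    {"id": "101", "payload": {"applicant": {"email": "Jane@Example.com"},
                              "steps": [{"status": "received"}]}},
    {"id": "102", "payload": {"applicant": {"email": "sam@example.com"}}},
]

print(find_executions(executions, "jane@example.com"))  # ['101']
```

A plain substring match like this only finds what you search for. A person who appears under a second email address or only by name needs additional search terms, which is one more reason to keep the stored history short in the first place.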
A deletion concept for automations that holds up
Enough diagnosis. Here is how to get it under control, in an order that works in practice.
First, know the data flows before you delete anything. You need an overview: which workflows process personal data, which fields run through, where the copies land. It is tedious, and without that map you configure blind. I do it as a simple table, one row per workflow: data types processed, where the copies live, the retention period you actually want. That table doubles as the part that belongs in your Article 30 record.
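If a spreadsheet feels too loose, the same map fits in a few lines of code. This is a minimal sketch, assuming an inventory you maintain by hand. The class, the field names, and the sample workflows are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkflowRecord:
    """One row of the data-flow map: a workflow and where its copies land."""
    name: str
    data_types: List[str]           # fields that run through the workflow
    copy_locations: List[str]       # every place a copy ends up
    retention_days: Optional[int]   # the window you actually want; None = undecided


def undecided(inventory: List[WorkflowRecord]) -> List[str]:
    """Return workflows that process data but have no retention decision yet."""
    return [w.name for w in inventory if w.data_types and w.retention_days is None]


inventory = [
    WorkflowRecord("lead-intake", ["name", "email", "budget"],
                   ["execution log", "Google Sheet", "Slack"], None),
    WorkflowRecord("order-status-ping", ["order_number", "status"],
                   ["execution log"], 14),
]

print(undecided(inventory))  # ['lead-intake']
```

The useful part is the `None`: a workflow without a decided retention window shows up as an open item instead of silently inheriting the platform default.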
Second, send less through the pipe. The most effective move is to not push everything through in the first place. If a workflow only needs the order number and the status to trigger a notification, the full customer record has no business being in that workflow. Minimizing at the entrance automatically shrinks what ends up in the logs. Every field you do not send through the automation cannot sit in five logs afterward.
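Minimizing at the entrance can be as simple as an allowlist applied before the data reaches the first step. A minimal sketch, with a hypothetical `minimize` helper and made-up field names:

```python
def minimize(payload: dict, allowed_fields: set) -> dict:
    """Keep only the fields the workflow needs; everything else never enters the pipe."""
    return {key: value for key, value in payload.items() if key in allowed_fields}


order = {
    "order_number": "A-1001",
    "status": "shipped",
    "customer_name": "Jane Example",
    "shipping_address": "1 Sample Street",
}

print(minimize(order, {"order_number", "status"}))
# {'order_number': 'A-1001', 'status': 'shipped'}
```

An allowlist is the safer direction than a blocklist here. When the source system adds a new field later, it stays out of the workflow until someone decides it belongs there.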
Third, set retention on purpose. In n8n, switch `EXECUTIONS_DATA_PRUNE` on and set `EXECUTIONS_DATA_MAX_AGE`, which is given in hours, to a window that fits the purpose. One or two weeks is often enough for debugging. You can also decide not to store successful executions in full and keep only the errors, which is what `EXECUTIONS_DATA_SAVE_ON_SUCCESS` is for. In Make, check retention per scenario and set up a small cleanup scenario that runs through the error queue on a schedule. In Zapier, walk through the Storage and history settings. None of these settings is hard. They just never get touched, because nobody thinks of them.
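For self-hosted n8n, a small check can flag retention settings that were never set on purpose. The sketch below reads the n8n environment variables `EXECUTIONS_DATA_PRUNE`, `EXECUTIONS_DATA_MAX_AGE` (in hours) and `EXECUTIONS_DATA_SAVE_ON_SUCCESS`. The function itself, the 336-hour target (two weeks), and the assumption that an unset value means successful runs are stored in full are mine. Defaults differ between n8n versions, so treat the findings as prompts to look, not as verdicts.

```python
import os


def check_n8n_retention(env: dict, target_hours: int = 336) -> list:
    """Return findings about execution retention; an empty list means nothing to flag."""
    findings = []

    # Pruning should be switched on explicitly rather than left to a version default.
    if env.get("EXECUTIONS_DATA_PRUNE", "").lower() != "true":
        findings.append("EXECUTIONS_DATA_PRUNE is not explicitly set to true")

    # n8n reads EXECUTIONS_DATA_MAX_AGE as a number of hours.
    raw_age = env.get("EXECUTIONS_DATA_MAX_AGE", "")
    if not raw_age.isdigit():
        findings.append("EXECUTIONS_DATA_MAX_AGE is missing or not a whole number")
    elif int(raw_age) > target_hours:
        findings.append(
            f"EXECUTIONS_DATA_MAX_AGE is {raw_age}h, above the {target_hours}h target"
        )

    # Assumption: an unset value means successful runs are stored with full data.
    if env.get("EXECUTIONS_DATA_SAVE_ON_SUCCESS", "all").lower() != "none":
        findings.append("successful executions are stored with full data")

    return findings


if __name__ == "__main__":
    # Run inside the n8n container or with its environment loaded.
    for finding in check_n8n_retention(dict(os.environ)):
        print(finding)
```

Not every finding is a problem. Keeping successful executions for a short window can be a deliberate choice. The point is that it should be a choice.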
Fourth, do not forget the copies outside the platform. This is the part most people miss: the logging sheets, the Slack channels full of error data, the data stores. They live outside the platform settings and need their own rules. A logging sheet gets an automatic cleanup that deletes rows older than the window. Error notifications send a reference instead of the full record, the execution ID rather than the customer data. Whoever wants to see the error clicks into the platform, where the data already lives with access control anyway.
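Both rules are small enough to sketch. The example assumes log rows carry an ISO 8601 `timestamp` with an explicit UTC offset. The function names, the row layout, and the message format are hypothetical. In practice the row filter would sit behind whatever reads and rewrites your sheet or table.

```python
from datetime import datetime, timedelta, timezone


def rows_to_keep(rows: list, window_days: int, now: datetime) -> list:
    """Drop log rows whose timestamp is older than the retention window."""
    cutoff = now - timedelta(days=window_days)
    return [row for row in rows if datetime.fromisoformat(row["timestamp"]) >= cutoff]


def error_notification(workflow: str, execution_id: str, error_type: str) -> str:
    """Build an alert that points to the execution instead of embedding the record."""
    return f"Workflow '{workflow}' failed (execution {execution_id}): {error_type}"


rows = [
    {"timestamp": "2024-02-20T09:00:00+00:00", "event": "order synced"},
    {"timestamp": "2023-11-05T09:00:00+00:00", "event": "order synced"},
]
now = datetime(2024, 3, 1, tzinfo=timezone.utc)

print(len(rows_to_keep(rows, 30, now)))  # 1
print(error_notification("order-sync", "48213", "HTTP 503 from shop API"))
# Workflow 'order-sync' failed (execution 48213): HTTP 503 from shop API
```

One caveat on the notification: raw error messages from an API can echo the submitted data back, so pass a generic error type rather than the full error text.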
Fifth, make a deletion run that includes the automations. When you delete a customer on request, your deletion process needs a step for the automations. That can be a documented manual step, perfectly fine for a small stack, or a dedicated workflow that walks the known storage locations on request. What matters is that the automations show up in the deletion routine at all. In most companies I see, the deletion routine ends at the main systems, and the automations are a blind spot.
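A dedicated deletion workflow boils down to a list of known storage locations and one delete routine per location. The sketch below uses in-memory stand-ins. `run_deletion`, the location names, and the two delete functions are hypothetical. Real versions would call the sheet, the data store, or the platform API.

```python
from typing import Callable, Dict


def run_deletion(email: str, locations: Dict[str, Callable[[str], int]]) -> Dict[str, str]:
    """Walk every known storage location and record the outcome for each one."""
    report = {}
    for name, delete in locations.items():
        try:
            removed = delete(email)
            report[name] = f"removed {removed} record(s)"
        except Exception as exc:  # one failing location must not hide the others
            report[name] = f"FAILED: {exc}"
    return report


# Stand-in for a logging sheet.
sheet_rows = [{"email": "jane@example.com"}, {"email": "sam@example.com"}]


def delete_from_sheet(email: str) -> int:
    before = len(sheet_rows)
    sheet_rows[:] = [row for row in sheet_rows if row["email"] != email]
    return before - len(sheet_rows)


def delete_from_error_queue(email: str) -> int:
    # Simulates a location that cannot be reached right now.
    raise RuntimeError("queue API not reachable")


report = run_deletion("jane@example.com", {
    "logging sheet": delete_from_sheet,
    "error queue": delete_from_error_queue,
})
for location, outcome in report.items():
    print(f"{location}: {outcome}")
# logging sheet: removed 1 record(s)
# error queue: FAILED: queue API not reachable
```

The report matters as much as the deletion. It documents which locations were covered, and a failed location stays visible as an open item instead of disappearing in a stack trace.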
And then there are the backups
One point missing even from good concepts: backups. You switched on pruning in n8n, the logs get deleted after two weeks, all clean. But your nightly database backup still pulls the full state, and a backup from three months ago holds exactly the executions you removed from the live database long ago. When a customer requests deletion, those backup copies are legally awkward. The common practice is to give backups their own short retention window, so deleted data ages out on its own as the backups rotate within that window. What matters is that you know and document that window instead of stacking backups forever. Year-old database backups of an automation are a data store nobody has on their list.
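The arithmetic behind a backup window is worth writing down once. A minimal sketch, assuming every backup is deleted a fixed number of days after it was taken and that the last backup still containing the data was taken on the day of deletion. The function name is made up.

```python
from datetime import date, timedelta


def fully_erased_on(deleted_on: date, backup_retention_days: int) -> date:
    """Date by which the last backup that could still hold the data has rotated out."""
    return deleted_on + timedelta(days=backup_retention_days)


print(fully_erased_on(date(2024, 3, 1), 30))  # 2024-03-31
```

That date is what you can document in the deletion concept: removed from the live system on the first date, gone from all backups by the second.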
A quick interim note after these steps: none of them is technically hard. The hard part is starting at all, because the problem is invisible until someone asks about it. Which is why the person who ends up asking is the wrong one: the customer with the access request or the authority with the audit.
The question you should be able to answer today
How long does my automation platform keep the personal data that runs through my workflows? If you cannot answer that in one sentence, the answer is probably: longer than you are allowed to. For default n8n without pruning the honest answer is indefinitely, for Make it is days to weeks by tier plus every error case until someone clears it, for Zapier it depends on the plan. You find the exact value in the account settings, and looking takes less than fifteen minutes.
The second question follows straight on: if an access request arrived tomorrow, could I give a complete answer about what data on that person sits in my automations? For most companies the honest answer is no, and that is the moment an abstract data protection topic turns into a concrete risk.
What I took away from this
Retention in automations is the opposite of spectacular. There is no incident, no breach with a headline, no dramatic event. There is only a slowly growing heap of copies nobody wants to look at, until somebody has to. That is exactly what makes it dangerous. Problems that make noise get fixed. Problems that grow quietly become the norm, until they get expensive.
The habit I picked up across these projects: when building an automation, the retention question belongs in the same conversation as the logic question. Not as a compliance follow-up afterward, but at design time. How long do we need this data in this workflow, and what happens to it after. That one question at the start saves the unpleasant search at the end. And the clean answer is almost never more storage. It is less. Less through the pipe, kept shorter, fewer copies outside the platform. An automation that only processes and keeps what it truly needs is friendlier to data protection, and it is also easier to run and faster to search. You can actually answer for it when someone asks. The minimization the GDPR demands turns out, here, to be the better engineering too.
If you are not sure what your automations have stored about your customers and how long it has been sitting there, that is worth checking. Our free Automations Check goes through your workflows and looks at which personal data sits where and for how long, before someone else asks the question.