Data Privacy · 22 min read · 08.05.2026 · Max Fey

What goes into ChatGPT, and what doesn't: a working guide to data classification

ChatGPT Enterprise gives you a license, not a strategy. A four-class data framework, with examples from real engagements, to put in place before your AI rollout.

Here is a question I have been asking CTOs over the last few months. Which of your company's data are you actually allowed to put into ChatGPT? Not which data you do put in. Which data you are allowed to put in.

Out of about thirty conversations, four people had a clear answer. The rest gave me some variation of "well, we have ChatGPT Enterprise, so it should be fine."

That sentence is the most expensive one I hear in client engagements right now. Not because ChatGPT Enterprise is a bad product. It is a fine product. The problem is that "we have an Enterprise license" and "we know what we are doing" are two completely different things. The gap between them has a name, and that name is data classification, a discipline that simply does not exist in most of the companies I work with.

This article is my attempt to close that gap. Concrete examples, real numbers, no ISO jargon.

What data classification actually is

Let me kill some misconceptions first. Data classification is not the Excel sheet that gets created during your ISO 27001 audit and then never gets opened again. It is also not the corporate slide titled "Public, Internal, Confidential, Restricted" that nobody below the executive level has ever read.

Data classification is the operational answer to a very practical question. Which of our data goes into which tool, and which does not?

Until you can answer that question with a straight face, every AI rollout is a gamble. A gamble with your regulator, with your clients, with your trade secrets, depending on what business you are in.

What classification gives you when it is done right is three things. It turns a vague feeling that "we have sensitive data" into a countable list with categories, examples, and concrete data assets. It produces clear rules about what can be processed where, so you do not have to re-evaluate every single use case. And it creates the documentation that supervisory authorities now expect during AI audits, namely evidence that you have a process, not just an opinion.

What it is not, is a one-time task. It is a living process that needs to be updated every time a new tool, new data category, or new subprocessor appears.

This is where the companies I love auditing diverge from the ones I dread auditing.

The four classes I use in client projects

There are dozens of classification frameworks. BSI baseline protection, ISO 27001, NIST, industry-specific frameworks. In most engagements I work with a stripped-down version: four classes. Anything with five or more classes does not survive contact with reality. People can remember four levels. Five is where the wheels come off.

Class 1: Public, the truly public

What you publish on your website, write in press releases, or present at industry events. Marketing copy, public product descriptions, general company information.

Watch out for the trap. Many companies treat their internal newsletter as "kind of public, can't really hurt." That is wrong. Anything not actively meant for public consumption does not belong in Class 1, even if it is not technically secret.

This class can go into any AI tool that meets baseline GDPR requirements, including free tiers, as long as no personal data is involved.

Class 2: Internal, the daily grind

Run-of-the-mill emails. Meeting notes without strategic content. General project documentation. Routine vendor correspondence. Data whose compromise would be embarrassing but not dramatic.

In practice, eighty percent of what flows through a typical company lands in Class 2. Including most of what employees paste into ChatGPT, Claude, or Copilot.

This class can go into AI tools with documented GDPR compliance, meaning Enterprise tiers with a Data Processing Agreement, where training use is contractually excluded. For US providers, only with supplementary measures like Standard Contractual Clauses and a Transfer Impact Assessment.

Class 3: Confidential, not for the public

Personal data of customers, employees, candidates. Strategic documents. Internal financial data not in the public statements. Engineering designs with moderate protection needs. Negotiation strategies. Cost models.

This is where things get interesting, because many companies do not separate Class 2 from Class 3. They say "sensitive is sensitive." That is not true. A job application is confidential. A pricing model for a major bid is confidential. But they do not belong in the same tool, because one may fall under Article 9 GDPR and the other is competitively sensitive.

This class only goes into AI tools explicitly certified for the relevant data category, ideally with EU data residency and demonstrable subprocessor management. For special categories of personal data, only after a Data Protection Impact Assessment.

Class 4: Restricted, existential matters

Patents not yet filed. Tax matters with criminal exposure. Bank credit negotiations. M&A transactions before closing. Engineering designs that define the core business. Client and patient data in regulated professions.

Less than five percent of company data lands here. But when something from this class ends up in the wrong tool, it gets expensive. Sometimes existentially expensive.

This class does not go into cloud AI. Period. If you want AI for these workloads, you build on-premises or in a dedicated European cloud instance with zero-knowledge architecture. Expensive, painful, exactly right.

The four classes are not arbitrary. Each has a concrete operational consequence. If you cannot enforce that, you do not have a classification, you have a label.

Which data goes into which tool, a pragmatic matrix

Here is the matrix I use as a starting point in workshops. It is not sacred, but it saves enormous amounts of arguing in the first two hours.

Tool / Tier                      | Class 1 (Public) | Class 2 (Internal)   | Class 3 (Confidential) | Class 4 (Restricted)
ChatGPT Free                     | Yes              | No                   | No                     | No
ChatGPT Plus                     | Yes              | No (no DPA)          | No                     | No
ChatGPT Team                     | Yes              | Conditional with DPA | No                     | No
ChatGPT Enterprise               | Yes              | Yes                  | Conditional            | No
Claude.ai (Free/Pro)             | Yes              | No                   | No                     | No
Claude for Work                  | Yes              | Yes                  | Conditional            | No
Microsoft Copilot (M365)         | Yes              | Yes                  | Yes (with config)      | No
Self-hosted LLM (Llama, Mistral) | Yes              | Yes                  | Yes                    | Conditional
Mistral La Plateforme (EU)       | Yes              | Yes                  | Yes                    | Conditional
Industry-specific, on-prem       | Yes              | Yes                  | Yes                    | Yes

"Conditional" means: not automatic, but case-by-case. For Class 3 in Microsoft Copilot, for example, you need correct sensitivity labels, correct SharePoint permissions, working DLP setup, documented user training. Without all four, it is effectively Class 2.

The matrix is deliberately conservative. In audits I keep seeing companies stretch their interpretation of Class 3 to the breaking point. Better to be one class more cautious than to deal with three regulators at once.
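
The matrix only earns its keep if it survives outside the workshop room, for example by feeding procurement checklists or DLP rules. Here is a minimal sketch of the matrix above as machine-readable data; the tool keys and the ruling helper are my illustrative naming, not any vendor's API:

    # Minimal sketch: the matrix above as data. Tool keys are shorthand,
    # and "conditional" still means a documented case-by-case decision.
    MATRIX = {
        # tool:                  (Class 1, Class 2,       Class 3,           Class 4)
        "chatgpt-free":          ("yes",   "no",          "no",              "no"),
        "chatgpt-plus":          ("yes",   "no",          "no",              "no"),
        "chatgpt-team":          ("yes",   "conditional", "no",              "no"),
        "chatgpt-enterprise":    ("yes",   "yes",         "conditional",     "no"),
        "claude-free-pro":       ("yes",   "no",          "no",              "no"),
        "claude-for-work":       ("yes",   "yes",         "conditional",     "no"),
        "copilot-m365":          ("yes",   "yes",         "yes-with-config", "no"),
        "self-hosted-llm":       ("yes",   "yes",         "yes",             "conditional"),
        "mistral-la-plateforme": ("yes",   "yes",         "yes",             "conditional"),
        "industry-on-prem":      ("yes",   "yes",         "yes",             "yes"),
    }

    def ruling(tool: str, data_class: int) -> str:
        """Return the matrix ruling for a tool and a data class (1 to 4)."""
        if not 1 <= data_class <= 4:
            raise ValueError("data class must be between 1 and 4")
        return MATRIX[tool][data_class - 1]

    print(ruling("chatgpt-enterprise", 3))  # -> conditional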

What the matrix deliberately does not do

It does not differentiate between providers of the same maturity tier. ChatGPT Enterprise and Claude for Work have minute differences in their DPAs, but for classification logic they are interchangeable. Differentiation belongs in tool selection, not in classification.

It also does not say what is wise, only what is permissible. Whether a finance employee really needs Claude for Work or would do fine with Office Copilot is an economic question, not a privacy question.

Why "ChatGPT Enterprise is enough" misses the point

Back to the opening question, and to the company that prompted it: a consulting firm, eighty employees, everyone gets to do everything in ChatGPT Enterprise. What is wrong with that picture?

ChatGPT Enterprise has a solid DPA. Training use of data is contractually excluded. There is SOC 2, SAML SSO, audit logs. As a tool, it is a good product.

The problem is not the tool. It is four assumptions the CTO made without realizing it.

First assumption: "GDPR-compliant" means "anything goes." Wrong. GDPR-compliant means the tool is suitable as a processor. What you push into the tool still requires legal justification. A candidate file in a GDPR-compliant cloud storage still needs a legal basis and retention rules.

Second assumption: "We are in the EU, so we are fine." Wrong. ChatGPT Enterprise, even with EU data residency enabled, still processes data in third countries, at minimum for logging, support, and model updates. The US CLOUD Act still applies. A Transfer Impact Assessment is mandatory, not optional.

Third assumption: "Employees know what is sensitive." Wrong. In every audit I run, I find employees with ChatGPT histories full of complete client contracts, CVs with bank account numbers, or tax filings. Not from malice. From the simple fact that nobody told them clearly what was off-limits.

Fourth assumption: "Audit logs are enough to catch violations." Wrong. Audit logs show which user sent which prompt at what time. They do not show what was in the prompt. To catch violations, you need DLP at the endpoint or a proxy in front of the API.

These four assumptions explain why "we have ChatGPT Enterprise now" is not a privacy strategy. It is a tool purchase. The strategy still needs to be built.
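
The fourth gap, prompt-level visibility, is the one you can partly close with tooling. A minimal sketch of the proxy idea, assuming a simple regex pre-filter in front of the API; the two patterns and the blocking policy are illustrative, and a real DLP setup needs a maintained pattern set plus a review workflow:

    import re

    # Minimal sketch of a DLP-style pre-filter, the kind of check a proxy
    # would run before forwarding a prompt to a cloud AI API. The two
    # patterns are illustrative, not a complete rule set.
    PATTERNS = {
        "IBAN":  re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}\b"),
        "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    }

    def scan_prompt(prompt: str) -> list[str]:
        """Return the names of all patterns that match the prompt."""
        return [name for name, rx in PATTERNS.items() if rx.search(prompt)]

    prompt = "Summarize: refund to DE89 3704 0044 0532 0130 00, contact jane@example.com"
    hits = scan_prompt(prompt)
    if hits:
        print("blocked, found: " + ", ".join(hits))  # log the hit, notify the DPO
    else:
        print("forwarded to the API")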

The subprocessor chain nobody reads

A point that gets lost in most discussions. You have a contract with OpenAI. OpenAI has a contract with Microsoft Azure. Azure has contracts with local data center operators. Each layer has its own subprocessor list, which changes periodically.

In one project six months ago we compared the subprocessor lists of three major AI providers. Total: seventy-four subprocessors, twenty-one of them in third countries, fourteen of those in the US. Every one of these subprocessors can theoretically see data flowing through the system. The contracts regulate this, but few people actually read them.

If you are sending Class 3 or Class 4 data into cloud AI, you should at least know how many third parties you are trusting along the way.

Three project scenarios

Theory is fine, examples are better. Three scenarios from real projects, anonymized.

Scenario 1: Consulting firm, eighty employees, client data in ChatGPT

The consulting firm from the opening rolled out ChatGPT Enterprise. Six weeks in, employees were thrilled. We did an audit.

In the first eight chat histories we analyzed, we found:

  • Three full client contracts with names, addresses, contract values
  • A list of candidates for a senior position with CV details
  • An internal strategy discussion about ending a client relationship

What did ChatGPT Enterprise do wrong? Nothing. The tool did exactly what users typed in. It did not share the data (per DPA), it did not train on it (per DPA). But the data still ended up in a system with US connections, without anyone ever doing a Transfer Impact Assessment for these data categories.

Our recommendation, three steps:

1. Immediate: stop Class 3 content, written notice to staff.
2. Within 30 days: DPIA for productive use, documented classification matrix, mandatory training.
3. Within 90 days: tool splitting. Class 1 and 2 stay in ChatGPT Enterprise. Class 3 moves to Microsoft Copilot with sensitivity labels. Class 4 goes to a self-hosted LLM for the two use cases that need it.

Result after three months: 80 percent of original use cases still running. 10 percent eliminated because they could not be justified. 10 percent moved to other tools.

Data classification does not kill use cases, it sorts them. The valuable ones stay. The expensive or risky ones get caught early.

Scenario 2: Machinery manufacturer, six hundred employees, engineering data in Copilot

Mid-sized machinery manufacturer, export-oriented, rolled out M365 with Copilot. Initially executives and team leads, then expanded after three months to two hundred people, including the entire engineering department.

Copilot pulls data from SharePoint via Microsoft Graph. SharePoint permissions had grown over years, sharing was easier than not sharing, and as a result Copilot had access to engineering folders that were supposed to be visible only to engineering. Sales staff could ask Copilot questions and get information they would never have seen without it.

From a privacy perspective, no GDPR violation in the strict sense, no personal data involved. From a trade-secret perspective, highly problematic. The German Trade Secrets Act requires "appropriate confidentiality measures," and an open Copilot setup is the opposite of that.

Our recommendation, two steps:

1. Roll out sensitivity labels on engineering data, configure Copilot with DLP rules so this class does not appear in answers.
2. Mandatory classification for all newly created documents, technically enforced, not voluntary.

Result after six months: 92 percent of engineering documents correctly classified (sample-tested). Two incidents where classification was bypassed, both accidental, both corrected. CEO sleeps better.

For trade secrets, data classification is a precondition for legal protection in disputes. If you do not have it, you cannot sue.

Scenario 3: Solo tax advisor, client tax data in Claude

Tax advisor, solo practice, fifty clients, discovered Claude.ai on the Pro tier and used it daily for three months for research, tax assessment analysis, and client communication.

Tax advisors are bound by professional confidentiality (§ 57 of the German Tax Advisory Act, StBerG). A breach is a criminal offense (§ 203 of the German Criminal Code, StGB). Claude.ai on the Pro tier does not have a sufficient DPA for professionally protected client data. Even if Anthropic does not train on the data, German criminal law does not care what a provider actually does with the data. It cares who could have access.

From a privacy perspective: GDPR issues, yes, but manageable. From a professional law perspective: career-ending. In a Tax Advisors Chamber audit, this would mean license revocation.

Our recommendation:

1. Immediate: stop all client-related use. Pure research without client data is fine.
2. Within 60 days: switch to Claude for Work with a proper DPA, or Claude via Bedrock (AWS Frankfurt) with a dedicated DPA.
3. Within 90 days: documented client information about AI use in the practice, with an opt-out option for clients.

Result after three months: tax advisor still uses Claude daily, but on the right tier with a proper DPA. Three clients chose opt-out, forty-five signed opt-in. Professionally clean.

In regulated professions, data classification is not a GDPR question, it is a license question. Skip the classification, and you are not just risking fines. You are risking your career.

What a data classification workshop actually looks like

People often ask me how much time this all takes. The honest answer: less than they think, if structured. More, if done without a plan. Here is the three-day plan we usually start with.

Day 1: Inventory

Morning: stakeholder interviews. Executives, IT, Data Protection Officer, one rep per main department. 30 to 45 minutes each. Question 1: what data do you regularly process? Question 2: which of these would you intuitively call sensitive? Question 3: which AI tools do you use today, officially or unofficially?

Afternoon: collect and group data categories. Big whiteboard or Miro board, every data type that came up in interviews. Goal: 40 to 80 data types, from "marketing brochure" through "candidate CV" to "M&A term sheet."

Day 1 output: list of data types, unsorted but complete.

Day 2: Classification and tool mapping

Morning: assign data types to the four classes. Consensus-based, with executives and the DPO in the room. Disputes go to a parking lot, addressed in the afternoon.

Disputes are decided by two questions. First: what happens if this data becomes public? Second: what is our legal basis for processing it? If the answer to either is "scandal" or "investigation," it is at least Class 3.
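
That tiebreaker is mechanical enough to write down. A sketch, with illustrative answer categories; the actual decision stays with the people in the room:

    # Sketch of the dispute tiebreaker: two questions, and a Class 3 floor
    # if either answer is alarming. Answer categories are illustrative.
    def minimum_class(if_public: str, legal_basis: str) -> int:
        """Return the minimum class for a disputed data type.

        if_public:   what happens if this data becomes public?
                     e.g. "nothing", "embarrassment", "scandal"
        legal_basis: e.g. "contract", "consent", "investigation"
        """
        if if_public == "scandal" or legal_basis == "investigation":
            return 3  # at least Class 3; escalation to Class 4 stays a human call
        if if_public == "embarrassment":
            return 2
        return 1

    print(minimum_class("scandal", "contract"))  # -> 3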

Afternoon: tool inventory. Which AI tools are in use today? Which tier and maturity level? Mapping between classes and tools, using the matrix above as a starting point. Result: concrete statements like "Class 3 data goes in tool A, B, C, but not in D, E, F."

Day 2 output: classification table and tool matrix, both as versioned documents.
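
What "versioned document" can mean in practice: not a Word file in SharePoint, but a structured record per data type that lives in version control. A sketch, with made-up field names and entries borrowed from the examples above:

    # Sketch of the classification table as structured data under version
    # control, which doubles as the audit trail. Field names and entries
    # are illustrative.
    CLASSIFICATION_VERSION = "2026-Q2"  # bump on every quarterly review

    DATA_TYPES = [
        {"name": "marketing brochure", "class": 1, "owner": "Marketing"},
        {"name": "candidate CV",       "class": 3, "owner": "HR"},
        {"name": "M&A term sheet",     "class": 4, "owner": "Executive board"},
    ]

    def types_in_class(c: int) -> list[str]:
        """List all data types assigned to a given class."""
        return [d["name"] for d in DATA_TYPES if d["class"] == c]

    print(types_in_class(3))  # -> ['candidate CV']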

Day 3: Operationalization

Morning: what needs to change? Concrete to-do list. One use case gets killed. One tool gets switched. Three sensitivity labels get defined. A DLP rule set gets commissioned. Training gets scheduled. A DPIA gets prioritized.

Afternoon: communication. Who gets informed when? What goes in the staff memo? Who is the contact for questions? What is the escalation path for violations?

Day 3 output: roadmap with clear ownership and deadlines.

In well-prepared projects, three days is enough. In poorly prepared projects, I have spent two weeks because stakeholders first had to grasp that "everyone gets to do everything" is not a strategy.

What happens after the workshop

A common mistake: after the workshop, everyone goes home. Classification is done, everyone is happy, the Excel sheet disappears into SharePoint. Six months later, nobody remembers it exists.

What works:

  • Quarterly reviews of the classification. Not heavy, a 60-minute meeting is enough. What changed? Which new tools? Which new data categories?
  • Integration of the classification into the AI procurement process. No new AI tool without mapping to the classification matrix.
  • Sample audits. Once every six months, look at a few dozen real ChatGPT histories or Copilot queries with the DPO and IT. What people actually do is always different from what the theory says.

What I have learned

When I look back at two years of AI consulting and sort out which projects went well and which did not, data classification sits at the top of the success factors. Not the hippest model. Not the cheapest license. Not the slickest integration. Whether classification was done or not.

Six observations from my notebook.

One. Companies that classify before buying tools save money. They buy more appropriate tools, fewer licenses, less migration pain. The classification costs two to five consulting days. Migrating a wrong tool costs two to five consulting months.

Two. Employees are not the problem. Employees are the solution, if they understand what is at stake. Nobody wants to leak client data. They do it because nobody told them clearly what is sensitive and what is not. Good training with concrete examples from their own company works wonders.

Three. Executives have to walk the talk. If the CEO is pasting Class 3 data into ChatGPT Free, no classification scheme will survive. Leadership by example is not a cliché, it is a precondition.

Four. Tools with native classification support like Microsoft Sensitivity Labels or Google Data Loss Prevention are underused. They are often already in your existing license bundle but not activated. If you have an M365 tier, you probably already have most of the tooling without knowing it.

Five. The matrix above survives 18 to 24 months in most projects. Then either a new tool generation or new regulation forces an update. Treat the matrix as a living document and you are fine. Treat it as one-and-done and you will have problems in two years.

Six, the most important. Data classification is not the goal. The goal is AI use that is safe, compliant, and productive. Classification is just the fastest path there. Forget that, and you build bureaucracy. Get it, and you build an actual AI strategy.

When the CTO from the opening has coffee with me again in six months, I want him to say, "We have ChatGPT Enterprise. And we know exactly what goes in and what does not. And so do our employees."

That is a sentence I like to hear.

If you want to know which of your data lands where, and what a classification setup for your own company would look like in practice, the free Automations Check gives you a first read in around 30 minutes.

#Data Classification #DPA #Schrems II #Microsoft Copilot