Shadow AI, the GDPR and data breaches (1): what shadow AI is and what the GDPR requires

Author

Jeroen van Woezik

Part 1 of a three-part series on shadow AI for DPOs and privacy officers.

At the end of 2025, the municipality of Eindhoven reported to the Dutch Data Protection Authority (AP) that employees had uploaded Youth Act documents, CVs, and internal reports to public AI tools such as ChatGPT and Gemini(1). The AP had already warned in 2024 that the use of AI chatbots could lead to data breaches. By the end of 2025, it turned out that the AP had received dozens of reports of similar AI-related data breaches that year, an increase compared to the previous year(2). The pattern is always the same: employees use AI tools out of sight of the organization.

This three-part series discusses the legal risks of 'shadow AI' for organizations and the role of their Data Protection Officer (DPO). This first part focuses on two questions: what shadow AI is and why its risk profile differs fundamentally from that of classic 'shadow IT,' and what obligations the GDPR imposes on the processing involved. Part 2 addresses the consequences when things go wrong: when shadow AI qualifies as a (notifiable) data breach, and what happens to personal data once it enters a model. Part 3 covers the response: what policy, training, and procurement requirements are needed, how to protect trade secrets and contractual confidentiality as well, how the AP is currently enforcing, and what the DPO's role is.

What is shadow AI?

Shadow IT (the use of software and services not approved by the organization) has existed for decades. Think of an employee who shares files via a private Dropbox account because the company network is too slow, or who works via a personal email address. Shadow AI is the current variant: employees use AI tools such as ChatGPT, Claude, or Gemini, often via free personal accounts, to speed up work, but out of sight of IT and the DPO.

However, shadow AI has a greater legal impact than classic shadow IT, for two reasons. Firstly, there is a high probability that personal data will be processed, often in unnecessarily large amounts: an HR employee who uses ChatGPT to draft a dismissal letter based on a complete personnel file discloses personal data to an external provider, and a customer service employee who has Gemini rewrite a customer email and adds the entire conversation history does the same. Secondly, AI introduces additional risks that classic shadow IT does not. With shadow IT, it is usually still possible to trace where a copy of a file is located, and deletion or access restriction can sometimes be enforced. An IT vendor that acts as a processor often offers a standard data processing agreement and, if all goes well, does not use the data for its own purposes.

With shadow AI, this is more difficult: depending on the service and configuration, prompts and uploads can be stored, reviewed by human reviewers, used for model improvement, or not fully removed from the system later. While shadow IT thus creates uncontrolled storage with third parties, shadow AI can add a layer of model processing and irreversibility to that.

As soon as personal data is entered into an external AI tool, privacy risks arise(3). As the next section shows, it makes a significant difference whether an employee uses a free consumer version or a commercially contracted enterprise environment.

Shadow AI and the GDPR

The AI Act explicitly leaves the GDPR unaffected(4). The analysis therefore begins with a breakdown of the processing and a qualification of the roles under the GDPR: processing consists of phases, and for each phase it must be determined who the data controller (and possibly the processor) is and which legal basis applies.

Two processing operations, potentially two data controllers

When an employee enters personal data into an external AI tool, depending on the service's design, multiple processing operations with different purposes can occur. The first is the collection and transfer of data to the provider, serving the organization's business purpose (having a text rewritten, summarizing a file). If the AI application provider also processes the input for its own purposes, particularly for training or improving the model, that processing must be assessed separately.

This distinction has already been addressed in case law: in Fashion ID, the Court of Justice ruled that a party that co-determines the purposes and means of collection and transmission is the data controller for that phase, but not for subsequent processing over which it has no control. The EDPB applies this phased approach directly to generative AI(5).

Caution is advised when qualifying the provider. Whether the provider is a processor or an independent data controller is a factual question that depends on who determines the purposes and essential means, the design and presentation of the service, and the contractual role(6). If the provider processes the input for its own purposes, such as training, it acts as an independent data controller in that regard, and a data processing agreement is not applicable. If it processes exclusively according to instructions and for the client's purpose, it is a processor. It is not a given that a free service always trains on input (some variants offer an opt-out), so the qualification must be determined per service and configuration.
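For an inventory of AI tools, the two rules of thumb in the paragraph above can be written down as a small decision helper. The following is a minimal Python sketch, not a legal test: the names `Role`, `ServiceConfig`, and `qualify_provider`, and the reduction of a factual assessment to two yes/no fields, are assumptions made purely for illustration. Where the text above gives no answer, the sketch deliberately returns "undetermined".

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PROCESSOR = "processor"
    INDEPENDENT_CONTROLLER = "independent controller"
    UNDETERMINED = "undetermined"


@dataclass(frozen=True)
class ServiceConfig:
    # Hypothetical fields; the real qualification looks at all facts of the service.
    name: str
    uses_input_for_own_purposes: bool  # e.g. training or improving the model
    acts_only_on_instructions: bool    # processes exclusively for the client's purpose


def qualify_provider(cfg: ServiceConfig) -> dict[str, Role]:
    """Return a provisional role per processing phase (simplified)."""
    roles: dict[str, Role] = {}
    if cfg.uses_input_for_own_purposes:
        # For its own purposes the provider acts as an independent controller.
        roles["own_purposes"] = Role.INDEPENDENT_CONTROLLER
        # The delivery phase still needs a separate factual assessment.
        roles["service_delivery"] = Role.UNDETERMINED
    elif cfg.acts_only_on_instructions:
        roles["service_delivery"] = Role.PROCESSOR
    else:
        roles["service_delivery"] = Role.UNDETERMINED
    return roles


# Hypothetical configurations, for illustration only.
consumer = ServiceConfig("free consumer account", True, False)
enterprise = ServiceConfig("contracted enterprise environment", False, True)
for cfg in (consumer, enterprise):
    print(cfg.name, {phase: role.value for phase, role in qualify_provider(cfg).items()})
```

The result is keyed by phase rather than by provider because, following Fashion ID, the role can differ from one phase of the processing to the next.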

The legal basis: for which processing exactly?

Each processing operation requires its own legal basis under Article 6(1) GDPR(7). For the first processing operation, internal use, the organization may have a legal basis (for example, a legitimate interest in efficiency). Public organizations are in a different position: for processing in the performance of their public tasks, legitimate interest is not available as a legal basis under the second subparagraph of Article 6(1) GDPR, so they must rely on another legal basis (a legal obligation or a task carried out in the public interest), provided that it actually covers the specific AI processing.

Insofar as the provider also processes the input for its own purposes, the legal basis for that processing must be assessed separately. The organization's legal basis for internal use does not cover the step in which the same personal data are made available to the provider for its independent training purpose. For that transfer, consent from the data subject is usually lacking, the transfer is not necessary for the performance of a contract with the data subject, and a legitimate interest that passes the three-step test is difficult to construct when the data serves to improve a third party's model(8). The fact that the organization can invoke a legal basis for its own purpose therefore does not mean that the transfer to the provider is lawful.

Special categories: the prohibition of Article 9

If special categories of personal data are involved, a legal basis under Article 6 is not sufficient. In addition to that legal basis, an exception to the processing prohibition of Article 9(1) GDPR must also apply. With shadow AI, this exception is almost always absent: there is no explicit consent, and the idea that data is public does not help, because the mere fact that data is accessible does not mean that the data subject has manifestly made it public(9). The opening example of this blog clearly illustrates this: Youth Act documents containing health data of minors are fully covered by Article 9, and for entering them into a public chatbot, no defensible exception under Article 9(2) can be identified based on these facts.
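The cumulative structure described here (a legal basis under Article 6 for each processing operation, plus an Article 9(2) exception whenever special categories are involved) can be made explicit in a short check. This is a hedged Python sketch for illustration only: `ProcessingOperation` and `lawfulness_gaps` are hypothetical names, the example operations and their values are assumptions, and the check only records whether a basis or exception has been identified, not whether it would hold up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessingOperation:
    description: str
    art6_basis: Optional[str]             # e.g. "legitimate interest"; None if none identified
    special_categories: bool              # does the input contain Art. 9 data?
    art9_exception: Optional[str] = None  # e.g. "explicit consent"; None if none applies


def lawfulness_gaps(op: ProcessingOperation) -> list[str]:
    """List the missing elements for one processing operation."""
    gaps = []
    if op.art6_basis is None:
        gaps.append("no legal basis under Art. 6(1) GDPR")
    # Art. 9 comes on top of Art. 6; it does not replace it.
    if op.special_categories and op.art9_exception is None:
        gaps.append("no exception under Art. 9(2) GDPR")
    return gaps


# Hypothetical example: one upload, two separate processing operations.
operations = [
    ProcessingOperation("internal use: summarize a file", "legitimate interest", True),
    ProcessingOperation("provider trains its model on the input", None, True),
]
for op in operations:
    print(op.description, "->", lawfulness_gaps(op) or "no gaps identified")
```

Because each operation is assessed on its own, a legal basis for internal use never carries over to the provider's training purpose.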

Contract and transfer

If the provider acts as a processor, a data processing agreement is required under Article 28 GDPR, which explicitly covers the processing of prompts, uploads, and output, including retention periods, instruction mechanisms, and sub-processors(10). If the provider acts as an independent controller, no such agreement applies, but the organization then lacks the safeguards that a data processing agreement would provide. The setup is decisive here: whether there is a data processing agreement, whether input is used for training, what retention periods apply, and whether human review takes place.
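The four setup factors named above lend themselves to a simple record per tool in a DPO's inventory. The sketch below is one possible way to capture them in Python; the class `AiToolSetup`, the function `open_points`, and the example values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AiToolSetup:
    name: str
    has_dpa: bool                   # data processing agreement in place?
    input_used_for_training: bool   # is input used to train or improve the model?
    retention_days: Optional[int]   # None means the retention period is unknown
    human_review: bool              # can human reviewers see prompts or uploads?


def open_points(setup: AiToolSetup) -> list[str]:
    """Return the setup factors that still need attention for this tool."""
    points = []
    if not setup.has_dpa:
        points.append("no data processing agreement (Art. 28 GDPR)")
    if setup.input_used_for_training:
        points.append("input is used for training")
    if setup.retention_days is None:
        points.append("retention period unknown")
    if setup.human_review:
        points.append("human review takes place")
    return points


# Hypothetical entries; actual values must be checked per service and configuration.
inventory = [
    AiToolSetup("free personal account", False, True, None, True),
    AiToolSetup("contracted enterprise environment", True, False, 30, False),
]
for tool in inventory:
    print(tool.name, "->", open_points(tool) or "no open points")
```

A list of open points per tool, rather than a single pass/fail verdict, keeps the underlying facts visible for the accountability file.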

If the provider processes data outside the EEA, a transfer basis under Chapter V GDPR is also required, such as an adequacy decision from the European Commission or standard contractual clauses. The EU-US Data Privacy Framework (DPF) is such an adequacy decision and provides a transfer basis for certified US providers. The General Court confirmed its validity on September 3, 2025, but an appeal against that ruling was filed with the Court of Justice on October 31, 2025(11). The framework is therefore still usable for now, but not without reservation, and for critical or large-scale processing, it is advisable to identify alternative transfer mechanisms and exit scenarios.

Transparency, principles, and accountability

The data subject whose data is entered is usually not the user but a third party: a customer, a patient, an applicant. Towards that third party, the information obligation of Article 14 GDPR applies, which in practice is usually not complied with, or not specifically enough, in the case of shadow AI(12). Broader norms are breached as well. The processing is at odds with the principles of Article 5 (lawfulness, fairness and transparency, purpose limitation, data minimization, and integrity and confidentiality) and undermines the accountability obligation of Article 5(2) and Article 24: the organization cannot demonstrate that it complies with the GDPR. Technical impossibility does not constitute a justification, and the burden of proof for the effectiveness of the measures taken lies with the data controller(13). These shortcomings exist irrespective of whether a notifiable data breach occurs.

In the next part

This clarifies the first layer: shadow AI is often already unlawful at the time of input, even without a third party actually viewing the data. But unlawfulness is different from a notifiable data breach. In part 2, we will discuss when shadow AI qualifies as a (notifiable) data breach, and what happens to personal data once it enters a model.

_

1. Municipality of Eindhoven, 'Data leak public AI in Eindhoven,' December 18, 2025, eindhoven.nl/nieuws/datalek-openbare-ai-in-eindhoven. The report to the AP took place on October 23, 2025; Council information letter municipality of Eindhoven, December 2025.

2. AP, 'Warning: use of AI chatbot can lead to data breaches,' autoriteitpersoonsgegevens.nl (2024); Security.nl, 'AP: dozens of data breaches due to uploading personal data to AI chatbots,' December 30, 2025, security.nl/posting/919037.

3. Art. 2(1) GDPR (material scope), Art. 4(1) GDPR (definition of personal data), and Art. 4(2) GDPR (definition of processing). Any processing of personal data within the territorial scope of Art. 3 GDPR is subject to the GDPR.

4. Art. 2(7) AI Act (Regulation (EU) 2024/1689): the AI Act is without prejudice to Union law on data protection, including the GDPR.

5. CJEU July 29, 2019, C-40/17 (Fashion ID), para. 70 et seq.: joint controllership can exist for the phases of collection and transmission, while a later phase is attributable to one party. Disclosure by transmission is itself a processing operation (Art. 4(2) GDPR). The EDPB applies this phased approach to AI: EDPB, Report of the work undertaken by the ChatGPT Taskforce, May 23, 2024, para. 14 (with reference to Fashion ID).

6. Art. 4(7) and 4(8) GDPR; EDPB, Guidelines 07/2020 on the concepts of controller and processor, paras. 24-40. The qualification is factual and depends on who determines the purposes and essential means, the design and presentation of the service, and the contractual role; it must be determined per service and configuration.

7. Each processing must meet at least one condition of Art. 6(1) GDPR: EDPB, ChatGPT Taskforce Report, May 23, 2024, para. 13; CJEU December 21, 2023, C-667/21 (Medizinischer Dienst), para. 79.

8. Legitimate interest requires the three-step test (legitimate interest, necessity, and a balancing of interests that takes into account the reasonable expectations of the data subject): CJEU July 4, 2023, C-252/21 (Bundeskartellamt), para. 106; EDPB, ChatGPT Taskforce Report, paras. 16-17; EDPB, Opinion 28/2024 of December 17, 2024 (legitimate interest to be assessed on a case-by-case basis).

9. Art. 9 GDPR (prohibition of processing special categories, subject to Art. 9(2)). The mere fact that data is accessible does not mean that the data subject has manifestly made it public (Art. 9(2)(e)): EDPB, ChatGPT Taskforce Report, para. 18; CJEU July 4, 2023, C-252/21 (Bundeskartellamt), para. 77.

10. Art. 28 GDPR. The data processing agreement must explicitly cover the processing of prompts, uploads, and output, including retention periods, instruction mechanisms, and sub-processors.

11. Art. 44-49 GDPR. The EU-US Data Privacy Framework (adequacy decision of July 10, 2023) is valid: the General Court dismissed an action for annulment on September 3, 2025 (case T-553/23, Latombe/Commission, ECLI:EU:T:2025:831). An appeal against that judgment was lodged with the Court of Justice on October 31, 2025. For standard contractual clauses: Art. 46(2)(c) GDPR.

12. Art. 13 and 14 GDPR. Towards the third-party data subject (the customer or patient whose data is entered), Art. 14 applies; see EDPB, ChatGPT Taskforce Report, paras. 27-28 (Art. 14 for collection from other sources, Art. 13 for direct interaction).

13. Art. 5(1)(a) (lawfulness, fairness, transparency), (b) (purpose limitation), (c) (data minimization), and (f) (integrity and confidentiality) GDPR, and Art. 5(2) and Art. 24 (accountability). Technical impossibility does not justify non-compliance, and data protection by design applies fully (Art. 25); the burden of proof lies with the data controller: EDPB, ChatGPT Taskforce Report, paras. 7 and 19.
