Shadow AI in the Workplace: Risks, Detection, and Governance
TL;DR: The unsanctioned use of generative AI tools by employees, or "Shadow AI," presents a critical data exfiltration risk that standard security controls may miss. Research from firms like Cyberhaven indicates that a significant percentage of employees feed sensitive corporate data—from source code to PII and strategic plans—into public AI models. This data is often absorbed into training sets, rendering it unrecoverable and creating a permanent liability. Effective governance requires a three-pronged approach: deploying technical controls like CASB and DLP for detection, providing sanctioned enterprise-grade AI alternatives with data-privacy guarantees, and establishing a robust acceptable use policy reinforced by continuous training and auditing.
The Scope and Scale of Unsanctioned AI
Shadow IT, the use of unapproved technology within an organization, is a familiar challenge for security leaders. Shadow AI is its more potent and insidious variant. It refers to employees using publicly available generative AI tools—such as OpenAI's ChatGPT, Google's Gemini, or Anthropic's Claude—for work-related tasks without corporate sanction, visibility, or control. The motivation is typically productivity, not malice. Employees use these tools to write code, draft emails, summarize documents, and analyze data.
The scale of this activity is significant. A 2023 report from Cyberhaven, a data security firm, found that a notable percentage of an organization's workforce uses generative AI tools, with a substantial portion of the data they input being classified as sensitive. This includes confidential information, client data, and source code.
The incident at Samsung Electronics in April 2023 serves as a canonical example of the risk materializing. After the company lifted a temporary ban on the tool, engineers reportedly pasted proprietary source code into ChatGPT to fix bugs and uploaded meeting notes containing confidential information for summarization. This exposed sensitive intellectual property to a third-party model, with no guarantee of deletion or confidentiality. The data, once submitted, can be used to train the model, effectively embedding corporate secrets into a public utility. This risk is not hypothetical; it is a documented failure mode with severe consequences for intellectual property and competitive advantage.
The Anatomy of Data Leakage via Generative AI
The primary risk of Shadow AI is unintentional data leakage. Unlike traditional data exfiltration vectors that involve a discrete transfer of a file, interacting with generative AI is conversational. Employees normalize the act of copying and pasting text into a prompt window, often without recognizing it as a data transfer event. A comprehensive AI data leakage prevention guide is essential for mapping these new threat vectors.
Intellectual Property and Source Code
Developers are among the most enthusiastic adopters of AI assistants. They use them to debug code, generate boilerplate functions, refactor complex logic, and translate code between languages. Each time a snippet of proprietary source code is pasted into a public AI tool, the organization loses control of that intellectual property. The code could surface in suggestions provided to another user, including a competitor. For technology companies, whose valuation is directly tied to their IP, this represents an existential threat that bypasses traditional IP protection measures.
Customer and Employee Personally Identifiable Information (PII)
Support, sales, and marketing teams leverage AI to improve efficiency. An employee might paste a long chain of customer support emails to generate a concise summary or input a list of customer feedback to identify common themes. This data frequently contains PII, such as names, email addresses, phone numbers, and account details. If this data originates from EU citizens, its transfer to a non-compliant third-party AI service could constitute a significant violation of the General Data Protection Regulation (GDPR). Fines for non-compliance can be severe, not to mention the reputational damage and erosion of customer trust. Ensuring adherence to a strict GDPR compliance checklist becomes nearly impossible when data flows are unmonitored.
Strategic Corporate Data (M&A, Financials, Contracts)
The danger extends to the highest levels of corporate strategy. An analyst may upload excerpts from a confidential M&A term sheet to have the AI simplify the legal language. A finance professional might input sensitive quarterly financial data to generate presentation talking points. A lawyer could paste a draft partnership agreement to check for loopholes. In all these cases, highly sensitive, market-moving information is exposed. This data, if absorbed and correlated, could leak through prompts from other users, jeopardizing negotiations, violating securities regulations, and nullifying strategic advantage.
Detection: Illuminating the Shadows
You cannot govern what you cannot see. The first tactical step in addressing Shadow AI is achieving visibility. Traditional firewalls and network monitoring may not be sufficient, as traffic to AI platforms can be mistaken for general web browsing. A more specialized toolset is required.
- Cloud Access Security Brokers (CASB): A CASB sits between an organization's users and cloud services, enforcing security policies as cloud-based resources are accessed. A modern CASB can identify traffic directed to thousands of cloud applications, including known public AI platforms. This allows security teams to discover which employees are using which tools and with what frequency. The initial goal is not necessarily to block, but to assess the scope of the problem.
- Data Loss Prevention (DLP): While CASBs identify the destination, DLP solutions inspect the content. A DLP agent can be configured with policies to detect and flag specific data patterns within HTTP/S traffic. These patterns can include source code syntax, regular expressions for PII or credit card numbers, and keywords like "confidential," "Project Titan," or "M&A Draft." When an employee attempts to paste this sensitive data into a web-based AI tool, the DLP can alert security teams, block the action, or present a warning to the user, providing real-time intervention (a simplified pattern-matching sketch follows this list).
- Browser-Side Controls: Some security platforms offer browser extensions or agents that provide granular control over web interactions. These tools can specifically identify the text entry fields of AI chat interfaces and apply policies directly at the point of data entry, offering a more precise method of control than network-level DLP alone.
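To make the DLP policy idea concrete, the sketch below screens outbound prompt text against a small set of sensitive-data rules before it leaves the network. It is a minimal illustration, not a production DLP engine: the patterns, keywords, and example prompt are hypothetical placeholders that a security team would replace with its own data classification rules and vendor policy language.

```python
import re

# Hypothetical detection rules a DLP policy might encode.
# Real deployments use the vendor's policy engine and far richer patterns.
SENSITIVE_PATTERNS = {
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "source_code_hint": re.compile(r"\b(def |class |import |#include|public static)\b"),
}

# Keywords drawn from the article's examples; stand-ins for real project names.
SENSITIVE_KEYWORDS = ["confidential", "project titan", "m&a draft"]


def scan_prompt(text: str) -> list[str]:
    """Return the names of all rules that match the outbound prompt text."""
    findings = [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]
    lowered = text.lower()
    findings.extend(kw for kw in SENSITIVE_KEYWORDS if kw in lowered)
    return findings


if __name__ == "__main__":
    prompt = "Summarize this M&A Draft: contact jane.doe@example.com, card 4111 1111 1111 1111"
    hits = scan_prompt(prompt)
    if hits:
        # A real DLP agent could block the paste, warn the user, or alert the SOC here.
        print(f"Blocked: prompt matched sensitive-data rules {hits}")
    else:
        print("Prompt allowed")
```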
The Sanctioned Alternative Strategy
An outright ban on all AI tools is often impractical and counterproductive. It drives usage further into the shadows (e.g., employees using personal devices) and denies the organization the significant productivity gains these tools offer. A more mature strategy is to provide a sanctioned, secure alternative.
Major AI providers like OpenAI, Microsoft (via Azure OpenAI Service), and Google now offer enterprise-tier plans designed for corporate use. The single most important feature of these plans is the contractual guarantee regarding data usage. These contracts typically include a "zero data retention" or "no-training" policy, meaning that any data submitted via the API or enterprise portal is not stored long-term and is explicitly excluded from being used to train the public models.
By vetting and procuring such a service, the organization can channel the demand for AI into a controlled environment. The CISO and legal teams must scrutinize the service agreements to ensure the data privacy and security clauses are ironclad. This process of vetting and deployment should follow a structured methodology, similar to the one outlined in this securing LLM applications checklist. Once a sanctioned tool is in place, IT can configure network policies to block access to non-sanctioned public alternatives, effectively steering users toward the safe option.
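As one illustration of steering usage into a controlled channel, the sketch below calls an enterprise deployment through the Azure OpenAI Service using the openai Python SDK. The endpoint, deployment name, and environment variables are placeholders; crucially, the data-retention and no-training guarantees come from the negotiated service agreement, not from anything visible in the code itself.

```python
import os

from openai import AzureOpenAI  # openai>=1.x SDK; assumes an Azure OpenAI enterprise deployment

# Placeholders: the resource endpoint and deployment name belong to the vetted,
# contractually governed enterprise subscription -- not a public consumer account.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="corp-gpt4o-deployment",  # hypothetical deployment name chosen by IT
    messages=[
        {"role": "system", "content": "You are the company's sanctioned AI assistant."},
        {"role": "user", "content": "Summarize these meeting notes into three bullet points."},
    ],
)

print(response.choices[0].message.content)
```

Because every call flows through the corporate deployment, usage also generates the centralized audit logs that the monitoring guidance below depends on.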
Forging a Governance Framework
Technology alone is insufficient. Long-term risk mitigation depends on a robust governance framework that combines policy, training, and auditing.
Drafting a Robust Acceptable Use Policy (AUP)
The existing AUP must be updated or supplemented with a specific policy for AI. This policy should be clear, concise, and unambiguous. It must define:
- What "Generative AI" means in the context of the policy.
- The official, sanctioned AI tool(s) available to employees.
- A strict prohibition on using any non-sanctioned AI tools for company business.
- An explicit list of data types that must never be input into any AI tool, sanctioned or not (e.g., credentials, trade secrets, sensitive PII).
- The employee's responsibility to review AI-generated output for accuracy and bias before use.
- The process for requesting access to new AI tools or use cases.
- The consequences for violating the policy.
Employee Training and Communication
Policy is useless if it is not understood. Training cannot be a one-time, check-the-box activity. It must be ongoing and role-specific. All employees need foundational training on the AUP and the core risks of data leakage. Developers need specific guidance on handling source code. Legal and finance teams need tailored instructions on managing contracts and financial data. The training should use concrete examples of what not to do (e.g., showing a redacted screenshot of pasting sensitive data into a public chatbot) and what to do (using the sanctioned enterprise tool for an approved task). The goal is to build a culture of security awareness around this new technology class.
Establishing Audit Trails and Monitoring
Trust, but verify. The sanctioned enterprise AI platform should provide detailed audit logs showing which user prompted the model with what data and when. These logs, combined with data from CASB and DLP systems, create a comprehensive audit trail. Security and compliance teams must regularly review these logs for policy violations and anomalous activity. This monitoring capability is not only a deterrent but also a critical tool for incident response. If a data leak is suspected, these logs provide the forensic evidence needed to understand the scope of the breach, a core element in managing the broad spectrum of AI cybersecurity risks.
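The sketch below shows one way such a review might look in practice: scanning exported prompt-audit records for entries that trip sensitive-keyword rules and surfacing the noisiest accounts for human triage. The JSON Lines schema (user, timestamp, prompt fields), file name, and keyword list are assumptions for illustration; real enterprise AI platforms and CASBs each expose their own log formats.

```python
import json
from collections import Counter
from pathlib import Path

# Hypothetical export format: one JSON object per line with "user", "timestamp", "prompt".
AUDIT_LOG = Path("ai_prompt_audit.jsonl")

SENSITIVE_KEYWORDS = ["confidential", "source code", "term sheet", "m&a"]


def review_audit_log(path: Path) -> Counter:
    """Count policy-relevant keyword hits per user across the exported audit log."""
    hits_per_user = Counter()
    with path.open() as fh:
        for line in fh:
            record = json.loads(line)
            prompt = record.get("prompt", "").lower()
            if any(keyword in prompt for keyword in SENSITIVE_KEYWORDS):
                hits_per_user[record.get("user", "unknown")] += 1
    return hits_per_user


if __name__ == "__main__":
    for user, count in review_audit_log(AUDIT_LOG).most_common(5):
        # Flag the accounts with the most matches for human review, not automated punishment.
        print(f"{user}: {count} prompts matched sensitive-keyword rules")
```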
Navigating Regulatory and Compliance Headwinds
The proliferation of Shadow AI introduces significant compliance risks. Regulators are moving swiftly to address AI, and organizations will be held accountable for the tools their employees use, whether they are sanctioned or not.
The EU AI Act establishes a risk-based framework for AI systems. Using unvetted AI tools could inadvertently cause an organization's AI-assisted processes to fall into a "high-risk" category, triggering stringent compliance obligations related to data quality, transparency, human oversight, and cybersecurity. A data leakage incident involving an employee's use of a public AI tool could be interpreted as a failure of technical and organizational measures under both the AI Act and GDPR.
In the United States, the NIST AI Risk Management Framework (AI RMF) provides a voluntary but highly influential set of guidelines for managing risks associated with AI. The framework's core functions—Govern, Map, Measure, and Manage—provide a structured approach that aligns perfectly with the challenge of Shadow AI. By identifying and mapping the use of unsanctioned tools (Map), measuring the associated data leakage risks (Measure), and implementing the policies and controls discussed here (Govern and Manage), an organization can demonstrate due diligence and align with emerging standards of care. CFOs and risk managers should view alignment with the NIST AI RMF not as a compliance burden, but as a framework for responsible innovation and risk reduction.