When AI Knows Too Much: The New Risk of Internal Information Leakage

Artificial intelligence is no longer isolated in laboratories or controlled pilots. Today it responds in chatbots, searches internal documentation, helps develop code, automates tasks, and connects to APIs, databases, and corporate tools. And that’s where the problem appears: the more useful AI is, the closer it gets to information that a company cannot afford to leak.

For a long time, the main concern was whether prompts could be used to train models. But the current risk is much broader. An AI can expose sensitive information through the data it was trained with, the context it retrieves, the permissions it inherits, the connectors it uses, or the records it leaves in logs, embeddings, and third-party platforms.

Information leakage in AI doesn’t always occur because the model “reveals” a secret. Many times it happens because the architecture has given it access to data, tools, or processes that should never be available for that use case.

That’s why AI security can no longer be limited to reviewing the model. The entire ecosystem must be analyzed: applications, integrations, providers, permissions, data, automation flows, and governance controls. A poorly designed chatbot, incorrect segmentation of a RAG (AI technique that allows language models to query databases or external documents), a copilot connected to internal repositories, or an agent with excessive access to APIs can become new exposure paths for the organization.

The Leak Can Start Earlier: In the Model’s Training, Fine-Tuning, and Memory

The first group of risks appears when sensitive information is incorporated into the model’s life cycle. It can occur when training a model from scratch or when performing different techniques such as fine-tuning, which is a technique that consists of taking a generalist, already pre-trained model and retraining it with a specific dataset to adapt it to a concrete task without having to build it from scratch with internal data, by enriching it with user feedback, or by building a knowledge repository that will later be used through RAG, which optimizes the output of a language model so that it refers to an authorized knowledge base outside the training data sources before generating a response.

In all these scenarios, the data ceases to be isolated in its original repository and becomes part of a probabilistic, distributed system that is difficult to audit if it has not been designed with controls from the start.

The most relevant risks are the following:
▪️Memorization of sensitive data: The model can reproduce specific fragments of information present in its training, especially if they are unique, repeated, poorly filtered, or appear in small adjustment sets.
▪️Data leakage during fine-tuning: Support tickets, emails, chats, internal reports, or technical traces can end up inside the model’s adjustment without prior review of privacy, secrets, or intellectual property.
▪️Membership inference: An attacker can try to deduce whether a specific piece of data was part of the training, which can reveal sensitive information about people, clients, or internal decisions.
▪️Inversion or reconstruction of information: Under certain scenarios, the model’s responses can offer enough signals to reconstruct attributes, patterns, or fragments of private data.
▪️Exposed artifacts and datasets: The risk does not reside only in the final model: datasets, checkpoints, embeddings, notebooks, snapshots, buckets, training logs, and access tokens can also leak information.

The consequence is clear: training or adjusting AI with internal data without data governance is equivalent to moving critical information to a new surface. In a classic environment, a confidential document could be protected by repository permissions. In an AI environment, that same content can end up represented in embeddings, system prompts, conversational memories, derived datasets, or evaluation traces. If traceability does not exist, the organization can lose the ability to know where the data is, who can access it, and how to delete it.

The most dangerous path today: integrating AI into websites, chatbots, APIs, agents, etc.

The second group of risks appears when AI connects to the company’s infrastructure. This is probably the area where the most concerning incidents are materializing. AI no longer responds only with general knowledge: it now queries document bases, summarizes case files, opens tickets, accesses CRMs, interacts with code repositories, invokes tools, processes client files, and integrates with internal APIs.

It is not enough to ask “what model are we going to use.” The critical question is: “what can this AI system see, retrieve, record, and execute inside my organization.”

When an organization connects a chatbot to a website, an internal assistant to a document base, or an agent to corporate tools, the attack surface shifts. The attacker no longer needs to compromise the database directly if they can manipulate the context consumed by the model, force an improper retrieval, abuse a connector, or cause the agent to execute an unintended action.

Some particularly relevant vectors are:
▪️RAG without permission control: The system retrieves documents the user should not be able to see because the vector index does not correctly replicate the authorization of the original repository.
▪️Indirect prompt injection: A document, a webpage, an email, or an incident contains hidden instructions that the model interprets as commands rather than as untrusted data.
▪️Context leakage: The model reveals information from its system prompt, internal instructions, policies, data from other users, or fragments retrieved from context.
▪️Tool abuse: The agent uses connectors with excessive permissions to query, modify, forward, or export information.
▪️Oversized logs and telemetry: Prompts, responses, attached documents, personal identifiers, secrets, or tool outputs end up stored in observability platforms or external providers.
▪️Insecure connectors and MCPs: Plugins, tools, or MCP servers introduce classic vulnerabilities—SSRF, injection, deserialization, privilege escalation—into a flow now governed by natural language.

Public cases showing that the risk is no longer theoretical

The risk of leakage in AI should not be presented as science fiction. There are already public cases that illustrate real patterns: isolation errors, credentials with excessive permissions, exposed log databases, compromised secrets, integrations with development tools, and prompt injection in assistants connected to the user’s environment.

Public case	What happened	Takeaway for the company
OpenAI / ChatGPT, 2023	A failure related to Redis caused the exposure of conversation titles and, during a specific window, payment information of a portion of Plus subscribers.	Even mature platforms can suffer leaks in infrastructure components, cache, or session mechanisms.
Microsoft AI Research, 2023	A failure related to Redis caused the exposure of conversation titles and, during a specific window, payment information of a portion of Plus subscribers.	Sharing datasets or AI artifacts with misconfigured permissions can turn into a massive breach.
Hugging Face Spaces, 2024	The company reported unauthorized access to Spaces secrets and rotated the affected tokens.	Secrets associated with AI environments must be treated as critical assets, not as simple development configuration.
SAP AI Core, 2024	Researchers reported isolation failures and exposure of artifacts and secrets in a multi‑tenant AI platform.	Isolation between customers, workloads, models, and artifacts is essential in MLOps/AI Cloud platforms.
DeepSeek, 2025	An exposed ClickHouse database was reported containing logs, chat history, API keys, and backend details.	AI logs can contain extremely sensitive data and must be protected as high‑impact information.
GitHub Copilot / VS Code, 2025	Scenarios of prompt injection and command injection capable of impacting the developer’s local environment were documented.	Development copilots and agents can touch code, tokens, repositories, and commands; they must be isolated and audited.

How can a leak materialize inside a company?

In a business project, a leak rarely appears in isolation. It is usually the result of several reasonable decisions made by different teams: innovation wants to accelerate the chatbot, IT connects document repositories, the business asks for traceability, development enables tools, and security arrives too late. The result may be a functional system, but with exposure paths that are difficult to detect in a traditional review.

Customer service chatbot connected to internal documentation
The assistant correctly answers public questions, but the RAG index contains documents with internal procedures, commercial scripts, incidents, or customer data that should not be available to external users.

Internal assistant with inherited permissions
An employee with low privileges asks the assistant about a policy, but the agent queries a document source using a service account with global access and returns restricted content.

Development copilot with access to repositories and terminal
Malicious instruction hidden in a README, an issue, or a code snippet can induce the agent to reveal tokens, internal paths, environment variables, or execute unintended commands.

Fine-tuning with real support tickets
The dataset contains customer numbers, temporary credentials, error traces, tokens, user messages, or contractual information. If it is not anonymized, the fine‑tuned model may learn patterns that later appear in responses.

Prompt logs treated as normal technical logs
AI conversations include documents, personal data, business decisions, and code excerpts. If stored without minimization, encryption, or limited retention, the observability system becomes a sensitive repository.

Why are traditional controls not enough?

Many companies try to fit AI into already known controls: web pentesting, API review, cloud hardening, vendor management, or GDPR compliance. All of them are still necessary, but they do not cover by themselves the complexity of an AI ecosystem. The reason is that LLM‑based systems mix three elements that in classical security we try to keep separate: instructions, data, and actions.

A traditional application receives data and executes logic defined by the programmer. A system with AI interprets natural language, retrieves external context, and can decide which tool to use to achieve a goal. That flexibility adds value, but it also makes the boundary between “content to be processed” and “instruction to be obeyed” less clear. That is why prompt injection should not be understood as a simple variant of SQL injection: in many cases there is no perfect sanitization, and the defense depends on architecture, permissions, validations, monitoring, and human control.

What should organizations review before scaling AI?

Before deploying or scaling an AI system connected to corporate data, organizations should perform a specific security and governance review. It is not enough to accept the provider’s terms or trust that the base model is secure: the use case, the architecture, the data, the permissions, and the operational controls must be evaluated.

Area	Key questions	Recommended controls
Data and training	What data is used to train, fine‑tune, or evaluate the system? Are there personal data, secrets, or intellectual property?	Minimization, anonymization/pseudonymization, dataset review, source control, documentation, and impact assessment.
RAG and document bases	Does the index respect permissions? Are public, internal, and confidential documents mixed?	Segmentation by sensitivity, user‑level ACLs, source validation, controlled deletion, and context‑leak testing.
Agents and tools	Which APIs can the agent invoke? Can it read, write, delete, send, or execute?	Least privilege, dedicated service accounts, sandboxing, human approvals, action limits, and auditing.
Providers and SaaS	Are prompts reused for training? Where are logs stored? Which subprocessors are involved?	Contractual review, limited retention, no‑training clauses, data location, evidence, and audit rights.
Logs and telemetry	What is recorded from prompts, responses, documents, and tool calls?	Log minimization, encryption, secret masking, defined retention, restricted access, and exposure alerts.
Offensive testing	Has prompt injection, jailbreak, context leakage, tool abuse, and secret extraction been tested?	AI‑specific red‑teaming, testing on real integrations, connector fuzzing, MCP review, and mitigation validation.

Auditing an AI Ecosystem

The answer is not to slow down AI adoption. The answer is to audit it with the same rigor used to audit critical applications, APIs, cloud, source code, or regulated environments. The difference is that an AI audit must look beyond the model: platform, data, connectors, orchestration, permissions, providers, logs, policies, and adversarial behavior.

A complete assessment should combine technical review and offensive testing. In practice, this means analyzing the architecture, inventorying components, reviewing data flows, validating configurations, checking access controls, evaluating the LLM’s robustness against prompt injection or jailbreak, testing for context leaks, reviewing connectors/MCPs, analyzing dependencies, and delivering a risk‑prioritized action plan.

If your organization is already connecting AI to internal data, customer systems, or corporate tools, the right time to review security is not after the incident: it is before the assistant, the RAG, or the agent goes into production.

In this context, a Comprehensive Security Audit for AI Ecosystems allows for modular evaluation of the components actually deployed by the organization: AI and MLOps platforms, integrations with traditional applications, LLM models and their orchestration, as well as connectors, plugins, or tools that enable agent capabilities with external services.

This approach is especially relevant for companies already using AI in production or scaling use cases with sensitive data, internal systems, customer interaction, or regulatory requirements. The advantage of auditing before scaling is twofold: it reduces the risk of information leakage and provides technical evidence to support governance, compliance, and control‑prioritization decisions.

Before connecting a chatbot to your internal documentation, deploying a corporate RAG, or allowing an agent to interact with business APIs, it is advisable to verify what it can see, what it can do, and what it could leak. A specific AI security audit helps identify vulnerabilities, deviations from best practices, and operational risks before they impact customers, data, or reputation.

At Internet Security Auditors, we provide a comprehensive and modular service aimed at evaluating the security of an organization’s AI components, both at the Platform level and in Integrations with traditional systems, LLM Models and their Orchestration, and Connectors/Plugins/Tools that enable agent capabilities with external services. The scope is configured “tailored” according to the client’s context.

References
Regulatory sources and reference frameworks
AEPD – General policy for the use of generative AI.
https://www.aepd.es/documento/politica-iag-aepd.pdf

AEPD – Guidelines on agentic artificial intelligence.
https://www.aepd.es/guias/orientaciones-ia-agentica.pdf

CNIL - AI system development: recommendations to comply with GDPR.
Guide on the development of AI systems in accordance with the GDPR, minimization, privacy, and security measures. https://www.cnil.fr/en/ai-system-development-cnils-recommendations-to-comply-gdpr

EDPB - Opinion on AI models and GDPR principles.
Reference on model anonymization, legitimate interest, and the effects of unlawful data processing in training.
https://www.edpb.europa.eu/news/news/2024/edpb-opinion-ai-models-gdpr-principles-support-responsible-ai_en

EDPB - AI privacy risks and mitigations in LLMs.
https://www.edpb.europa.eu/system/files/2025-04/ai-privacy-risks-and-mitigations-in-llms.pdf

ENISA - Securing Machine Learning Algorithms.
Taxonomy and threats on machine learning security. https://www.enisa.europa.eu/publications/securing-machine-learning-algorithms

NIST - Adversarial Machine Learning:
A Taxonomy and Terminology. https://csrc.nist.gov/pubs/ai/100/2/e2025/final

NIST - AI RMF Generative AI Profile.
Framework for generative AI risk management. https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf

OWASP - LLM01 Prompt Injection.
https://genai.owasp.org/llmrisk/llm01-prompt-injection/

OWASP - Training Data Poisoning.
https://genai.owasp.org/llmrisk2023-24/llm03-training-data-poisoning/

Public cases cited
OpenAI - March 20 ChatGPT outage.
Incident involving exposure of conversation titles and payment data during a limited window. https://openai.com/index/march-20-chatgpt-outage/

Wiz - Microsoft AI Research data exposure.
Case of exposure associated with an overly permissive SAS token and AI research data. https://www.wiz.io/blog/38-terabytes-of-private-data-accidentally-exposed-by-microsoft-ai-researchers

Microsoft MSRC - mitigated exposure due to overly permissive SAS token.
Confirmation and measures adopted by Microsoft regarding the exposure . https://www.microsoft.com/en-us/msrc/blog/2023/09/microsoft-mitigated-exposure-of-internal-information-in-a-storage-account-due-to-overly-permissive-sas-token

Hugging Face - Spaces secrets disclosure.
Disclosure about unauthorized access to Spaces secrets and token rotation. https://huggingface.co/blog/space-secrets-disclosure

Wiz - SAPwned: SAP AI Core vulnerabilities.
Investigation into isolation failures and exposure of artifacts/secrets in an AI platform. https://www.wiz.io/blog/sapwned-sap-ai-vulnerabilities-ai-security

Wiz - DeepSeek database leak.
Case of an exposed ClickHouse database containing logs, chat history, API keys, and backend details. https://www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak

GitHub - Safeguarding VS Code against prompt injections.
Analysis on prompt injection in AI‑assisted development environments. https://github.blog/security/vulnerability-research/safeguarding-vs-code-against-prompt-injections/

AWS - Security Bulletin AWS-2025-015.
Case involving Amazon Q Developer for VS Code and mitigation measures. https://aws.amazon.com/security/security-bulletins/AWS-2025-015/