Free AI Comes at a Price: How Public LLMs Learn From Your Sensitive Data

When employees paste confidential data into ChatGPT, that information may be used to train future AI models. From government agencies to Fortune 500 companies, the consequences of casual AI use are becoming increasingly clear.

In January 2026, news broke that Dr. Madhu Gottumukkala, the acting director of the Cybersecurity and Infrastructure Security Agency (CISA), had uploaded sensitive government documents marked "For Official Use Only" to the public version of ChatGPT [1]. The head of America's cybersecurity agency, using a consumer AI tool with confidential data. If it can happen there, it can happen anywhere.

This incident isn't an isolated lapse in judgment. It represents a systemic problem affecting organizations of every size and industry. Employees are using free AI tools to boost productivity without understanding that their inputs may be retained, analyzed, and used to train future models. Your company's source code, customer data, strategic plans, and confidential communications could be feeding the next version of a public AI.

The question isn't whether your employees are using AI—they almost certainly are. The question is whether you've established policies and controls to protect your organization's most sensitive information.

34.8%
of employee ChatGPT inputs contain sensitive data, up from 11% in 2023 [2]

The Hidden Cost of "Free" AI Tools

When OpenAI offers ChatGPT for free, users are paying with something else: their data. By default, conversations on Free, Plus, and Pro plans are used to train future AI models [2]. While users can opt out through settings, most never do, and organizations rarely have visibility into how their employees are using these tools.

The data risk extends beyond just training. According to research from Q4 2025, organizations using AI chatbots exposed an average of 3 million sensitive records per company in the first half of 2025 alone [2]. This includes customer information, employee data, source code, financial records, and strategic documents.

What Data Is at Risk?

Cisco research found that employees regularly enter problematic information into AI tools [3]:

  • Employee information (45%) - HR records, performance reviews, salary data
  • Non-public company information (48%) - Strategic plans, financial projections, M&A details
  • Source code and technical documentation - Proprietary algorithms, system architectures
  • Customer data - Contact information, purchasing history, PII
  • Legal documents - Contracts, litigation details, privileged communications

The "Delete" Button Doesn't Always Work

In May 2025, a US Magistrate Judge ordered OpenAI to preserve all ChatGPT conversation logs indefinitely as part of the New York Times copyright lawsuit. During this period, even deleted conversations were archived in a separate legal hold [2]. Your "deleted" data may not be as gone as you think.

Real-World Breaches: When AI Use Goes Wrong

The risks of uncontrolled AI use aren't theoretical. Multiple high-profile incidents demonstrate what happens when employees treat public AI tools like secure enterprise software.

Case Study: CISA Acting Director (2025)

Dr. Madhu Gottumukkala, despite having special authorization to use ChatGPT with stated "controls in place," uploaded at least four documents marked "For Official Use Only" to the public platform between mid-July and early August 2025. The documents contained government contracting details including procurement processes, vendor relationships, and pricing structures [1].

CISA's automated security monitoring detected the uploads and triggered alerts, leading to an internal investigation. The incident demonstrates that even security-aware organizations with monitoring in place can fail to prevent data exposure when users have authorized access to AI tools.

Case Study: Samsung (2023)

Samsung engineers leaked confidential semiconductor source code, internal meeting notes, and test data through three separate incidents in under a month [4]. In one case, an engineer pasted buggy source code into ChatGPT to fix errors. In another, meeting minutes were uploaded to generate a summary.

Because consumer ChatGPT retained user inputs for training at the time, Samsung's trade secrets effectively became part of OpenAI's training data. Samsung subsequently banned generative AI tools on internal networks and developed its own in-house solution [4].

The Shadow AI Problem

In February 2025, security researchers discovered a coordinated campaign that compromised over 40 popular browser extensions used by 3.7 million professionals [2]. Many of these were "productivity boosters" that employees had installed without IT approval. Once compromised, these extensions could silently scrape data from active browser tabs, including sensitive corporate sessions open in ChatGPT.

This "Shadow AI" problem extends beyond browser extensions. Employees use personal devices, consumer accounts, and unapproved tools to get work done faster, creating data exposure that IT teams never see.

Why Organizations Are Banning Public AI Tools

In response to these risks, many organizations have implemented outright bans on public AI tools. The list includes major players across multiple industries [3]:

Financial Services

  • JPMorgan Chase - Restricted to ensure compliance with third-party software regulations
  • Goldman Sachs, Bank of America, Citigroup - Similar restrictions prioritizing internal AI development
  • Deutsche Bank - Banned to prevent leakage of confidential banking data

Technology

  • Apple - Restricted after concerns about leaking product roadmaps and code
  • Samsung - Banned after multiple data leak incidents
  • Amazon - Issued warnings about sharing sensitive information

Government, Defense, and Telecom

  • Department of Energy - Temporarily blocked access for employees and managed customers
  • Northrop Grumman - Blocked public AI to protect national security data
  • Verizon - Not accessible from corporate systems

Italy became the first Western country to temporarily ban ChatGPT in 2023 over GDPR concerns, and OpenAI was fined €15 million by Italian authorities for GDPR violations in December 2024 [2].

Consumer vs. Enterprise AI: Understanding the Difference

Not all AI access is created equal. Understanding the difference between consumer and enterprise offerings is critical for making informed decisions about AI use in your organization.

Feature                    | Free/Plus/Pro    | Enterprise/Business
Training on your data      | Yes (by default) | No (disabled by default)
Data retention control     | Limited          | Admin-controlled
SSO/identity integration   | No               | Yes
Audit logging              | No               | Yes
Data Processing Agreement  | No               | Yes (signed DPA)
Compliance support         | Minimal          | SOC 2, GDPR, HIPAA eligible

According to OpenAI's enterprise privacy documentation, business data from ChatGPT Enterprise, Business, and Healthcare plans is not used for training, and organizations have ownership and control over their inputs and outputs [5]. However, even enterprise solutions aren't a silver bullet—proper configuration, user training, and monitoring are still required.

The Cost of Getting It Wrong

According to IBM data, companies with unmonitored "shadow AI" in their environments faced an average of $670,000 more in data breach costs than those with proper AI governance [2]. Enterprise AI solutions cost money, but the alternative may cost much more.

Regulatory and Compliance Implications

Using public AI tools with sensitive data can create significant compliance exposure, particularly for organizations subject to data protection regulations.

GDPR (General Data Protection Regulation)

GDPR mandates strict data handling and requires explicit consent for data transfers outside the EU. When employees paste customer data into US-based AI services, organizations may violate data transfer requirements, consent obligations, and the right to erasure [2].

HIPAA (Health Insurance Portability and Accountability Act)

Healthcare organizations must ensure that any AI tools processing protected health information (PHI) have appropriate Business Associate Agreements in place. Consumer AI tools don't meet this requirement.

CMMC (Cybersecurity Maturity Model Certification)

Defense contractors handling Controlled Unclassified Information (CUI) must implement strict access controls. Using public AI with CUI could jeopardize certification and contract eligibility.

Industry Frameworks

In May 2025, CISA, NSA, and FBI released joint guidance on AI data security, outlining ways AI data may become compromised including unauthorized access, data tampering, poisoning attacks, and inadvertent data leakage [6]. The guidance emphasizes mitigation strategies defined by the NIST AI Risk Management Framework [7].

Protecting Your Organization: A Practical Framework

Banning AI entirely isn't realistic for most organizations. Employees will find workarounds, and you'll lose the productivity benefits that AI offers. Instead, implement a governance framework that balances innovation with risk management.

1. Establish a Clear AI Use Policy

Your policy should explicitly address:

  • Approved tools: Which AI platforms are sanctioned for business use
  • Prohibited data types: What information can never be entered into AI tools
  • Use cases: Appropriate and inappropriate applications of AI
  • Consequences: What happens when policies are violated

2. Implement Technical Controls

  • Deploy enterprise AI solutions with data protection guarantees
  • Block access to consumer AI from corporate networks if necessary
  • Monitor for data leakage using DLP (Data Loss Prevention) tools
  • Implement browser controls to prevent unauthorized extension installation
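To illustrate the DLP idea, the sketch below scans outbound text for common sensitive-data patterns before it reaches an AI tool. The patterns and the `scan_prompt` helper are hypothetical examples for this article, not a real product's API; a production deployment would use a commercial DLP engine with validated detectors rather than ad-hoc regexes.

```python
import re

# Illustrative patterns only -- a real DLP deployment would rely on a
# vetted detection engine, not hand-rolled regexes like these.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def scan_prompt(text: str) -> list[str]:
    """Return the names of sensitive-data patterns found in a prompt."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(text)]

# Example: a prompt containing an email address and an SSN is flagged.
findings = scan_prompt("Contact jane@example.com, SSN 123-45-6789")
```

A gateway or browser plugin could call a check like this and block, redact, or log the submission before any data leaves the corporate network.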

3. Train Your Workforce

Most employees don't understand the risks. Training should cover:

  • How AI training works and why data retention matters
  • Examples of data types that should never be shared
  • How to use approved AI tools safely
  • Reporting procedures for suspected violations

4. Audit and Monitor

  • Regularly audit AI tool usage across the organization
  • Monitor for shadow AI and unauthorized tools
  • Review and update policies as AI capabilities evolve
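One practical way to surface shadow AI is to mine existing proxy or firewall logs for traffic to known consumer AI domains. The sketch below assumes a simple log format where each line starts with a username followed by the requested URL; the domain list and field layout are assumptions you would adapt to your own logging infrastructure.

```python
from collections import Counter

# Hypothetical watchlist of consumer AI domains; extend it to match
# your organization's approved-tools policy.
AI_DOMAINS = ("chat.openai.com", "chatgpt.com", "gemini.google.com",
              "claude.ai", "copilot.microsoft.com")

def audit_proxy_log(lines):
    """Count requests per (user, domain) to known consumer AI services.

    Assumes each log line begins with a username and contains the
    requested URL, e.g. 'jdoe GET https://chatgpt.com/c/abc123'.
    """
    hits = Counter()
    for line in lines:
        user = line.split()[0]  # first field = username (assumed format)
        for domain in AI_DOMAINS:
            if domain in line:
                hits[(user, domain)] += 1
    return hits

sample_log = [
    "jdoe GET https://chatgpt.com/c/abc123",
    "asmith GET https://intranet.example.com/home",
    "jdoe GET https://claude.ai/chat/xyz",
]
report = audit_proxy_log(sample_log)
```

A report like this gives IT a starting point for conversations with heavy users and for deciding which tools to formally sanction.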

AI Data Security Checklist

  • Document approved AI tools and their security configurations
  • Create data classification guidelines for AI use
  • Disable training data sharing on all consumer AI accounts
  • Implement DLP monitoring for AI platforms
  • Train employees on AI data risks quarterly
  • Review AI vendor security documentation and DPAs
  • Establish incident response procedures for AI data exposure
  • Align AI policies with existing compliance frameworks

Is Your Organization Protected from AI Data Risks?

LocalEdgeIT helps Denver businesses implement secure AI policies and protect sensitive data from unauthorized exposure.

Get Your Free Security Assessment

Key Takeaways

Summary for IT Leaders

  • Free AI tools use your data for training by default—and your employees are using them
  • Nearly 35% of ChatGPT inputs now contain sensitive data
  • Even CISA's acting director exposed government data through consumer AI
  • Major companies including Apple, Samsung, and JPMorgan have banned or restricted public AI
  • Enterprise AI solutions offer data protection but require proper configuration
  • Compliance frameworks including GDPR, HIPAA, and CMMC have AI data implications
  • The solution isn't banning AI—it's implementing governance, training, and controls

Next Steps

AI is transforming how work gets done, and that transformation isn't slowing down. Organizations that establish clear policies and appropriate controls now will be better positioned to capture AI's benefits while managing its risks.

At LocalEdgeIT, we help Denver businesses navigate the complex intersection of AI productivity and data security. From policy development to technical implementation, our team provides the guidance needed to use AI safely and effectively.

Ready to protect your organization's data? Take our free IT Security Assessment to evaluate your current AI governance posture, or contact us to discuss how we can help implement secure AI practices.

Sources & Additional Resources

  1. CISA Acting Director ChatGPT Government Data Breach 2026 - The Small Business Cybersecurity Guy, January 2026
    https://thesmallbusinesscybersecurityguy.co.uk/blog/cisa-acting-director-chatgpt-government-data-breach-2026
    Analysis of the CISA incident based on Politico reporting.
  2. Is ChatGPT Safe? The Complete 2026 Security & Privacy Guide - ESET, 2026
    https://www.eset.com/blog/en/home-topics/cybersecurity-protection/is-chatgpt-safe-2026-guide/
    Comprehensive guide on ChatGPT data privacy and security.
  3. Companies Banning ChatGPT (2025): The Enterprise Security List - Moveo.AI
    https://moveo.ai/blog/companies-that-banned-chatgpt
    Tracking organizations restricting AI tool access.
  4. Samsung Engineers Feed Sensitive Data to ChatGPT - Dark Reading, 2023
    https://www.darkreading.com/vulnerabilities-threats/samsung-engineers-sensitive-data-chatgpt-warnings-ai-use-workplace
    Details on the Samsung data leak incident.
  5. Enterprise Privacy at OpenAI - OpenAI
    https://openai.com/enterprise-privacy/
    Official documentation on ChatGPT Enterprise data handling.
  6. Joint Cybersecurity Information: AI Data Security - NSA/CISA/FBI, May 2025
    https://media.defense.gov/2025/May/22/2003720601/-1/-1/0/CSI_AI_DATA_SECURITY.PDF
    Official joint guidance on AI data security.
  7. AI Risk Management Framework - NIST
    https://www.nist.gov/itl/ai-risk-management-framework
    NIST framework for managing AI-related risks.
  8. NIST Updates Privacy Framework to Address AI - Jones Day, May 2025
    https://www.jonesday.com/en/insights/2025/05/nist-updates-its-privacy-framework-to-address-ai
    Analysis of NIST Privacy Framework 1.1 updates.
  9. A 2026 Guide to ChatGPT Risks - Concentric AI
    https://concentric.ai/chatgpt-security-risks-in-2026-a-guide-to-risks-your-team-might-be-missing/
    Enterprise-focused guide on ChatGPT security risks.
  10. CISA Releases AI Data Security Guidance - Inside Government Contracts, June 2025
    https://www.insidegovernmentcontracts.com/2025/06/cisa-releases-ai-data-security-guidance/
    Summary of CISA's AI data security guidance for federal contractors.