Model Theft Is Real — And Most Enterprises Have No Defense

The Question

Your organization has invested significant resources in AI: fine-tuning a foundation model on proprietary data, building a custom AI application, training a specialized model for your domain. That model — its weights, its architecture, its training data — is an intellectual property asset. Are you protecting it like one?

Most enterprises are not. The controls applied to source code, financial data, and customer records — version control, access management, audit logging, DLP — are rarely applied with equivalent rigor to AI models. The gap is real and exploitable.

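OWASP's 2025 Top 10 for LLM Applications lists poisoning as LLM04: Data and Model Poisoning and covers model extraction under LLM10: Unbounded Consumption; the earlier 2023 list gave Model Theft its own entry, also numbered LLM10. Theft is the offensive complement to poisoning: where poisoning corrupts a model, theft extracts it. Both attacks target the same underprotected asset.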

Why This Matters Now

In February 2023, Meta released LLaMA, its large language model, to a limited set of approved researchers under a controlled access agreement. Within about a week, the full model weights had been leaked and posted on 4chan. From there they spread to torrent sites, GitHub repositories, and Hugging Face. By early March 2023, what Meta had intended as controlled academic access had become an open download.

The LLaMA leak was not a sophisticated attack. It was a straightforward insider incident: someone with legitimate access shared what they had access to. The damage was significant — competitors and bad actors had access to a model that took Meta considerable resources to train, without Meta's consent and outside any usage policy.

LLaMA was a research model, and Meta later chose to release its successors openly anyway. For enterprises that have fine-tuned proprietary models on internal data (customer behavior, clinical records, financial transactions, manufacturing processes), the stakes of an equivalent leak are considerably higher. The model weights themselves encode patterns from the training data. Extracting the weights is, in some cases, a path to extracting the data.

What the CURVE™ Data Shows

The 2026 Stackcurve AI Security CURVE™ Report covers the AI Model Security category — vendors focused on protecting model weights, architectures, and training pipelines against theft, tampering, and supply-chain compromise. Representative vendors include Adversa AI, Cranium, and the model security capabilities now embedded in Palo Alto Networks following the Protect AI acquisition.

What the CURVE™ data shows is that this category remains underinvested relative to the risk. Enterprises that have rigorous controls around customer data often have almost no controls around the AI models trained on that data. The model is treated as an application component, not as a sensitive asset requiring its own protection regime.

The second finding from the research: the supply chain attack surface is larger than most buyers appreciate. In 2024, security researchers discovered more than 100 models on the Hugging Face model hub that contained embedded malware — backdoors, data exfiltration payloads, and code execution capabilities hidden inside model files that appeared legitimate. Organizations that download pre-trained models or open-source components and deploy them without security scanning are importing an unaudited attack surface directly into their AI infrastructure.
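
To make the scanning idea concrete, here is a minimal sketch of one check a scanner might perform on a pickle-based model file: listing opcodes that can import or call arbitrary objects, without ever loading the file. The function name `suspicious_opcodes` and the opcode set are illustrative assumptions, not the method used by any vendor named in this brief.

```python
import pickle
import pickletools

# Assumption: an illustrative set of opcodes that can import or invoke
# arbitrary callables while a pickle is being loaded.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_opcodes(data: bytes) -> list[str]:
    """List risky opcode names in a pickle byte stream without unpickling it."""
    found = []
    # genops walks the opcode stream statically; nothing is executed.
    for opcode, _arg, _pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            found.append(opcode.name)
    return found


if __name__ == "__main__":
    # A plain dict of floats needs no imports or calls to reconstruct.
    plain_weights = pickle.dumps({"weights": [1.0, 2.0, 3.0]})
    print(suspicious_opcodes(plain_weights))
```

Legitimate checkpoints from common frameworks also use some of these opcodes to rebuild tensors, so a production scanner would need an allowlist of known-safe imports rather than a blanket flag. Treat this as a triage signal only.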

The full vendor rankings are in the 2026 AI Security CURVE™ Report — free to download.

The Gap Most Buyers Miss

Model theft takes two forms, and most enterprises have defenses for neither.

Direct theft — unauthorized access to model weights, fine-tuning data, or training infrastructure. This is the LLaMA pattern: someone with access, authorized or compromised, extracts the model artifact. The defense is identical to protecting any other high-value intellectual property: access controls, audit logging, DLP on model files, monitoring for large file transfers, and insider threat awareness.
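
As a rough illustration of monitoring for large model-file transfers, the sketch below filters access-log records for model artifacts and flags reads by principals who are not on an allowlist, or transfers above a size threshold. The `AccessEvent` schema, the suffix list, and the one-gigabyte default are all assumptions for the example; real storage access logs use different field names.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Assumption: file suffixes commonly used for model artifacts.
MODEL_SUFFIXES = {".safetensors", ".pt", ".bin", ".gguf", ".onnx", ".ckpt"}


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical, simplified access-log record."""
    principal: str
    path: str
    bytes_transferred: int


def flag_model_transfers(events, allowed_principals, max_bytes=1_000_000_000):
    """Return (event, reason) pairs for model-file reads that deserve review."""
    alerts = []
    for event in events:
        # Ignore anything that does not look like a model artifact.
        if PurePosixPath(event.path).suffix.lower() not in MODEL_SUFFIXES:
            continue
        if event.principal not in allowed_principals:
            alerts.append((event, "principal not on allowlist"))
        elif event.bytes_transferred > max_bytes:
            alerts.append((event, "large transfer"))
    return alerts
```

An allowlisted principal pulling a full set of weights is still flagged, which matches the insider pattern described above: the access was legitimate, the transfer was not expected.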

Model extraction attacks: an adversary queries a deployed model through its API, collects enough input-output pairs, and uses that data to train a functionally equivalent copy without ever accessing the original weights. This is subtler and does not require any insider access. Published research has demonstrated that parts of a deployed language model, such as its hidden dimension and final projection layer, can be recovered through systematic API querying. A sufficiently motivated attacker with API access can steal much of the functional value of a model without stealing the artifact itself.

The defense against extraction attacks requires monitoring at the API layer: detecting systematic, high-volume querying patterns that look more like model interrogation than normal usage. Rate limiting, query diversity monitoring, and output watermarking are the relevant controls. Most enterprises have none of them.
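
A minimal sketch of query diversity monitoring, under stated assumptions: it keeps a sliding window of prompts per client and flags a client whose volume is high and whose prompts are almost all distinct. The class name `ExtractionMonitor` and every threshold are placeholders to be tuned against real traffic, and high diversity at high volume is only a heuristic signal, not proof of extraction.

```python
from collections import defaultdict, deque


class ExtractionMonitor:
    """Hypothetical per-client sliding-window monitor for API queries."""

    def __init__(self, window_seconds=3600.0, max_queries=1000, min_distinct_ratio=0.9):
        self.window_seconds = window_seconds
        self.max_queries = max_queries
        self.min_distinct_ratio = min_distinct_ratio
        # client_id -> deque of (timestamp, prompt_hash), oldest first
        self._history = defaultdict(deque)

    def record(self, client_id: str, prompt: str, now: float) -> bool:
        """Record one query; return True if the client now looks suspicious."""
        history = self._history[client_id]
        history.append((now, hash(prompt)))
        # Drop entries that have aged out of the window.
        while history and now - history[0][0] > self.window_seconds:
            history.popleft()
        total = len(history)
        distinct = len({prompt_hash for _, prompt_hash in history})
        # Flag only when volume is high AND nearly every prompt is unique.
        return total > self.max_queries and distinct / total >= self.min_distinct_ratio
```

Timestamps are passed in explicitly so the logic can be replayed over historical logs. A client that repeats the same prompt many times trips the volume check but not the diversity check, which is the distinction between heavy normal use and systematic probing that this control is meant to draw.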

Questions Your Buying Team Should Be Asking

1. Where are your fine-tuned model weights stored, and who has access? Model weights should be treated with the same access control rigor as source code or financial records. If the answer is "in an S3 bucket with broad access" or "on a shared file server," the access model needs immediate tightening.

2. Do you scan open-source and third-party models before deploying them? The Hugging Face malware findings established that pre-trained models are a supply chain attack vector. Tools including ModelScan and the model security capabilities in the Palo Alto Protect AI platform provide scanning capability. If you are downloading and deploying models without scanning them, you are importing an unaudited attack surface.

3. Are you monitoring your model APIs for extraction-pattern queries? High-volume, systematically varied querying is the signature of a model extraction attempt. If your API monitoring is not specifically tuned to detect this pattern, you have no visibility into whether extraction is occurring.

4. Is your training pipeline governed with the same rigor as the data it processes? Training data often contains sensitive information: PII, proprietary business logic, regulated health or financial data. The pipeline that processes this data is as sensitive as the data itself. Is it protected accordingly?

5. Have you applied output watermarking to your proprietary models? Watermarking — embedding detectable patterns in model outputs — allows you to identify whether a competitor or adversary is using a stolen copy of your model. It does not prevent theft, but it provides forensic capability to detect it. Few enterprises have implemented this; fewer still know it exists.
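
For readers unfamiliar with how watermark detection can work, the sketch below shows the detection half of a simplified "green list" scheme of the kind described in published watermarking research. A secret key and the previous token pseudo-randomly mark a fraction `gamma` of possible next tokens as green, and a z-score measures whether a text contains more green tokens than chance would predict. The function names, the hashing construction, and the whitespace-level tokens are assumptions for illustration; the generation side, which biases sampling toward green tokens, is not shown.

```python
import hashlib
import math


def is_green(secret_key: str, prev_token: str, token: str, gamma: float = 0.5) -> bool:
    """Pseudo-randomly assign `token` to the green list, keyed on the previous token."""
    material = f"{secret_key}\x00{prev_token}\x00{token}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return value < gamma


def watermark_z_score(tokens: list[str], secret_key: str, gamma: float = 0.5) -> float:
    """Z-score of the green-token count against the no-watermark expectation."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    green = sum(is_green(secret_key, prev, cur, gamma) for prev, cur in pairs)
    n = len(pairs)
    return (green - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

Unwatermarked text should score near zero, while text generated with a strong bias toward green tokens scores several standard deviations higher. The detection threshold is a policy choice that trades false positives against missed detections.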

The Stackcurve Take

Model theft is the AI-era equivalent of source code theft — and the enterprise security playbook for source code theft is well-established. Version control with access logging. Strict permissions with least privilege. DLP controls on file transfer. Monitoring for anomalous access patterns. Insider threat awareness. That playbook transfers directly to model security; most enterprises simply have not applied it.

The supply chain dimension is new and requires specific action. Every open-source model, every pre-trained component, every third-party AI integration is a potential supply chain vector. Scanning before deployment is the minimum viable control — the equivalent of running a virus scan on a downloaded executable, which no enterprise would skip.

The model extraction risk is the hardest to address because it exploits legitimate API access. Rate limiting provides some protection. Output watermarking provides detection after the fact. Behavioral monitoring of API query patterns is the most proactive control available, and it requires explicit investment in monitoring infrastructure tuned for this specific pattern.
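
Rate limiting is the simplest of these controls to prototype. The sketch below is a standard token bucket, assuming one bucket per API client; the class name, capacity, and refill rate are illustrative values, not recommendations.

```python
class TokenBucket:
    """Token-bucket limiter: allows bursts up to `capacity`, then a steady refill rate."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request fits the budget."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time elapsed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A rate limit raises the cost and duration of an extraction attempt but does not stop a patient attacker or one who spreads queries across many accounts, which is why it is described above as partial protection.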

Start with the direct theft controls — they are the highest impact and easiest to implement. Then address the supply chain gap. Then build toward extraction monitoring. None of this requires a dedicated AI security product to start; it requires applying existing security disciplines to an asset category you may not have been treating as sensitive.

The 2026 Stackcurve AI Security CURVE™ Report covers the AI Model Security and AI Supply Chain Security categories in detail.

Stackcurve Advisory Briefs are independent research. No vendor pays for placement, tier assignment, or editorial influence. The CURVE™ methodology is disclosed in full at stackcurve.net/research/methodology.