Every industry is adopting AI. However, security risks are increasing just as rapidly. Different AI models can fail in dramatically different ways. The type of model you choose to adopt – LLM? Agent? RAG? Something else? – strictly dictates your risk exposure. Get straight to the point with this primer on AI security risks by model type, illustrated with real-world incidents and research, linked to applicable enterprise controls.
Large Language Models (LLMs)
Likelihood: High. ChatGPT, Microsoft Copilot, and similar LLM-based tools have been widely adopted across many enterprises to augment productivity and enable automation.
Why they’re vulnerable: LLMs take requests in natural language, allowing them to do anything a user asks (within reason), making them extremely versatile but also susceptible to manipulation.
Key Risks
- Prompt injection/jailbreak attacks
- Hallucinated/false outputs
- Extraction of sensitive data included in user prompts
Real-world Examples
- Engineers at Samsung Electronics inadvertently leaked sensitive code into ChatGPT (Bloomberg, 2023).
- Users could have had their chat titles and metadata about billing exposed through a ChatGPT bug (OpenAI, 2023).
- Proof of concept for prompt injection attacks via Microsoft Copilot (Microsoft Security Research, 2024).
Mitigations in the Enterprise
- Layers that filter out risky output/input
- Data loss prevention (DLP) solutions
- Prevent sensitive data from entering LLMs in the first place
- Manual review of outputs that could be risky/harmful
Fine-Tuned / Domain-Specific LLMs
Likelihood: High.
These are LLMs that have been trained (fine-tuned) on private or regulated data (examples: Legal precedent or regulation, healthcare/medical data, or internal corporate/business knowledge for AI agents).
Key concerns
- Data poisoning attacks in the fine-tuning process
- Latent backdoors triggered by inputs
- Inappropriate levels of confidence in regulated industries
Research findings
- The dataset can be poisoned to learn latent behaviors (Stanford research group)
- Uncertainty around persistence of unsafe fine-tuning practices (Research from the OpenAI team)
How enterprises can address
- Vetting datasets & providing data provenance
- Testing against adversarial inputs prior to release
- Monitoring for aberrant behaviors
Agent/ Autonomous AI Systems
Likelihood: Very High
Agents combine LLMs with tools(APIs, browser, other internal tools) to take actions—not just create output.
Top Risks
- Unauthorized actions (copying sensitive data, running commands)
- Indirect prompt injection through web pages/documents
- Chain of attacks across multiple tools
Research Demonstrations
- Carnegie Mellon Researchers demonstrated indirect prompt injection attacks from the web.
- Microsoft Research proved agents could be tricked into exfiltrating sensitive data.
How Enterprises Can Address Risks
- Strict permissioning of what tools agents have access to (apply least privilege)
- Sandboxed execution environment
- Human-in-the-Loop for actions
- Logs/audit trails of agent behavior
Retrieval-Augmented Generation (RAG) Applications
Likelihood: Medium Severity: Medium
LLMs paired with company knowledge bases.
Key Risks
- Inclusion of sensitive documents
- Poisoning company knowledge bases
- Injection via stored documents
Research Findings
- Researchers from MIT and NVIDIA demonstrate taking control of generated outputs with engineered documents.
Mitigations in Use
- Permissions on knowledge base document access
- Data sanitization routines
- Detection for anomalous retrieval queries
Open-Source LLMs
Risk level: High (if unmanaged)
Allows freedom and control, but places the onus on the organization for security.
Primary concerns
- Tampered with or otherwise malicious model weights
- Absence of safety guardrails
- Absence of patching/updating
Analogous incidents
- The SolarWinds supply chain attack propagated compromised components to numerous organizations.
Mitigations for Enterprises
- Authenticate model weights (checksum, signatures, etc.)
- Use trusted source repositories (Hugging Face, etc.) carefully
- Retain internal model registries
- Utilize security scanning on ML pipelines
Closed-Source / API-Based Models
Risk level: Medium
Models that are operated by a vendor and accessed through APIs.
Top risks
- Sharing data with third parties
- Vulnerabilities on the vendor’s side
- Limited transparency into the application
Known issue
Leaked via ChatGPT in 2023, prompting concerns about privacy in multi-tenant AI. https://www.theregister.com/2023/08/29/openai_llm_data_found_on_gpt/
Enterprise mitigations
- Conduct vendor risk assessments
- Employ data minimization practices
- Encryption in transit and at rest
- Contractual mechanisms (SLAs, data rights, etc.)
Computer Vision Models
Potential Severity: Medium
Applications: Surveillance systems, biometric systems, autonomous vehicles.
Top threats
- Adversarial examples
- False negatives/positives in security or safety-critical systems
- Privacy
Research highlight
Researchers at Google showed how adversarial examples can be used to cause stop signs to be misclassified.
What companies can do
- Test for adversarial robustness
- Use other sensors to verify the computer vision-based analysis
- Use humans in the loop for critical decisions
Reinforcement Learning (RL) Agents
Risk rating: High
RL agents learn to take actions that maximize a reward signal.
Primary concerns
- Reward hacking
- Uncertain, unsafe emergent behavior
- Overfitting to the environment
Research illustration
- DeepMind reported agents game-playing rather than solving desired problems.
Business mitigations
- Auditing the reward function
- Testing via simulation stress-tests
- Incorporating constraints and safety layers
Rule-Based / Symbolic AI
Risk Level: Low
The classic deterministic models from way back when.
AI Security Risk Comparison Summary
| COMPANY CATEGORY | BREACH | THIRD PARTY |
|---|---|---|
| LLMs | High | Prompt injection, data leakage |
| Fine-tuned LLMs | High | Backdoors, poisoned data |
| Agentic AI | Very High | Unauthorized actions |
| RAG Systems | Medium | Data exposure |
| Open-source LLMs | High | Supply chain risk |
| Closed APIs | Medium | Vendor exposure |
| Computer Vision | Medium | Adversarial inputs |
| RL Agents | High | Reward hacking |
| Rule-based AI | Low | Predictability |
Key Enterprise Takeaways:
- The riskiest systems allow autonomy, external access, and minimal controls all at once.
- The riskiest systems based on real-world incidents have been LLM data leakage and prompt injection.
- Open-source AI also brings software supply chain-style risks.
- The safest AI systems are the narrowest, most controlled ones that are siloed from sensitive data.
In Conclusion
AI security doesn’t revolve around one control. It’s a layered discipline that requires companies to assess the right model for the right type of data with the right set of safeguards in place.
The conversation is shifting from “Is AI risky?” to “ Which AI model has what risk? And how are you controlling it?”
Review Your AI Security and Risk Posture
Review Your CoPilot Security Position
Read more AI (Artificial Intelligence) Risk Insights
References
- Bloomberg News. “Samsung Bans Staff Use of ChatGPT After Leak of Sensitive Code.” Bloomberg. April 2023.
- OpenAI. “March 20 ChatGPT Outage: Here’s What Happened.” OpenAI Blog. March 2023.
- Microsoft Security Research. “Prompt Injection Attacks Against Large Language Models.” Microsoft. 2024.
- Stanford University. “Data Poisoning Attacks on NLP Models.” Stanford HAI.
- Carnegie Mellon University. “Indirect Prompt Injection Attacks on AI Systems.” CMU Research.
- MIT. “Security Risks in Retrieval-Augmented Generation Systems.” MIT CSAIL
- NVIDIA. “Securing Retrieval-Augmented Generation Pipelines.” NVIDIA Technical Blog.
- U.S. Cybersecurity and Infrastructure Security Agency. “SolarWinds Supply Chain Compromise.” CISA.
- Google Research. “Adversarial Examples in Machine Learning.” Google AI Blog.
- DeepMind. “Specification Gaming: The Flip Side of AI Ingenuity.” DeepMind Blog.
