SecureCode:fyi
Posts
Securing Generative AI: Strategies, Methodologies, Tools and Best Practices - Course Overview

Securing Generative AI: Strategies, Methodologies, Tools and Best Practices - Course Overview

Great course for AI Security overview by Omar Santos from Cisco's Security and Trust Organization

Thomas Nguyen
January 26, 2025

Securing Generative AI: Strategies, Methodologies, Tools and Best Practices

Course Overview

This comprehensive course, taught by Omar Santos from Cisco's Security and Trust Organization, covers the entire spectrum of AI security. The course is structured into six lessons, each focusing on critical aspects of securing AI implementations.

LinkedIn Link for the course: https://www.linkedin.com/learning/securing-generative-ai-strategies-methodologies-tools-and-best-practices

The course is 4 hours long, but trust me, there are ton of information, not just focusing on AI but security in general to help you understand the concept of Security & AI Security, so I would recommended to review/playback whatever you can

Course Mind Map

Lesson Summaries

Lesson 1: Introduction to AI Threats and LLM Security

LLM Current State
- LLMs are widely adopted across industries
- Used in: ChatGPT, Claude, Grok, search engines, operating systems
- Transforming interactions in medicine, cybersecurity, programming
- Unlike predecessors, modern LLMs can understand and generate human-like text and images
RAG (Retrieval Augmented Generation)
- Enhances LLM capabilities with external knowledge
- Sources from document collections and databases
- Main purpose: enhance quality of responses
- Helps reduce likelihood of hallucinations
Security Frameworks
- OWASP Top 10 for LLMs:
  - Provides baseline for LLM application security
  - Includes prompt injection, output handling, data poisoning
- MITRE ATLAS:
  - Similar structure to ATT&CK framework
  - Maps AI-specific tactics and techniques
  - Includes real-world examples and case studies
- NIST Taxonomy:
  - Provides structured approach to AI security
  - Enables common language for discussing AI security
  - Helps develop comprehensive security strategies

Lesson 2: Prompt Injection and Insecure Output Handling

Prompt Injection Attacks
- Targets language models through natural language components
- Affects all types of models: ChatGPT, Microsoft Phi, Llama, Mistral
- Real-world example: Microsoft Bing Chat manipulation through website access
- Can occur through:
  - Direct user interface interaction
  - Indirect methods (loaded documents, websites)
ChatML and System Prompts
- System prompts define behavior, capabilities, and limitations
- Invisible to end users but fundamental to model operation
- Meta prompt extraction attacks can reveal system prompts
- Toxicity attacks can manipulate model behavior
Privilege Control
- LLM stack components:
  - Inference models (proprietary/open source)
  - Orchestration libraries (LlamaIndex, LangChain)
  - Frontend applications
- Common issues:
  - Too much permission given to agents
  - Exceeding original user permissions
  - Lack of role-based access control
Output Handling Security
- OWASP ASVS defines three security levels:
  - Level 1: Basic security
  - Level 2: Standard security (most applications)
  - Level 3: Critical applications (medical, high-volume transactions)
- Key protections:
  - Input validation
  - Output encoding
  - Sandboxing for code execution
  - Protection against XSS, CSRF, SSRF
API Security Best Practices
- Use secure vaults for API keys (HashiCorp Vault, AWS Secrets Manager)
- Implement token rotation
- Monitor API token usage
- Regular audits of permission settings
- Automated monitoring of authentication attempts

Lesson 3: Training Data Poisoning, Model DoS & Supply Chain

Training Data Poisoning
- Core concept: Manipulating training data to influence model behavior
- Attack objectives:
  - Degrade model performance
  - Introduce specific vulnerabilities
  - Create backdoors for later exploitation
- Attack methods:
  - Label manipulation (e.g., mislabeling training samples)
  - Data injection (adding malicious data points)
  - Altering existing data points
- Defense strategies:
  - Data sanitation
  - Robust data collection processes
  - Quality control in labeling
  - Differential privacy techniques
Model Denial of Service (DoS)
- Occurs when attackers:
  - Create resource-consuming prompts
  - Increase underlying resource costs
  - Decline quality of service
- Attack techniques:
  - High volume generation tasks
  - Repetitive long inputs exceeding context window
  - Recursive context expansion
  - Flooding with variable length inputs
- Impact:
  - Service degradation
  - Increased inference costs
  - Resource exhaustion
Supply Chain Security
- Critical components:
  - Data collection and model development
  - Pre-trained models from sources like Hugging Face
  - Third-party libraries
  - Vector databases
- Known incidents:
  - Manipulated models on Hugging Face
  - Data breaches in AI companies
- Security measures:
  - Create comprehensive inventory
  - Implement AI Bill of Materials (BOM)
  - Use standards like SPDX 3.0 or CycloneDX
  - Track model provenance
Cloud Environment Security
- Popular platforms:
  - Google Vertex AI
  - Amazon Bedrock
  - Azure AI Studio
  - Amazon SageMaker
- Key advantages:
  - Scalability
  - Cost efficiency
  - Integration capabilities
- Security considerations:
  - Access controls
  - Data protection
  - Resource management

Lesson 4: Information Disclosure, Plugin Design & Agency

Sensitive Information Disclosure
- AI models can leak sensitive information through output/inference
- Two-way trust boundary issues:
  - User input (prompts) cannot be trusted
  - Model output (inference) cannot be trusted
- Provider considerations:
  - Clear terms of use policies
  - Data handling transparency
  - Opt-out options for training data
- Mitigation approaches:
  - Data governance strategy
  - Input/output monitoring
  - System prompt restrictions
Plugin Security Risks
- Core issues:
  - Automatic plugin invocation during user interactions
  - Operation without application control
  - Free text inputs without validation
- Vulnerability examples:
  - Single text field accepting all parameters
  - Configuration strings overriding settings
  - Raw SQL statements in parameters
- Attack scenarios:
  - Malicious domain injection
  - SQL injection through advanced filters
  - Repository ownership transfer through indirect prompt injection
Excessive Agency
- Definition: LLM agents given too much authority
- Risk factors:
  - Too much functionality
  - Excessive permissions
  - Too much autonomy
- Common vulnerabilities:
  - Unnecessary plugin access
  - Retained development functionality
  - Poor input filtering
  - Excessive access permissions
- Prevention measures:
  - Limit plugins to minimum necessary functions
  - Avoid open-ended functions
  - Restrict plugin permissions
  - Track authorization scope

Lesson 5: Overreliance, Model Theft & Red Teaming

Overreliance Issues
- Occurs when:
  - Models present erroneous information as accurate
  - Organizations over-depend on models for tasks
- Key problems:
  - Hallucinations/confabulations
  - Factually incorrect outputs
  - Inappropriate or unsafe content
- Potential impacts:
  - Security misinformation
  - Legal issues
  - Reputational harm
  - Code vulnerabilities in AI-generated code
Model Theft Attacks
- Primary targets: Proprietary LLM models
- Attack methods:
  - Prompt manipulation for weight/parameter extraction
  - Infrastructure exploitation
  - Side channel attacks
  - Model extraction through API queries
- Impact:
  - Economic loss
  - Brand damage
  - Loss of competitive advantage
  - Unauthorized access to sensitive information
Red Teaming AI Models
- Definition: Simulating adversarial attacks in real-world scenarios
- Testing approaches:
  - Automated systems/tools
  - Manual human testing
  - Hybrid approaches
- Key objectives:
  - Enhanced security
  - Improved model performance
  - Regulatory compliance
  - Bias mitigation
Testing Tools & Frameworks
- Available tools:
  - HiddenLayer Model Scanner
  - Meta's Assessment Tools
  - Garak Vulnerability Scanner
  - LLMFuzzer
  - Prompt Security Fuzzer
- Testing datasets:
  - AttaQ (IBM dataset)
  - HarmBench evaluation framework
- Protection tools:
  - Prompt Firewall
  - LLM Guard
  - Robust Intelligence
  - HiddenLayer Detection

Lesson 6: Securing RAG Implementations

RAG Architecture & Orchestration
- Components of LLM Stack:
  - Data pipelines (data lakes, documents, network logs)
  - Indexing systems
  - Embedding models
  - Vector databases
- Orchestration libraries:
  - LangChain: Integration and data management
  - LlamaIndex: Data ingestion and indexing
  - LangGraph: Cyclical workflows and agent implementations
- Caching systems:
  - Redis
  - GPTCache
  - Reduces inference costs
  - Stores common queries
Embedding Model Security
- Selection considerations:
  - Performance benchmarks (MTEB)
  - Domain-specific requirements
  - Security features
- Security best practices:
  - Consider on-premise/private cloud deployment
  - Anonymize/pseudo-anonymize data
  - Implement strong access controls
  - Regular audits of data processing
Vector Database Security
- Key challenges:
  - Limited encryption support
  - Similarity search on encrypted data
- Security solutions:
  - Searchable/queryable encryption
  - MongoDB Vector Search capabilities
  - Homomorphic encryption (future potential)
  - Secure multi-party computation
- Common threats:
  - Unauthorized access
  - Insider threats
  - Malicious vector injection
  - Resource exhaustion
Monitoring & Incident Response
- AI incident types:
  - Sensitive information uploads
  - Model decay/drift
  - Performance degradation
- Response challenges:
  - Reproducibility issues
  - Complex root cause analysis
  - Remediation options
- Best practices:
  - Proactive monitoring
  - Continuous validation
  - AI-specific incident response plans
  - Regular system audits

Key Tools and Resources

Security Frameworks

OWASP Top 10 for LLMs: genai.owasp.org
MITRE ATLAS: atlas.mitre.org
NIST AI Security Guidelines

Testing Tools

Garak: LLM Security Testing
HarmBench: Evaluation Framework
AttaQ Dataset: Testing Scenarios

Reply

or to participate.