Prompt Injection Hardening for Tool Calls
Prompt injection is a system vulnerability, not a “prompting mistake.” Once an agent can call tools, untrusted text can attempt to override instructions, exfiltrate secrets, or trigger unsafe side effects. Hardening requires clear trust boundaries, strict tool schemas, and policy enforcement at execution time.
Threat Model
| Injection Vector | Example | Risk | |—|—|—| | Web content | malicious page tells agent to reveal secrets | data exfiltration | | Documents | embedded instructions in PDFs or emails | policy bypass | | Tool outputs | tool returns adversarial text | tool-chain takeover | | User input | user asks to override safety or routing | unsafe actions |
Premium Gaming TV65-Inch OLED Gaming PickLG 65-Inch Class OLED evo AI 4K C5 Series Smart TV (OLED65C5PUA, 2025)
LG 65-Inch Class OLED evo AI 4K C5 Series Smart TV (OLED65C5PUA, 2025)
A premium gaming-and-entertainment TV option for console pages, living-room gaming roundups, and OLED recommendation articles.
- 65-inch 4K OLED display
- Up to 144Hz refresh support
- Dolby Vision and Dolby Atmos
- Four HDMI 2.1 inputs
- G-Sync, FreeSync, and VRR support
Why it stands out
- Great gaming feature set
- Strong OLED picture quality
- Works well in premium console or PC-over-TV setups
Things to know
- Premium purchase
- Large-screen price moves often
Core Defenses
- Separate system instructions from untrusted content. Never concatenate blindly.
- Use a tool gateway: all tool calls pass through schema validation and policy checks.
- Apply least privilege: only enable tools required for the workflow.
- Strip or quarantine instructions from retrieved documents.
- Require confirmations for external side effects.
Tool Call Hardening Checklist
| Control | Implementation | Effect | |—|—|—| | Schema validation | JSON schema + strict parsing | blocks malformed calls | | Allowlist | tool + method allowlist | limits blast radius | | Rate limits | per user and per workflow caps | prevents abuse storms | | Secrets isolation | never expose secrets to the model | prevents exfiltration | | Audit logs | log calls + reason codes | supports incident response |
Safe Browsing Pattern
When browsing, treat external text as hostile. Extract facts, not instructions. If your system needs to follow instructions, those instructions must come from trusted policy code, not from arbitrary pages.
- Parse web pages into neutral facts and citations.
- Ignore any directive language inside retrieved content.
- Keep tool execution behind an approval step for sensitive actions.
Related Reading
Navigation
- AI Topics
- AI Topics Index
- Glossary
- Infrastructure Shift Briefs
- Capability Reports
- Tool Stack Spotlights
Nearby Topics
- Threat Modeling for AI Systems
- Tool Selection Policies and Routing Logic
- Sandbox Isolation and Execution Constraints
- Secure Retrieval With Permission-Aware Filtering
- Secure Prompt and Policy Version Control
Appendix: Implementation Blueprint
A reliable implementation starts by versioning every moving part, instrumenting it end-to- end, and defining rollback criteria. From there, tighten enforcement points: schema validation, policy checks, and permission-aware retrieval. Finally, measure outcomes and feed the results back into regression suites. The infrastructure shift is real, but it still follows operational fundamentals: observability, ownership, and reversible change.
| Step | Output | |—|—| | Define boundary | inputs, outputs, success criteria | | Version | prompt/policy/tool/index versions | | Instrument | traces + metrics + logs | | Validate | schemas + guard checks | | Release | canary + rollback | | Operate | alerts + runbooks |
Implementation Notes
In production, the best practices in this topic become constraints that you can enforce and measure. That means versioning, observability, and testable rules. When you cannot measure a guardrail, it becomes opinion. When you cannot rollback a change, it becomes fear. The system becomes stable when constraints are explicit.
| Operational Question | Artifact That Answers It | |—|—| | What changed | version ledger and changelog | | Did quality regress | regression suite report | | Where did time go | stage timing traces | | Why did cost rise | token and cache dashboards | | Can we stop it | kill switch and routing policy |
A reliable practice is to attach a small number of “reason codes” to every enforcement decision. When a tool call is blocked, record the reason code. When a degraded mode is activated, record the reason code. This turns operational history into data you can improve.
Implementation Notes
In production, the best practices in this topic become constraints that you can enforce and measure. That means versioning, observability, and testable rules. When you cannot measure a guardrail, it becomes opinion. When you cannot rollback a change, it becomes fear. The system becomes stable when constraints are explicit.
| Operational Question | Artifact That Answers It | |—|—| | What changed | version ledger and changelog | | Did quality regress | regression suite report | | Where did time go | stage timing traces | | Why did cost rise | token and cache dashboards | | Can we stop it | kill switch and routing policy |
A reliable practice is to attach a small number of “reason codes” to every enforcement decision. When a tool call is blocked, record the reason code. When a degraded mode is activated, record the reason code. This turns operational history into data you can improve.
Implementation Notes
In production, the best practices in this topic become constraints that you can enforce and measure. That means versioning, observability, and testable rules. When you cannot measure a guardrail, it becomes opinion. When you cannot rollback a change, it becomes fear. The system becomes stable when constraints are explicit.
| Operational Question | Artifact That Answers It | |—|—| | What changed | version ledger and changelog | | Did quality regress | regression suite report | | Where did time go | stage timing traces | | Why did cost rise | token and cache dashboards | | Can we stop it | kill switch and routing policy |
A reliable practice is to attach a small number of “reason codes” to every enforcement decision. When a tool call is blocked, record the reason code. When a degraded mode is activated, record the reason code. This turns operational history into data you can improve.
Implementation Notes
In production, the best practices in this topic become constraints that you can enforce and measure. That means versioning, observability, and testable rules. When you cannot measure a guardrail, it becomes opinion. When you cannot rollback a change, it becomes fear. The system becomes stable when constraints are explicit.
| Operational Question | Artifact That Answers It | |—|—| | What changed | version ledger and changelog | | Did quality regress | regression suite report | | Where did time go | stage timing traces | | Why did cost rise | token and cache dashboards | | Can we stop it | kill switch and routing policy |
A reliable practice is to attach a small number of “reason codes” to every enforcement decision. When a tool call is blocked, record the reason code. When a degraded mode is activated, record the reason code. This turns operational history into data you can improve.
Implementation Notes
In production, the best practices in this topic become constraints that you can enforce and measure. That means versioning, observability, and testable rules. When you cannot measure a guardrail, it becomes opinion. When you cannot rollback a change, it becomes fear. The system becomes stable when constraints are explicit.
| Operational Question | Artifact That Answers It | |—|—| | What changed | version ledger and changelog | | Did quality regress | regression suite report | | Where did time go | stage timing traces | | Why did cost rise | token and cache dashboards | | Can we stop it | kill switch and routing policy |
A reliable practice is to attach a small number of “reason codes” to every enforcement decision. When a tool call is blocked, record the reason code. When a degraded mode is activated, record the reason code. This turns operational history into data you can improve.
Implementation Notes
In production, the best practices in this topic become constraints that you can enforce and measure. That means versioning, observability, and testable rules. When you cannot measure a guardrail, it becomes opinion. When you cannot rollback a change, it becomes fear. The system becomes stable when constraints are explicit.
| Operational Question | Artifact That Answers It | |—|—| | What changed | version ledger and changelog | | Did quality regress | regression suite report | | Where did time go | stage timing traces | | Why did cost rise | token and cache dashboards | | Can we stop it | kill switch and routing policy |
A reliable practice is to attach a small number of “reason codes” to every enforcement decision. When a tool call is blocked, record the reason code. When a degraded mode is activated, record the reason code. This turns operational history into data you can improve.
