This guide covers integrating Tempered as a governance gate in autonomous AI agent workflows.
Autonomous AI agents (Claude Code, GPT agents, custom automation) make decisions and take actions without direct human oversight. A governance gate inserts a risk evaluation between the decision and its execution, so that risky actions are caught before they run. The workflow looks like this:
```
Agent decides to take action
        ↓
Submit evaluation to Tempered (POST /api/v1/evaluations/)
        ↓
Poll for result (GET /api/v1/evaluations/{id}/)
  — or receive via webhook
        ↓
Check verdict:
  PROCEED                   → execute the action
  PROCEED_WITH_MITIGATIONS  → implement mitigations, then execute
  REVIEW_REQUIRED           → pause and notify a human
```
Create a dedicated API key for your agent with the `eval` scope:
```bash
curl -X POST https://your-tempered-instance/api/v1/org/api-keys/ \
  -H "Authorization: Bearer prx_your_admin_key" \
  -H "Content-Type: application/json" \
  -d '{"name": "infrastructure-agent", "scope": "eval"}'
```
Include structured context that identifies the agent and its authorisation:
```python
import httpx
import time

TEMPERED_URL = "https://your-tempered-instance"
TEMPERED_KEY = "prx_your_agent_key"


def evaluate_action(description: str, context: dict) -> dict:
    """Submit an evaluation and wait for the result."""
    client = httpx.Client(
        base_url=TEMPERED_URL,
        headers={"Authorization": f"Bearer {TEMPERED_KEY}"},
        timeout=120,
    )

    # Submit
    response = client.post("/api/v1/evaluations/", json={
        "description": description,
        "context": {
            "agent_identity": "infrastructure-agent-01",
            "authorisation_scope": "docker compose operations, config changes",
            **context,
        },
        "idempotency_key": f"agent-{context.get('task_id', 'unknown')}-{int(time.time())}",
    })
    response.raise_for_status()
    eval_id = response.json()["id"]

    # Poll for result
    for _ in range(60):
        time.sleep(2)
        result = client.get(f"/api/v1/evaluations/{eval_id}/")
        result.raise_for_status()
        data = result.json()
        if data["status"] in ("completed", "failed"):
            return data

    raise TimeoutError(f"Evaluation {eval_id} did not complete in 120 seconds")


def gate_action(description: str, context: dict) -> bool:
    """Returns True if the action is approved."""
    result = evaluate_action(description, context)

    if result["status"] == "failed":
        print(f"Evaluation failed: {result.get('error', 'unknown')}")
        return False

    recommendation = result["recommendation"]

    if recommendation == "PROCEED":
        return True

    if recommendation == "PROCEED_WITH_MITIGATIONS":
        print(f"Approved with conditions: {result.get('conditions', [])}")
        # Agent should implement mitigations before proceeding
        return True

    if recommendation == "REVIEW_REQUIRED":
        print(f"Human review required. Rationale: {result.get('rationale', '')}")
        # Notify human, pause the workflow
        return False

    return False
```
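To show how the gate slots into an agent's action loop, here is a self-contained sketch. `gate_action` is stubbed out so the example runs without a Tempered instance, and the task id, action description, and `restart_stack` helper are all hypothetical:

```python
def gate_action(description: str, context: dict) -> bool:
    """Stub standing in for the real gate_action defined above.

    In a real agent this calls Tempered; here we approve anything
    marked fully reversible, purely for demonstration.
    """
    return context.get("reversibility") == "full"


def restart_stack() -> str:
    """Placeholder for the agent's actual action."""
    return "restarted"


context = {
    "task_id": "task-12345",  # hypothetical task id
    "reversibility": "full",
    "blast_radius": "billing service only",
}

if gate_action("Restart the billing docker compose stack", context):
    outcome = restart_stack()
else:
    outcome = "paused for human review"
print(outcome)
```

The pattern to preserve is that the agent never calls the action directly; every execution path goes through the gate first, and the default on any unapproved verdict is to do nothing.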
The context block supports these fields:

```json
{
  "agent_identity": "name-of-the-agent",
  "authorisation_scope": "what the agent is allowed to do",
  "task_context": "what specific task is being performed",
  "triggering_event": "what triggered this action",
  "reversibility": "full | partial | none",
  "blast_radius": "what systems/users are affected",
  "human_approved": true,
  "approval_reference": "CHG-1234 or ticket ID"
}
```
Agents should always include an `idempotency_key` to prevent duplicate evaluations on retry:
```json
{
  "description": "...",
  "idempotency_key": "agent-task-12345-1709942400"
}
```
If the same key is submitted twice, Tempered returns the existing evaluation result instead of creating a new one.
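Note that a key containing the current timestamp (as in the client example earlier) changes on every attempt, so a retry after a transient failure would create a fresh evaluation rather than hitting the dedupe path. One way to get true deduplication is to derive the key from stable task attributes instead; a sketch (the `agent:task:action` composition is this example's own convention):

```python
import hashlib


def idempotency_key(agent: str, task_id: str, action: str) -> str:
    """Derive a stable idempotency key so retries of the same action dedupe.

    Same (agent, task, action) always yields the same key; any change
    to the action produces a new key and hence a new evaluation.
    """
    digest = hashlib.sha256(f"{agent}:{task_id}:{action}".encode()).hexdigest()[:16]
    return f"agent-{task_id}-{digest}"
```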
Use `change_request_id` and `plan_id` to link evaluations to your change management system:
```json
{
  "description": "...",
  "change_request_id": "CHG-2024-0456",
  "plan_id": "PLN-20260309-001"
}
```
For agents that can't poll, set up a webhook to receive results asynchronously. See the Webhook Integration guide for details.
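On the receiving side, the webhook handler only needs to dispatch on the delivered verdict. A minimal sketch, assuming the webhook payload mirrors the polled evaluation body (`status`, `recommendation`, `conditions`, `rationale`) — the return values here are this example's own labels for the agent's next step:

```python
def handle_webhook(payload: dict) -> str:
    """Map a delivered evaluation result to the agent's next step."""
    if payload.get("status") == "failed":
        return "abort"

    recommendation = payload.get("recommendation")
    if recommendation == "PROCEED":
        return "execute"
    if recommendation == "PROCEED_WITH_MITIGATIONS":
        # Apply payload.get("conditions", []) before executing.
        return "mitigate-then-execute"

    # REVIEW_REQUIRED, or anything unrecognised: fail closed.
    return "notify-human"
```

Mount this behind whatever HTTP endpoint your webhook configuration points at; the important property is that unknown or missing verdicts fail closed.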
| Error | Cause | Agent Action |
|---|---|---|
| `429 Too Many Requests` | Rate limit exceeded | Back off and retry after the `Retry-After` header |
| `500 Internal Server Error` | Server error | Retry with exponential backoff (max 3 attempts) |
| Timeout | Evaluation took too long | Retry once, then fall back to a conservative default |
| `QUORUM_FAILED` | Not enough vendors responded | Retry, or proceed with caution if the action is low-risk |