Overview
In production, most value comes from agent assist: classify, extract, summarize, recommend, and draft with a person in control. Autonomous agents add planning and tool use to take actions. Because LLMs are probabilistic, autonomy requires strict boundaries: limited tools, explicit approvals, logging, and a fast rollback path.
Definitions
Agent assist
- LLM produces recommendations, summaries, extractions, or drafts
- Human approves/edits before actions affect systems or customers
- Great fit for service desks, quality checks, case notes, email drafts
Autonomous agent
- LLM plans tasks, calls tools/APIs, and executes steps
- Policy and approvals gate high-impact actions
- Good fit for repetitive, bounded operations with clear outcomes
When autonomy fails
- Ambiguous goals; sparse or volatile data
- Open-ended browsing or tools without guardrails
- No owner, no kill switch, no audit trail
Decision framework
Impact
- Customer-visible? Financial or safety risk? → keep assist or require approval
- Back-office, reversible steps → autonomy possible with limits
Clarity
- Stable goal and termination conditions
- Deterministic tools with validation
- Grounded context (RAG from approved sources)
Control
- Policy and RBAC in front of tools
- Approvals for thresholds; immutable logs
- Kill switch and rollback
Architecture patterns
Assist patterns
- Suggest-and-approve: LLM proposes; user edits/approves; system executes
- Summarize-and-cite: RAG with citations; blocked without sources
- Extract-and-validate: structured JSON with schema checks
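A minimal sketch of extract-and-validate in Python, using the jsonschema package; call_llm and the case fields are illustrative assumptions, not a specific vendor API.

```python
import json

from jsonschema import ValidationError, validate  # pip install jsonschema

# Contract the draft must satisfy before any downstream system sees it.
CASE_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "outage", "access", "other"]},
        "summary": {"type": "string", "maxLength": 500},
        "priority": {"type": "integer", "minimum": 1, "maximum": 4},
    },
    "required": ["category", "summary", "priority"],
    "additionalProperties": False,
}


def extract_case(raw_text: str, call_llm) -> dict:
    """call_llm is a placeholder for whatever client you use; it must return a JSON string."""
    response = call_llm(
        "Extract category, summary, and priority (1-4) as JSON only:\n" + raw_text
    )
    try:
        data = json.loads(response)
        validate(instance=data, schema=CASE_SCHEMA)
    except (json.JSONDecodeError, ValidationError) as err:
        # Fail closed: route to a human instead of executing on malformed output.
        raise ValueError(f"Extraction rejected, needs human review: {err}") from err
    return data
```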
Autonomy patterns
- Plan-and-execute: planner decomposes task; executor calls tools
- Toolformer-style calls: LLM emits function calls; platform enforces allowlists/quotas
- Supervisor agent: meta-agent gates high-risk steps and seeks approval
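A minimal sketch of plan-and-execute with an allowlist and a supervisor gate; the tool names, risk tiers, and approve() callback are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    fn: Callable[..., str]
    high_risk: bool = False  # high-risk tools always pass through the supervisor gate


# Allowlist: the only tools the executor may call.
TOOLS: dict[str, Tool] = {
    "lookup_order": Tool(fn=lambda order_id: f"status for {order_id}"),
    "issue_refund": Tool(fn=lambda order_id, amount: f"refunded {amount} on {order_id}",
                         high_risk=True),
}


def execute_plan(steps: list[dict], approve: Callable[[dict], bool]) -> list[str]:
    """steps come from the planner as {"tool": name, "args": {...}};
    approve() is the supervisor gate (human or policy service) for high-risk calls."""
    results = []
    for step in steps:
        tool = TOOLS.get(step["tool"])
        if tool is None:
            raise PermissionError(f"tool not on allowlist: {step['tool']}")
        if tool.high_risk and not approve(step):
            results.append(f"skipped (approval denied): {step['tool']}")
            continue
        results.append(tool.fn(**step["args"]))
    return results
```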
State & memory
- Short-term memory per task; avoid long-term user data unless policy allows
- Log prompts, tool calls, results, approvals, overrides
- Do not write back to ground-truth systems of record without validation
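One way to make those logs tamper-evident is to hash-chain each record; the field names below are assumptions, not a standard schema.

```python
import hashlib
import json
import time


def audit_record(kind: str, payload: dict, prev_hash: str = "") -> dict:
    """Append-only audit entry: kind is one of prompt | tool_call | result | approval | override.
    Chaining each hash to the previous record makes silent edits detectable."""
    body = {
        "ts": time.time(),
        "kind": kind,
        "payload": payload,
        "prev_hash": prev_hash,
    }
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return body
```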
Tools & permissions
Tool model
- Allowlisted tools with typed schemas and explicit scopes
- Quotas, rate limits, and dry-run for risky operations
- Test doubles for non-prod; canary releases in prod
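A sketch of a tool gateway enforcing allowlisting, scopes, a per-minute quota, and dry-run, assuming tools are registered as plain callables; adapt the registry shape to your platform.

```python
import time
from collections import defaultdict


class ToolGateway:
    """Checks allowlist, scope, and quota before any tool runs; supports dry-run."""

    def __init__(self, registry: dict, quota_per_minute: int = 30):
        self.registry = registry  # name -> {"fn": callable, "scope": "orders:read", ...}
        self.quota = quota_per_minute
        self.calls: dict[str, list[float]] = defaultdict(list)

    def call(self, name: str, user_scopes: set[str], dry_run: bool = False, **kwargs):
        entry = self.registry.get(name)
        if entry is None:
            raise PermissionError(f"{name} is not allowlisted")
        if entry["scope"] not in user_scopes:
            raise PermissionError(f"missing scope {entry['scope']} for {name}")
        now = time.time()
        recent = [t for t in self.calls[name] if now - t < 60]
        if len(recent) >= self.quota:
            raise RuntimeError(f"quota exceeded for {name}")
        self.calls[name] = recent + [now]
        if dry_run:
            # Risky operations can be previewed without side effects.
            return {"dry_run": True, "tool": name, "args": kwargs}
        return entry["fn"](**kwargs)
```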
Data boundaries
- RAG only from approved corpora; redact PII and secrets
- Context access = user’s access; propagate RBAC/OIDC claims
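A sketch of propagating the caller's access into retrieval, assuming each indexed chunk carries an allowed_groups field copied from the source system's ACLs; search_fn stands in for your vector or keyword search.

```python
def retrieve_for_user(query: str, user_groups: set[str], search_fn) -> list[dict]:
    """Return only the hits the calling user could open directly."""
    hits = search_fn(query)
    # Context access = user's access: drop anything outside the caller's groups.
    return [h for h in hits if user_groups & set(h.get("allowed_groups", []))]
```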
Guardrails & policy
Policy
Document allowed use cases, prohibited inputs, escalation paths, and human oversight rules.
Filters
Input/output filters (PII, secrets, toxicity); JSON schema validation; jailbreak detection.
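An illustrative input/output redaction pass with regular expressions; the patterns below are examples only and no substitute for a vetted PII/secrets scanner.

```python
import re

# Example patterns only; production filters should use a dedicated scanning service.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b(?:\+1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b"),
}


def redact(text: str) -> str:
    """Run on both the prompt going in and the draft coming out."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text
```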
Approvals
Threshold-based gates for money, access changes, or customer contact; signatures stored with rationale.
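A sketch of threshold-based approval gating; the fields, thresholds, and roles are placeholders for your own risk policy.

```python
# Placeholder thresholds and roles; real values belong in your risk policy.
APPROVAL_RULES = [
    {"field": "refund_amount", "over": 50.0, "approver_role": "team_lead"},
    {"field": "refund_amount", "over": 500.0, "approver_role": "finance"},
]


def required_approvals(action: dict) -> list[str]:
    """Return the roles that must sign off before the action may execute."""
    return [
        rule["approver_role"]
        for rule in APPROVAL_RULES
        if action.get(rule["field"], 0) > rule["over"]
    ]


def record_approval(action: dict, approver: str, rationale: str) -> dict:
    # Store the signature with its rationale so audits can reconstruct the decision.
    return {"action": action, "approver": approver, "rationale": rationale}
```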
Human-in-the-loop
Design
- Confidence × impact grid for auto vs. review (see the routing sketch after this list)
- Explain sources; show diffs; one-click edits
- Collect feedback to retrain prompts/models
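A sketch of the confidence × impact routing above; the thresholds are illustrative, not recommendations.

```python
def route(confidence: float, impact: str) -> str:
    """Map a model suggestion to auto-apply, human review, or block."""
    if impact == "high":               # customer-visible, financial, or safety-relevant
        return "human_review"
    if impact == "low" and confidence >= 0.9:
        return "auto_apply"
    if confidence >= 0.7:
        return "human_review"
    return "block_and_escalate"
```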
Approval UX
- Summaries with citations; red flags surfaced and blocking where policy requires
- Clear “why” for recommendations and actions
Evaluation & monitoring
Offline
- Task accuracy, groundedness/faithfulness
- Adversarial prompts; tool misuse tests
Online
- Override rate, approval latency, safety flags
- Tool error rates, API timeouts, retries
Drift & regression
- Golden sets re-run after model/prompt/data changes (see the check below)
- Canary/ring deploys; rollback plan
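A sketch of a golden-set gate for deploys; run_model stands in for whatever wrapper calls the current prompt/model version, and the pass-rate threshold is an assumption.

```python
def golden_set_pass_rate(golden: list[dict], run_model) -> float:
    """golden items look like {"input": ..., "expected": ...}; run_model wraps the
    current prompt/model version. Gate deploys on this number."""
    if not golden:
        raise ValueError("golden set is empty")
    passed = sum(1 for case in golden if run_model(case["input"]) == case["expected"])
    return passed / len(golden)


# Example gate in CI (the 0.95 threshold is an assumption, not a standard):
# assert golden_set_pass_rate(golden_cases, run_model) >= 0.95, "regression: block rollout"
```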
Operations, cost & SRE
Run rules
- SLIs/SLOs (latency, success rate, hallucination flags)
- Rate limits, quotas, and token budgets per team (example below)
- Playbooks for timeouts, degraded modes, kill switch
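A sketch of per-team quotas and token budgets expressed in code; the numbers are placeholders for your own FinOps policy.

```python
from dataclasses import dataclass


@dataclass
class TeamBudget:
    requests_per_minute: int
    tokens_per_day: int
    monthly_usd_cap: float


# Placeholder numbers; real limits belong in your FinOps policy.
BUDGETS = {
    "support": TeamBudget(requests_per_minute=60, tokens_per_day=2_000_000, monthly_usd_cap=1500.0),
    "back_office": TeamBudget(requests_per_minute=20, tokens_per_day=500_000, monthly_usd_cap=400.0),
}


def within_budget(team: str, tokens_today: int, spend_this_month: float) -> bool:
    b = BUDGETS[team]
    return tokens_today < b.tokens_per_day and spend_this_month < b.monthly_usd_cap
```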
Costs
- Token/compute cost, embedding/caching, vector store I/O
- Human review time; retraining cycles
90-day starter
Days 0–30
- Pick one assist use case (classify/route or summarize)
- Publish policy; define tools and scopes
- Set approval thresholds and logging
Days 31–60
- Add RAG from approved sources; schema-check outputs
- Instrument override rate and safety flags
- Trial a single autonomous action behind approval
Days 61–90
- Canary rollout; track value and risk deltas
- Decide assist-only vs. limited autonomy; finalize SLOs
References
- NIST AI Risk Management Framework — nist.gov
- OWASP Top 10 for LLM Applications — owasp.org
- Model / System Cards (transparency) — Google · Meta
- OpenAPI / OAuth 2.0 / OIDC (tool auth & contracts) — openapis.org · RFC 6749 · openid.net
Assist by default. Autonomy by exception. Guardrails always.
If you want a guardrails checklist and an agent tool-scope template, ask for a copy.