Vercel research: Static context beats dynamic skills for AI agent accuracy

Vercel's AI SDK team found that embedding documentation in AGENTS.md files achieved 100% pass rates versus 53% for tool-based skills in coding agents. The finding challenges the common enterprise pattern of building complex skill systems to prevent hallucinations.

The Pattern

Enterprise teams building custom AI agents face a recurring problem: hallucinations persist despite adding more tools, skills, and retrieval systems. An agent insists on deprecated Next.js syntax. Another misreads a £716 visa fee as £70,000. The instinct is to add more guardrails.

Vercel's AI SDK team tested a different approach. Instead of building skills that agents invoke on-demand, they embedded knowledge directly into static context files. The results were stark.
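
Concretely, static context here just means a markdown file the agent loads on every run. A minimal sketch of what such a file might contain (illustrative only, not Vercel's actual file):

```markdown
# AGENTS.md

## Framework rules
- This project uses Next.js 16 with the App Router.
- Route handlers live in `app/**/route.ts`; never use `pages/api/`.
- Dynamic route `params` are async - always `await` them.

## Project conventions
- Use the `@/` import alias for everything under `src/`.
- Run `pnpm test` before proposing any commit.
```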

The Data

In evaluations using Next.js 16 documentation:

  • Dynamic skills (tool-based lookups): 53% pass rate
  • Static context (AGENTS.md files): 100% pass rate

The failure mode was predictable: agents forgot to invoke tools or invoked them incorrectly. Static context eliminates the decision entirely - the information is always present.
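
A rough sketch of the two setups using the Vercel AI SDK makes the difference concrete. This is not Vercel's eval harness; it assumes AI SDK v4-style tool definitions (v5 renames `parameters` to `inputSchema`), and `searchDocs`, the model choice, and `maxSteps` value are placeholders:

```ts
import { readFileSync } from 'node:fs';
import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Placeholder for a documentation search backend.
const searchDocs = async (topic: string) => `...docs for ${topic}...`;

const prompt = 'Add a route handler using the current Next.js API.';

// Dynamic skill: correctness depends on the model *choosing* to call
// lookupDocs before answering. If it skips the call, it falls back to
// whatever (possibly stale) syntax is in its training data.
const viaSkill = await generateText({
  model: openai('gpt-4o'),
  prompt,
  maxSteps: 4, // allow a tool call plus a final answer
  tools: {
    lookupDocs: tool({
      description: 'Look up current Next.js 16 documentation for a topic',
      parameters: z.object({ topic: z.string() }),
      execute: async ({ topic }) => searchDocs(topic),
    }),
  },
});

// Static context: the documentation rides along in the system prompt on
// every turn, so there is no invocation decision to get wrong.
const viaStaticContext = await generateText({
  model: openai('gpt-4o'),
  system: readFileSync('AGENTS.md', 'utf8'),
  prompt,
});

console.log(viaSkill.text, viaStaticContext.text);
```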

What This Means

The finding aligns with VS Code's December 2025 Agent Skills integration and the October 2025 community discussions comparing Skills.md and AGENTS.md approaches. The trade-off is clear:

AGENTS.md strengths: No invocation failures. Consistent across conversation turns. Better for codebase-wide rules and project structure.

Skills strengths: Reusable across projects. Better for large, specialized knowledge sets where context-window costs matter - you don't want to inline 50 capabilities into every prompt.

Vercel's approach, which it calls "Hands vs. Brains", puts knowledge in markdown (AGENTS.md, docs/) and reserves skills for execution-only tasks (API calls, terminal commands).
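
An execution-only skill keeps the "hands" side narrow. A hedged sketch, again using AI SDK-style tool definitions; the npm invocation and filter flag are illustrative, and guidance about when to run tests would live in AGENTS.md rather than in the tool:

```ts
import { tool } from 'ai';
import { z } from 'zod';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const exec = promisify(execFile);

// "Hands": an action the model cannot take from context alone.
export const runTests = tool({
  description: 'Run the project test suite and return its output',
  parameters: z.object({
    filter: z.string().optional().describe('Optional test name filter'),
  }),
  execute: async ({ filter }) => {
    const args = ['test', ...(filter ? ['--', '-t', filter] : [])];
    try {
      const { stdout } = await exec('npm', args);
      return stdout;
    } catch (err: any) {
      // A failing suite is signal for the agent, not an exception.
      return `${err.stdout ?? ''}${err.stderr ?? ''}`;
    }
  },
});
```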

The Enterprise Angle

For teams building internal agents, this suggests auditing what's implemented as skills. If it's knowledge the agent should always have - coding standards, architecture decisions, approved vendors - it probably belongs in static context.

The risk: over-indexing on either approach. Builder.io's framework treats them as complementary: rules for invariants, skills for task-specific playbooks, commands for workflows. The pattern scales as the number of capabilities grows.

Worth noting: This applies to deterministic knowledge. For dynamic data or third-party APIs, skills and RAG remain necessary. The question is whether you're building a skill because the agent needs to do something, or because you're trying to make it remember something.

The research is specific to coding agents, but the principle - reducing decision points for AI systems - generalizes. Every "should I check this?" moment is a failure point.