OpenAI ships Codex macOS app with multi-agent orchestration, doubles rate limits across tiers

OpenAI's new Codex desktop app addresses the orchestration problem in agentic development—managing multiple AI agents working simultaneously on different codebases. The company is temporarily opening Codex to free tier users and doubling rate limits for paid tiers, a direct response to competitive pressure from Anthropic's Claude.

TheBiggish Editorial · Monday, February 2, 2026

OpenAI today launched its Codex macOS app, marking a shift from single-agent coding assistance to multi-agent orchestration. The desktop application lets developers manage multiple AI agents working in parallel across isolated git worktrees—addressing what the company identifies as the core challenge in agentic development: supervision at scale.

What's actually new

The app's differentiator is architectural. Unlike IDE extensions that handle one task at a time, Codex organizes agents by project threads, allowing developers to context-switch between simultaneous workstreams. Each agent operates in an isolated copy of the codebase via worktrees, preventing conflicts when exploring different implementation paths.
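To make the worktree isolation concrete, here is a minimal sketch using the standard `git worktree add` command from Python. It is not OpenAI's implementation. The helper names, the `agent/<task>` branch naming, and the directory layout are all assumptions made for illustration, and the sketch assumes `git` is installed and on the PATH.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def create_agent_worktree(repo: Path, task: str) -> Path:
    """Check out a new branch for `task` in its own directory next to the repo.

    The `agent/<task>` branch name is a hypothetical convention, not Codex's.
    """
    worktree = repo.parent / f"{repo.name}-{task}"
    git(repo, "worktree", "add", "-b", f"agent/{task}", str(worktree))
    return worktree


def demo() -> dict[str, str]:
    """Build a throwaway repo, give two hypothetical agents a worktree each,
    and show that an edit in one is invisible to the other."""
    repo = Path(tempfile.mkdtemp()) / "project"
    repo.mkdir()
    git(repo, "init", "-q")
    (repo / "app.py").write_text("print('v1')\n")
    git(repo, "add", "app.py")
    git(repo, "-c", "user.name=demo", "-c", "user.email=demo@example.com",
        "-c", "commit.gpgsign=false", "commit", "-q", "-m", "initial commit")

    refactor = create_agent_worktree(repo, "refactor")
    bugfix = create_agent_worktree(repo, "bugfix")

    # The "refactor" agent edits its copy; the "bugfix" copy is untouched.
    (refactor / "app.py").write_text("print('v2')\n")
    return {
        "refactor": (refactor / "app.py").read_text(),
        "bugfix": (bugfix / "app.py").read_text(),
    }


if __name__ == "__main__":
    print(demo())
```

Because each worktree has its own working directory and branch but shares one object store, two agents can try competing implementations without overwriting each other's files. Merging the results back remains an ordinary git operation.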

Background automations run on schedules with queued results—OpenAI uses this internally for CI failure analysis and release briefs. The "skills" framework bundles instructions and scripts for reliable integration with external tools like Figma and Linear, moving beyond code generation into workflow orchestration.
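The schedule-plus-queue pattern described above can be sketched in a few lines. This is an illustration of the general idea only, not the Codex app's design: the `Automation` and `AutomationRunner` classes, their fields, and the example task names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Automation:
    """A recurring background task (hypothetical structure)."""
    name: str
    interval_s: float
    task: Callable[[], str]
    next_run: float = 0.0  # due immediately on the first tick


@dataclass
class AutomationRunner:
    """Runs due automations and queues their results for later review."""
    automations: list = field(default_factory=list)
    review_queue: deque = field(default_factory=deque)

    def add(self, automation: Automation) -> None:
        self.automations.append(automation)

    def tick(self, now: float) -> None:
        """Run every automation whose next_run time has arrived."""
        for automation in self.automations:
            if now >= automation.next_run:
                # Results are queued rather than acted on, so a human
                # can review them when convenient.
                self.review_queue.append((automation.name, automation.task()))
                automation.next_run = now + automation.interval_s


runner = AutomationRunner()
runner.add(Automation("ci-failure-analysis", 60, lambda: "2 failing jobs"))
runner.add(Automation("release-brief", 3600, lambda: "brief drafted"))

# Time is passed in explicitly to keep the sketch deterministic.
for now in (0, 30, 60):
    runner.tick(now)
```

After the three ticks, the review queue holds one result from each automation at time 0 and a second CI analysis at time 60. Nothing runs at time 30 because neither interval has elapsed.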

OpenAI demonstrated the app with a racing game built from a single prompt using 7 million tokens. Codex handled design, development, and QA autonomously, testing the game by playing it.

The competitive context

This launch is explicitly about catching up to Anthropic's Claude, which has gained traction in agentic development. OpenAI is temporarily opening Codex to ChatGPT Free and Go users (previously paid-only) and doubling rate limits across all tiers. CEO Sam Altman acknowledged that despite GPT-5.2-Codex being "the strongest model by far," interface complexity deterred adoption.

The timing matters: the app arrives less than two months after GPT-5.2-Codex launched in December. Competitors like Anysphere's Cursor are already demonstrating parallel agents building functional web browsers.

What to watch

The success metric isn't whether agents can write code; that has been demonstrated. It's whether enterprises can reliably deploy autonomous agents on production codebases. OpenAI's internal use cases (issue triage, summarization) are bounded tasks. Debugging multi-agent systems at scale introduces failure modes that existing DevOps tooling wasn't designed to handle.

The skills framework also creates vendor lock-in risk. Deep integration with OpenAI's proprietary agent orchestration may limit portability if teams need to switch providers.

The app integrates with existing Codex CLI and IDE extensions, maintaining continuity across tools. Rate limit increases apply everywhere Codex runs—app, CLI, IDE, and cloud.

Notably absent: independent benchmarks comparing GPT-5.2-Codex to Anthropic's current Claude models. Until those exist, the claimed superiority remains unverified.
