Model routing

Benchmarks you can apply.

SwarmMarshal treats model choice as a lifecycle: establish a frontier baseline, capture expensive functions, test cheaper challengers, then publish the winners as resettable recommendations.

Workflow

Start strong. Optimize where spend proves it matters.

  1. Frontier baseline: New functions start with strong models so SwarmMarshal captures high-quality champion outputs.
  2. Cost watch: The app ranks functions by recent spend and starts bounded prompt/response capture only where optimization is worth it.
  3. Replay challengers: Captured prompts are replayed against cheaper models and judged against the champion response.
  4. Publish winners: Benchmark-backed winners update the recommendation preset below; users can apply that preset inside Model Preferences.

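
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration, not SwarmMarshal's implementation: the names `FunctionSpend`, `capture_candidates`, and `challenger_wins`, along with the spend floor, the top-N limit, and the score threshold, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionSpend:
    """Hypothetical record of what one function cost recently."""
    name: str
    recent_spend_usd: float


def capture_candidates(spend, min_spend_usd=5.0, top_n=3):
    """Rank functions by recent spend; keep at most top_n above a spend floor."""
    ranked = sorted(spend, key=lambda f: f.recent_spend_usd, reverse=True)
    return [f.name for f in ranked if f.recent_spend_usd >= min_spend_usd][:top_n]


def challenger_wins(judge_scores, champion_cost, challenger_cost, min_mean_score=0.9):
    """A challenger wins only if it is cheaper than the champion and its
    mean judged score (assumed to be 0..1 per replayed prompt) clears the bar."""
    if not judge_scores or challenger_cost >= champion_cost:
        return False
    return sum(judge_scores) / len(judge_scores) >= min_mean_score


spend = [
    FunctionSpend("summarize", 42.0),
    FunctionSpend("classify", 3.5),
    FunctionSpend("chat", 120.0),
]
print(capture_candidates(spend))
print(challenger_wins([0.95, 0.9, 0.92], champion_cost=10.0, challenger_cost=2.0))
```

The spend floor keeps capture bounded: functions that cost little are never captured, so optimization effort goes only where it can pay off.
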
Published presets

Current routing sets.

These presets are generated from the shared SwarmMarshal routing catalog used by the desktop app. The “recommended” set will become benchmark-backed as public A/B campaigns complete.

Frontier baseline

High-quality reset for establishing champion outputs before optimizing cost downward.

Version 2026.04 · Apr 28, 2026

Uses OpenAI GPT-5 for chat, analysis, classification, code, and vision work. Embeddings are pinned app-wide and are not part of routing presets.

SwarmMarshal recommended

Initial public recommendation set. It will be replaced with benchmark-backed winners as A/B campaigns finish.

Version 2026.04-initial · Apr 28, 2026

Conservative default: near-frontier models for routine work, full frontier for agent/tool-use, vision, and code paths.

Frontier baseline · 2026.04

| Function family | Preferred | Fallback | Vision | Rationale |
|---|---|---|---|---|
| Default | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Catch-all frontier baseline. |
| Chatbot | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | General conversational quality baseline. |
| Agent Chat | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Tool-use capable baseline for interactive agent turns. |
| Content Generation | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Drafting and generation champion baseline. |
| Summarization | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Reference summaries before testing cheaper models. |
| Classification | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | High-quality labels for later agreement checks. |
| Data Analysis | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Structured reasoning and analysis baseline. |
| Code Generation | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Code and diagnostic baseline. |
| Vision | OpenAI/gpt-5 | OpenAI/gpt-4o | Yes | Vision-capable frontier baseline. |
| Message Pipeline | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | High-value per-message enrichment baseline. |

SwarmMarshal recommended · 2026.04-initial

| Function family | Preferred | Fallback | Vision | Rationale |
|---|---|---|---|---|
| Default | OpenAI/gpt-5-mini | OpenAI/gpt-5 | No | Cost-aware default with frontier fallback. |
| Chatbot | OpenAI/gpt-5-mini | OpenAI/gpt-5 | No | Good quality/cost balance for routine assistant turns. |
| Agent Chat | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Keep tool-use and agent orchestration on a strong model. |
| Content Generation | OpenAI/gpt-5-mini | OpenAI/gpt-5 | No | Drafting starts cheaper, escalates if needed. |
| Summarization | OpenAI/gpt-5-mini | OpenAI/gpt-5 | No | Routine summaries are usually safe on mini. |
| Classification | OpenAI/gpt-5-mini | OpenAI/gpt-5 | No | Classification stays auditable against frontier outputs. |
| Data Analysis | OpenAI/gpt-5-mini | OpenAI/gpt-5 | No | Balanced analysis default. |
| Code Generation | OpenAI/gpt-5 | OpenAI/gpt-5-mini | No | Code and diagnostics keep the frontier champion. |
| Vision | OpenAI/gpt-5 | OpenAI/gpt-4o | Yes | Vision stays on frontier until benchmarked down. |
| Message Pipeline | OpenAI/gpt-5-mini | OpenAI/gpt-5 | No | High-volume pipeline starts cost-aware, with frontier fallback. |

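
As a rough illustration of how a preset row might be consumed, the sketch below resolves a model for a function family using three rows copied from the recommended set. The `Route` type, the `resolve` helper, the `preferred_available` flag, and the rule that unlisted families fall through to the Default row are assumptions made for this example; the page does not specify when the app actually switches to a fallback.

```python
from typing import NamedTuple


class Route(NamedTuple):
    """One preset row: preferred model, fallback model, vision flag."""
    preferred: str
    fallback: str
    vision: bool


# Three rows from the "SwarmMarshal recommended" set, 2026.04-initial.
RECOMMENDED = {
    "Default": Route("OpenAI/gpt-5-mini", "OpenAI/gpt-5", False),
    "Agent Chat": Route("OpenAI/gpt-5", "OpenAI/gpt-5-mini", False),
    "Vision": Route("OpenAI/gpt-5", "OpenAI/gpt-4o", True),
}


def resolve(preset, family, preferred_available=True):
    """Pick a model for a family.

    Assumption: families missing from the preset use the catch-all
    "Default" row, and the fallback is used only when the caller
    reports that the preferred model is unavailable.
    """
    route = preset.get(family, preset["Default"])
    return route.preferred if preferred_available else route.fallback


print(resolve(RECOMMENDED, "Agent Chat"))
print(resolve(RECOMMENDED, "Vision", preferred_available=False))
```
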
Maintenance

How this page stays current.

Benchmark campaigns update ModelRoutingPresetCatalog. The app reads that catalog for reset buttons, and this page renders the same catalog for public documentation. One catalog, two surfaces.

A/B campaign evidence → Versioned preset → Website render and In-app reset

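
The "one catalog, two surfaces" idea can be pictured with a small sketch. Only the catalog name `ModelRoutingPresetCatalog` comes from this page; its real shape is not documented here, so the `Preset` type, the `CATALOG` dictionary, and both functions below are hypothetical stand-ins. Each preset carries just its Default row to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    """Hypothetical versioned preset: family -> (preferred, fallback)."""
    version: str
    routes: dict


# Stand-in for the shared catalog; a single source of truth.
CATALOG = {
    "frontier": Preset("2026.04", {"Default": ("OpenAI/gpt-5", "OpenAI/gpt-5-mini")}),
    "recommended": Preset("2026.04-initial", {"Default": ("OpenAI/gpt-5-mini", "OpenAI/gpt-5")}),
}


def render_rows(preset_id):
    """Website surface: one display line per function family."""
    preset = CATALOG[preset_id]
    return [
        f"{family}: {preferred} (fallback {fallback})"
        for family, (preferred, fallback) in preset.routes.items()
    ]


def reset_preferences(preset_id):
    """App surface: fresh preferences built from the same preset."""
    preset = CATALOG[preset_id]
    return {family: preferred for family, (preferred, _) in preset.routes.items()}


print(render_rows("frontier"))
print(reset_preferences("recommended"))
```

Because both functions read the same `CATALOG` entry, a benchmark campaign that publishes a new preset version changes the page and the reset action together, with no second copy to drift out of sync.
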
Want the app defaults?

Open Model Preferences and apply a preset.

The desktop app includes “Frontier reset” and “Use published” actions that apply these routing sets.