Model routing

Benchmarks you can apply.

SwarmMarshal treats model choice as a lifecycle: establish a frontier baseline, capture expensive functions, test cheaper challengers, then publish the winners as resettable recommendations.

Workflow

Start strong. Optimize where spend proves it matters.

Frontier baseline: New functions start with strong models so SwarmMarshal captures high-quality champion outputs.
Cost watch: The app ranks functions by recent spend and starts bounded prompt/response capture only where optimization is worth it.
Replay challengers: Captured prompts are replayed against cheaper models, and each challenger response is judged against the champion response.
Publish winners: Benchmark-backed winners update the recommendation preset below; users can apply that preset inside Model Preferences.
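
The four steps above can be sketched in a few lines. This is a minimal illustration, not SwarmMarshal's implementation: the class names, the spend threshold, the capture limit, and the judge-score bar are all hypothetical, and the page does not say how the judge actually scores a challenger.

```python
from dataclasses import dataclass


@dataclass
class FunctionSpend:
    name: str
    recent_spend_usd: float  # hypothetical: spend over some recent window


@dataclass
class ChallengerResult:
    model: str
    cost_per_call_usd: float
    judge_score: float  # hypothetical: mean agreement with the champion, 0..1


def select_for_capture(functions, min_spend_usd=5.0, max_functions=3):
    """Cost watch: rank by recent spend and keep only functions worth optimizing."""
    ranked = sorted(functions, key=lambda f: f.recent_spend_usd, reverse=True)
    worth_it = [f.name for f in ranked if f.recent_spend_usd >= min_spend_usd]
    return worth_it[:max_functions]  # bounded capture


def pick_winner(champion, results, min_score=0.9):
    """Publish winners: cheapest challenger that clears the bar, else the champion."""
    passing = [r for r in results if r.judge_score >= min_score]
    if not passing:
        return champion
    return min(passing, key=lambda r: r.cost_per_call_usd).model
```

Under these assumptions, a function with negligible spend is never captured, and a challenger replaces the champion only when its judged quality clears the bar at a lower cost.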

Published presets

Current routing sets.

These presets are generated from the shared SwarmMarshal routing catalog used by the desktop app. The “recommended” set will become benchmark-backed as public A/B campaigns complete.

Frontier baseline

High-quality reset for establishing champion outputs before optimizing cost downward.

Version 2026.04 · Apr 28, 2026

Uses OpenAI GPT-5 for chat, analysis, classification, code, and vision work. Embeddings are pinned app-wide and are not part of routing presets.

SwarmMarshal recommended

Initial public recommendation set. It will be replaced with benchmark-backed winners as A/B campaigns finish.

Version 2026.04-initial · Apr 28, 2026

Conservative default: near-frontier models for routine work, full frontier for agent/tool-use, vision, and code paths.

Frontier baseline · 2026.04
Function family	Preferred	Fallback	Vision	Rationale
Default	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Catch-all frontier baseline.
Chatbot	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	General conversational quality baseline.
Agent Chat	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Tool-use capable baseline for interactive agent turns.
Content Generation	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Drafting and generation champion baseline.
Summarization	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Reference summaries before testing cheaper models.
Classification	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	High-quality labels for later agreement checks.
Data Analysis	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Structured reasoning and analysis baseline.
Code Generation	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Code and diagnostic baseline.
Vision	`OpenAI/gpt-5`	`OpenAI/gpt-4o`	Yes	Vision-capable frontier baseline.
Message Pipeline	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	High-value per-message enrichment baseline.

SwarmMarshal recommended · 2026.04-initial
Function family	Preferred	Fallback	Vision	Rationale
Default	`OpenAI/gpt-5-mini`	`OpenAI/gpt-5`	No	Cost-aware default with frontier fallback.
Chatbot	`OpenAI/gpt-5-mini`	`OpenAI/gpt-5`	No	Good quality/cost balance for routine assistant turns.
Agent Chat	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Keep tool-use and agent orchestration on a strong model.
Content Generation	`OpenAI/gpt-5-mini`	`OpenAI/gpt-5`	No	Drafting starts cheaper, escalates if needed.
Summarization	`OpenAI/gpt-5-mini`	`OpenAI/gpt-5`	No	Routine summaries are usually safe on mini.
Classification	`OpenAI/gpt-5-mini`	`OpenAI/gpt-5`	No	Classification stays auditable against frontier outputs.
Data Analysis	`OpenAI/gpt-5-mini`	`OpenAI/gpt-5`	No	Balanced analysis default.
Code Generation	`OpenAI/gpt-5`	`OpenAI/gpt-5-mini`	No	Code and diagnostics keep the frontier champion.
Vision	`OpenAI/gpt-5`	`OpenAI/gpt-4o`	Yes	Vision stays on frontier until benchmarked down.
Message Pipeline	`OpenAI/gpt-5-mini`	`OpenAI/gpt-5`	No	High-volume pipeline starts cost-aware, with frontier fallback.
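
To show how a table like the ones above might be consumed, here is a minimal sketch of a preset lookup. It assumes two behaviors the page does not spell out: that the fallback is used when the preferred model is unavailable, and that a function family missing from the preset resolves to the Default row. The `Route` type and `resolve_model` function are hypothetical names, and only three rows of the recommended preset are reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    preferred: str
    fallback: str
    vision: bool = False


# Three rows copied from the "SwarmMarshal recommended" table.
RECOMMENDED = {
    "Default": Route("OpenAI/gpt-5-mini", "OpenAI/gpt-5"),
    "Agent Chat": Route("OpenAI/gpt-5", "OpenAI/gpt-5-mini"),
    "Vision": Route("OpenAI/gpt-5", "OpenAI/gpt-4o", vision=True),
}


def resolve_model(preset, family, unavailable=frozenset()):
    """Return the model id to call for a function family.

    Assumption: unknown families use the Default row, and the fallback
    is chosen only when the preferred model is marked unavailable.
    """
    route = preset.get(family, preset["Default"])
    if route.preferred not in unavailable:
        return route.preferred
    return route.fallback
```

For example, `resolve_model(RECOMMENDED, "Vision", unavailable={"OpenAI/gpt-5"})` would pick the Vision fallback under these assumptions.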

Maintenance

How this page stays current.

Benchmark campaigns update ModelRoutingPresetCatalog. The app reads that catalog for reset buttons, and this page renders the same catalog for public documentation. One catalog, two surfaces.
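
The "one catalog, two surfaces" idea can be sketched as a single data structure with two readers. The internals of ModelRoutingPresetCatalog are not shown on this page, so the dictionary layout and both function names below are stand-ins for illustration only.

```python
# Hypothetical stand-in for the shared catalog; two rows of the frontier preset.
CATALOG = {
    "frontier": {
        "version": "2026.04",
        "routes": {
            "Default": ("OpenAI/gpt-5", "OpenAI/gpt-5-mini"),
            "Vision": ("OpenAI/gpt-5", "OpenAI/gpt-4o"),
        },
    },
}


def render_rows(preset):
    """Website surface: render the preset as tab-separated table rows."""
    header = "Function family\tPreferred\tFallback"
    rows = [f"{family}\t{preferred}\t{fallback}"
            for family, (preferred, fallback) in preset["routes"].items()]
    return [header, *rows]


def reset_preferences(preset):
    """App surface: build the preferences a reset button would apply."""
    return {family: {"preferred": preferred, "fallback": fallback}
            for family, (preferred, fallback) in preset["routes"].items()}
```

Because both functions read the same preset, a campaign that changes a row changes the published table and the in-app reset together.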

A/B campaign evidence · Versioned preset · Website render · In-app reset

Want the app defaults?

Open Model Preferences and apply a preset.

The desktop app includes “Frontier reset” and “Use published” actions that apply these routing sets.
