http.request(method, url, body?)
HTTP with a configurable host allowlist. Returns status, headers, and body.
A look at the runtime, sandbox, knowledge graph, peer protocol, and the small set of design choices that keep agents on rails.
SwarmMarshal replaced 25 hand-coded specialist agents with a single tool-use engine. Each "agent" is a profile: system prompt + enabled tools + LLM provider + budget.
AgenticChatServiceV2, which delegates to the turn runner.
IProfileResolver picks the active profile and trims the tool catalog to its enabled set.
IToolUseAgentEngine calls the LLM, emits text deltas + tool-call events, executes each tool, feeds the result back, repeats until the model stops.
AgentTranscriptMessage; user/assistant pairs land in AgentChatTurn for the UI.
Per-provider adapters normalize streaming tool-use into one TurnEvent stream the engine consumes.
| Provider | Endpoint | Streaming | Tool-call format |
|---|---|---|---|
| OpenAI | /chat/completions |
SSE | Indexed argument deltas |
| Ollama | /chat/completions |
SSE | Shared OpenAI-style adapter |
| Anthropic | /v1/messages |
Named-event SSE | tool_use + input_json_delta |
| Gemini | :streamGenerateContent?alt=sse |
SSE | Atomic functionCall parts; ids synthesized as gem-{n} |
| Grok (xAI) | /chat/completions |
SSE | OpenAI-compatible |
| DeepSeek | /chat/completions |
SSE | OpenAI-compatible |
| Hugging Face | /chat/completions |
SSE | OpenAI-compatible router |
OpenAI, Ollama, Grok, DeepSeek, and Hugging Face all share OpenAIStyleToolUseProvider. Anthropic gets AnthropicToolUseProvider. Gemini gets GeminiToolUseProvider — Google emits whole functionCall objects atomically (no argument streaming), and the translator synthesizes ids since Gemini's API doesn't carry them.
Path-clamped to %LOCALAPPDATA%/SwarmMarshal/sandbox/. Path escapes throw UnauthorizedAccessException before any I/O happens.
http.request(method, url, body?)
HTTP with a configurable host allowlist. Returns status, headers, and body.
shell.exec(command, cwd?)
Shell command pinned to the sandbox directory. Stdout, stderr, and exit code come back.
fs.read_file(path)
Reads a file, clamped to the sandbox root. Path traversal throws before any I/O.
fs.write_file(path, content)
Writes a file, also sandbox-clamped. Atomic replace; parents auto-created.
fs.list_files(directory?)
Lists entries under the sandbox. Default lists the sandbox root.
code.run_csharp(source)
Ad-hoc C# via the existing code-execution skill. Compiles, runs, returns the result.
skills.run(skillId, args)
Generic runner for any registered skill — markdown SKILL.md or compiled C#.
catalog.search_tools(query)
Semantic search over the tool catalog so an agent can discover new tools at runtime.
Messages flow through entity extraction into a local graph of people, companies, topics, and projects. Each neighborhood can get an LLM-written summary, routed through local or cloud models according to your settings.
Embeddings handle "the invoice from Acme last quarter." BM25 handles "ORD-4821." Results are fused so phrasing and exact match both work.
Agents can answer "how many messages left to process?" because the indexer exposes throughput, backlog, and ETA as a built-in skill.
Pair devices over LAN. "Peer chat" is not chat with another human — it's chat with the agent on the other device.
SwarmMarshal is both an MCP client (consumes external connectors) and an MCP server (exposes its own tool modules to other clients).
Add an MCP server such as filesystem, GitHub, Slack, SQLite, browser search, or an internal tool. Its tools land next to the built-ins, and agents pick them through the same routing and approval flow.
Point any MCP client at SwarmMarshal's built-in server and use eight tool modules from outside — useful when you want another agent platform to drive SwarmMarshal's state.
A skill is a callable function with metadata. Author by hand, or let the assistant draft one and queue it for review. The same runtime invokes both.
SKILL.md
YAML frontmatter (id, description, schema) plus a markdown body the LLM reads as instructions. Easy to author, easy to diff.
ISkill (C#)
Typed input + invoke method. Best when the skill needs deterministic logic, sandbox access, or fast loops.
skills.run can call it.
Per-task model assignment with budgets, health checks, and Ollama auto-detect. No single "the model" — different tasks pick different tiers.
Scans the local network for an Ollama install and registers detected models. Local-first when local works, cloud when it doesn't.
Classification on a fast model, summaries on a smart one, drafting on whatever fits the budget. Each function picks its own tier.
Hard caps per model, per agent, per day. Spend Guard cuts off the bill before it surprises you and pages the boss when it does.
If the local AI stack is wedged, the model is undersized, or the error rate spikes, an agent runs a diagnostic skill and proposes a fix. Common repairs stay inside the app, with explicit approval where they change your machine.
fix-wedged-ollama
Detects Ollama hung on a previous request, resets the runner, and reschedules the pending turn.
diagnose-error-rate
Walks the journal, classifies failures by tool and provider, and surfaces the dominant root cause.
diagnose-slow-local-llm
Compares observed latency against the model's expected envelope and recommends an action.
swap-undersized-local-model
If the model can't keep up, proposes (and with approval, performs) a swap to a better-sized local model.
Everything described here is in the shipping app. Download the preview, then poke around.