How I run Claude Code: Opus 4.8 advisor + ultracode
Two switches I flipped and mostly forgot about. Then I read nine days of my own session logs to see what they actually did.
TL;DR
advisor is a built-in tool the model calls on itself to get a second opinion from a reviewer model. You don't type a command, you just enable it. ultracode (/effort ultracode) cranks reasoning to its highest setting and lets the model fan work out across fleets of background agents instead of doing it all in one thread.
Both are experimental and both cost more tokens. The honest verdict from my logs: the payoff is task-shaped, not uniform. It is a real win on parallel, verification-heavy work, overkill on deterministic code, and, to its credit, the model knows when to skip it entirely.
Two switches, then nine days of logs
Claude Code, out of the box, optimizes for fast and cheap. That's the right default for most of what I do. But some work deserves the opposite: audits, refactors, research, anything where being wrong is expensive. So I flipped two switches. I let the model call an advisor for second opinions, and I set ultracode, which trades tokens for thoroughness and lets the model orchestrate work across background agents.
Then I mostly forgot about them. Nine days later I went back and read my own session logs. The numbers are the most honest review I can give:
Across those nine days I ran 118 actual sessions. The model spawned 445 background agents on top of them, and reached for a second opinion 225 times, and I had only asked for one, by name, three times. I didn't use these features so much as enable them and let the model decide when they were worth it.
The advisor: the model grading its own homework
The first thing to clear up: advisor is not a command you type to ask for advice. It's an experimental tool the model is instructed to call on its own initiative, at decision boundaries, before it commits to an interpretation, and again right before it declares a task done. When it fires, it forwards the entire conversation to a reviewer model you choose, and gets back a critique.
That's why the ratio above is so lopsided. In nine days I typed /advisor exactly three times. The tool itself fired about 225 times. Roughly 45 of those calls were in sessions I was driving. The other 180 or so came from inside the background agents the model had spawned. The habit propagates: turn it on, and the whole fleet starts checking its work.
The reviewer doesn't have to be smarter, just fresh
There's one rule worth getting right: the reviewer model has to be at least as capable as the main one. My main model is Opus 4.8, and I point the advisor at Opus too. So in practice the "reviewer" is a second Opus 4.8 with a clean head and my full transcript, not some bigger, secret model. Equal capability counts as eligible, and my logs confirm it: every single session where the advisor fired was running Opus 4.8 as the main model, and it fired anyway.
I mention this because I'd half-absorbed the idea that the advisor won't bother when your main model is already the top one. My own logs say otherwise. If you're configuring this, check the current docs, but with the advisor pointed at the same tier as your main model, mine ran fine.
What a second opinion is actually for
Most advisor calls just confirm what the model was already going to do and say "proceed." That's fine. It's cheap insurance at an expensive moment. But a meaningful minority catch a real gap, and the pattern is consistent: they cluster right before the model declares something done.
The useful question the advisor keeps asking is some version of "did you actually verify the thing you're about to claim?" More than once it caught the model about to call a task finished after checking only the easy case, not the one the task was really about. The fix was never clever. It was just a second reader noticing the gap before the work shipped past it.
The flip side, for honesty: the advisor runs on the same capacity as everything else, so sometimes it's rate-limited and simply unavailable. When that happened, the model said so plainly, proceeded only when its own evidence was already strong, and committed to retry before finishing. It's a checkpoint, not an oracle, and it fails like one occasionally.
ultracode: effort to the top, plus a swarm
Claude Code has an effort dial: low → medium → high → xhigh → max, plus ultracode and auto. Opus 4.8 defaults to high. ultracode is the top of that ladder: extended (xhigh) reasoning, plus standing permission to orchestrate dynamic workflows, fanning a task out across many background agents instead of grinding through it in one thread.
There are two ways to invoke it. /effort ultracode flips it on for the whole session. Or you can drop the bare word ultracode into a single prompt to route just that one turn through the workflow machinery. (Don't confuse it with ultrathink, which only makes the model think harder, with no agents.)
How real is the fan-out? Of the 560 transcript files those nine days produced, only 118 were sessions I actually sat and typed in. The other 445 were agents the model spun up on its own. That ratio is the entire mode in one statistic.
Where the swarm pays off, and where it doesn't
The same mode produces very different outcomes depending on the shape of the task. Three patterns showed up again and again.
When a task breaks into many independent units, like vetting a long list against strict criteria, or cross-checking the same claim against several sources, the swarm shines. It verifies each unit in parallel and routinely catches cases that a single quick pass waves through.
On a straightforward coding task where the main run is already executing its own checks, spinning up a panel of reviewer agents mostly re-confirms what's already known. Once in a while one even trips over its own feet, flagging a false problem because a file changed while it was looking. No harm, but no value either.
For something tightly coupled and sequential, where each step depends on the last, fanning out is counterproductive. The encouraging part: with ultracode on, the model still recognized this and stayed single-threaded. It doesn't parallelize for the sake of it.
My setup
Nothing exotic. A few keys in settings.json, pointing the advisor at Opus:
// ~/.claude/settings.json
{
"model": "opus[1m]",
"advisorModel": "opus",
"ultracode": true,
"effortLevel": "high",
"alwaysThinkingEnabled": true
}
ultracode is session-only, so it can't be persisted in settings.json. I made it the default by wrapping the launch in my shell config. I liked it enough to turn it on for every session, across every profile:
# ~/.zshrc: make ultracode the default for every launch.
# It's session-only and can't live in settings.json, so pass it at startup.
claude() {
command claude --settings "$HOME/.claude/ultracode-settings.json" "$@"
}
# where ultracode-settings.json is literally: { "ultracode": true } The cost, and when I reach for it
None of this is free. ultracode is slower and burns a lot more tokens, and some of those 445 agents were ceremony. The advisor adds its own token cost and is sometimes unavailable. On a quick reactive edit, all of it is overkill.
But on long, multi-step work (an audit, a refactor, a research dig), the trade looks right. The reason I leave both on by default is the same reason they're worth it: the model decides when to use them, not me. It calls for a second opinion at the moments I'd have skipped, fans out the work that's actually parallel, and leaves the sequential stuff alone. That judgment turned out to be better than mine, which is exactly why I stopped thinking about the switches.
Turn it on
/effort ultracode # whole session: xhigh effort + workflow orchestration
ultracode <your task> # one turn only: route this prompt through the swarm
/advisor # pick the reviewer model the advisor calls