Tool assembly · DeerFlow

Most readers meet the tools module and think: “this is where functions get registered.” That is only half true. Its real shape appears the day a tool you clearly configured does not appear in front of the model, and you finally ask the useful question: who decides whether a tool is visible?

Hold onto one idea: this stop is not a function registry; it is a visibility calculation for one run. Config, built-ins, model capability, sandbox rules, skills, MCP, and subagent policy all decide which tools are added, removed, or deferred. Only after that policy pass do you get the final list handed to create_agent(..., tools=...).

flowchart TD
SRC["config.yaml · built-ins · MCP · ACP"] --> GA["get_available_tools()"]
GA --> G1["tool_groups filter"]
G1 --> G2["host bash safety gate"]
G2 --> G3["model capability · vision"]
G3 --> DD["dedupe by tool.name"]
DD --> SK["skill allowed_tools filter"]
SK --> DF["assemble_deferred_tools · defer MCP"]
DF --> CA["create_agent(tools=final_tools)"]

From tool sources to what this run is allowed to see: a two-stage assembly

A candidate set, not the final set

The main entry is get_available_tools() in tools.py. But note the boundary: what it produces is not the final tool list. It is a candidate set. It answers a deliberately limited question:

Given system config, built-ins, and runtime switches,
which tools could this agent acquire as candidates?

The flow is roughly: start from AppConfig.tools → filter by a custom agent’s tool_groups → drop host bash if it is not allowed → dynamically import via resolve_variable(cfg.use, BaseTool) → add built-ins → add task_tool by switch and view_image_tool by model capability → attach MCP / ACP → and finally dedupe by tool.name.

Then, back in _make_lead_agent() (agent.py), the candidate set passes through two more gates: the skill-level allowed_tools filter and the MCP deferred-tool assembly. Visibility is two-stage: gather candidates, then apply policy.
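The candidate stage described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not DeerFlow's implementation: ToolSpec, gather_candidates, and the keyword flags are hypothetical names, recognizing host bash by the name "bash" is a simplification of whatever is_host_bash_allowed(config) actually checks, the "view_image" name is assumed, and MCP / ACP sources are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical stand-in for a resolved tool; not a DeerFlow class."""
    name: str
    group: str


def gather_candidates(configured, builtins, *, tool_groups=None,
                      host_bash_allowed=False, supports_vision=False):
    """Stage 1: collect candidates, then dedupe by name (first one wins)."""
    candidates = []
    for tool in configured:
        # Coarse whitelist: when tool_groups is set, only those groups survive.
        if tool_groups is not None and tool.group not in tool_groups:
            continue
        # Security boundary: declaring bash is not the same as allowing it.
        if tool.name == "bash" and not host_bash_allowed:
            continue
        candidates.append(tool)
    candidates.extend(builtins)
    if supports_vision:
        # Assumed name for the tool the text calls view_image_tool.
        candidates.append(ToolSpec("view_image", "builtin"))
    seen, unique = set(), []
    for tool in candidates:
        if tool.name not in seen:  # earlier sources take priority
            seen.add(tool.name)
            unique.append(tool)
    return unique


configured = [ToolSpec("read_file", "file:read"), ToolSpec("bash", "bash")]
builtins = [ToolSpec("present_files", "builtin"), ToolSpec("read_file", "builtin")]
candidate_names = [
    t.name
    for t in gather_candidates(configured, builtins, tool_groups={"file:read", "bash"})
]
```

Here bash is declared and its group is whitelisted, yet it is still dropped because the host bash gate defaults to closed. The duplicate read_file from the built-ins loses to the configured one.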

Config is not the tool itself

A tool in config.yaml looks like this:

tools:
  - name: read_file
    group: file:read
    use: deerflow.sandbox.tools:read_file_tool

That use field is not Python syntax. It is a DeerFlow convention: module.path:variable_name. resolve_variable() hands the part before the colon to import_module(), pulls the variable after the colon from that module, and checks that it really is a BaseTool instance.
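The resolution steps above can be sketched with the standard library alone. This is an assumption-laden illustration of the module.path:variable_name convention, not the body of DeerFlow's resolve_variable(): the function name resolve_variable_sketch and its error messages are hypothetical, and math:pi with float stands in for a real tool path and BaseTool so the example runs without DeerFlow installed.

```python
from importlib import import_module


def resolve_variable_sketch(path: str, expected_type: type):
    """Resolve a 'module.path:variable_name' string and type-check the result."""
    module_path, sep, variable_name = path.partition(":")
    if not (sep and module_path and variable_name):
        raise ValueError(f"expected 'module.path:variable_name', got {path!r}")
    module = import_module(module_path)  # fails here if the module is missing
    if not hasattr(module, variable_name):
        raise ImportError(f"{module_path!r} has no attribute {variable_name!r}")
    value = getattr(module, variable_name)
    if not isinstance(value, expected_type):
        raise TypeError(
            f"{path!r} resolved to {type(value).__name__}, "
            f"expected {expected_type.__name__}"
        )
    return value


# Standard-library stand-in for a tool reference: math.pi is a float.
pi = resolve_variable_sketch("math:pi", float)
```

Because the import only happens when the string is resolved, a typo in use cannot be caught when the YAML is parsed; it surfaces when the agent is assembled.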

This produces a key distinction worth remembering:

the name in config  =  a label for humans reading config
tool.name           =  the name actually exposed to the model and used for routing

Routing looks at BaseTool.name, which need not equal the name written in config.yaml. The moment they diverge, the model sees one schema name while the runtime routes by another — and DeerFlow logs a warning about exactly that.
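A check of this kind can be sketched as follows. The helper find_name_mismatches, the logger name, and the warning text are hypothetical; only the behavior (warn when the config label and tool.name differ) comes from the description above.

```python
import logging

logger = logging.getLogger("tool_name_check_sketch")


def find_name_mismatches(pairs):
    """Return (config_name, tool_name) pairs that differ, warning on each."""
    mismatches = []
    for config_name, tool_name in pairs:
        if config_name != tool_name:
            logger.warning(
                "config name %r differs from tool.name %r; routing uses tool.name",
                config_name,
                tool_name,
            )
            mismatches.append((config_name, tool_name))
    return mismatches


mismatches = find_name_mismatches([
    ("read_file", "read_file"),
    ("shell", "bash"),  # hypothetical entry whose config label differs
])
```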

Visibility is several gates, not one whitelist

“Why is this tool missing?” is hard to chase because visibility crosses several gates, each with its key in a different file:

tool_groups: a custom agent’s coarse whitelist. Set it, and only tools in those groups survive.
host bash safety gate: even if config.yaml declares bash, it must pass is_host_bash_allowed(config) first; if it does not, it is dropped. This is a security boundary, not an ordinary toggle.
model capability: view_image_tool is appended only when the selected model declares supports_vision.
skill policy: filter_tools_by_skill_allowed_tools() in tool_policy.py stays allow-all while no skill declares allowed_tools; the moment a skill declares one, only the names in that set survive. This is least privilege applied to the toolset.
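The skill gate's allow-all-until-declared behavior can be sketched like this. It is not the code in tool_policy.py: representing skills as dicts and taking the union of allowed_tools across several declaring skills are assumptions made for the example.

```python
def filter_by_skill_allowed_tools(tool_names, skills):
    """Keep every tool unless at least one skill declares allowed_tools."""
    declared = [
        skill["allowed_tools"]
        for skill in skills
        if skill.get("allowed_tools") is not None
    ]
    if not declared:
        return list(tool_names)  # no skill has an opinion: allow-all
    allowed = set().union(*declared)  # assumption: declarations are unioned
    return [name for name in tool_names if name in allowed]


tools = ["read_file", "bash", "web_search"]
unrestricted = filter_by_skill_allowed_tools(tools, [{"name": "notes"}])
restricted = filter_by_skill_allowed_tools(
    tools,
    [
        {"name": "notes"},
        {"name": "research", "allowed_tools": ["read_file", "web_search"]},
    ],
)
```

The switch from allow-all to whitelist happens as soon as one skill declares the field, so adding a skill can remove tools that another part of the config explicitly enabled.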

A tool is not one shape

The second trap in reading the tools module is assuming every tool is “just a function call.” DeerFlow tools come in at least four shapes — recognize them and several later stops get easier:

Plain execution tool — a BaseTool made by @tool; the model emits a tool_call, and ToolNode runs the function. read_file, web_search, and bash are examples.
State-updating tool — present_files doesn’t just return text; it returns a LangGraph Command(update={"artifacts": ...}) that writes graph state directly, and the frontend renders the artifacts from there.
Control-flow tool — ask_clarification’s function body is only a placeholder; the real behavior is in ClarificationMiddleware, which intercepts the call and uses Command(..., goto=END) to halt the run until the user’s next message. Treat user input as an interrupt, not as a tool that sits there blocking on a human.
Delegation tool — task is the subagent entry: it inherits the parent’s sandbox/model/tool policy, builds the child toolset with subagent_enabled=False (so task isn’t recursively exposed), runs a SubagentExecutor in the background, and returns the result as a tool result.

The tool exists; its schema may not be exposed

MCP tools are the clearest proof that a tool can sit in the list and still be invisible to the model. With tool_search.enabled on, MCP tools do not dump full schemas onto the model at once. The prompt lists only names, and the model calls tool_search for the schema when it needs one. The reason is pragmatic:

A single MCP server can expose dozens of tools.
Binding every schema = bigger prompt, higher cost, noisier tool selection.

This path crosses several layers, a textbook case of DeerFlow spreading one feature across modules (tool_search.py):

get_available_tools  ->  tag_mcp_tool(t)
assemble_deferred_tools  ->  build catalog + tool_search, return DeferredToolSetup
apply_prompt_template  ->  prompt lists deferred tool names only
model calls tool_search  ->  returns matching schemas + writes state.promoted
DeferredToolFilterMiddleware  ->  unpromoted schemas hidden; a raw call returns an error
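The last step of that path, hiding unpromoted schemas while rejecting raw calls, can be modeled with two small functions. This is a simplified sketch, not DeferredToolFilterMiddleware itself: the function names, the error text, and the github_* tool names are hypothetical.

```python
def schemas_bound_to_model(tool_names, deferred, promoted):
    """Names whose schemas the model sees: non-deferred plus promoted ones."""
    return [n for n in tool_names if n not in deferred or n in promoted]


def guard_tool_call(name, deferred, promoted):
    """Return an error string for a raw call to an unpromoted deferred tool."""
    if name in deferred and name not in promoted:
        return f"Error: {name} is deferred; call tool_search to load its schema first."
    return None  # the call may proceed to normal execution


tool_names = ["read_file", "tool_search", "github_create_issue", "github_list_prs"]
deferred = {"github_create_issue", "github_list_prs"}  # hypothetical MCP tools

before = schemas_bound_to_model(tool_names, deferred, promoted=set())
after = schemas_bound_to_model(tool_names, deferred, promoted={"github_list_prs"})
```

Note that tool_names never changes between the two calls: the full set stays available for execution, and only the subset whose schemas are shown to the model grows as promotions accumulate.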

The reducer for ThreadState.promoted scopes by catalog_hash: same hash unions the promoted names, a changed hash replaces them — so a stale same-named promotion can’t release a different tool after MCP config changes.
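The union-or-replace rule can be written as a small pure function. The Promoted dataclass and merge_promoted are hypothetical shapes chosen for illustration; the actual state layout in ThreadState may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Promoted:
    """Hypothetical shape for the promoted state: a hash plus tool names."""
    catalog_hash: str
    names: frozenset = frozenset()


def merge_promoted(current: Optional[Promoted], update: Promoted) -> Promoted:
    """Union names under the same catalog hash; replace when the hash changes."""
    if current is None or current.catalog_hash != update.catalog_hash:
        return update
    return Promoted(current.catalog_hash, current.names | update.names)


state = merge_promoted(None, Promoted("h1", frozenset({"search_docs"})))
state = merge_promoted(state, Promoted("h1", frozenset({"list_prs"})))
same_hash_names = state.names  # both names, since the hash did not change
state = merge_promoted(state, Promoted("h2", frozenset({"search_docs"})))
```

After the last call only the names promoted under h2 remain, so list_prs from the old catalog cannot unlock a same-named tool in the new one.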

Design rationale

The thing to walk away with is this deliberate two-stage split:

get_available_tools() answers “which tools are candidates for this run”; filter_tools_by_skill_allowed_tools() answers “after skills load, which are still allowed.” Separating tool sources from tool policy lets visibility be controlled by both agent-level config and skill-level least privilege, without the two rules fighting.

One level up: tool visibility is the result of a policy computation, not a static registry. Next time you debug “the tool is missing,” the question isn’t “does the function exist,” it’s “at which gate, by which policy, and for what reason was it dropped.” That is also why 3.0 wants to split ToolRegistry / ToolPolicy / ToolRuntime: registration, policy, and execution were always three different questions.

What dynamic assembly buys

pluggable sources: config / built-ins / MCP / ACP, all alike
visibility computed per run — tunable by model, sandbox, skill
deferral keeps a flood of MCP tools from bursting the prompt

The cost

dynamic import fails late (only at runtime)
one feature spread across metadata/assembly/prompt/state/middleware
tool.name may diverge from the config name

Where it will trip you up

config name is not tool.name. Routing uses BaseTool.name. When they diverge, the name the model sees and the name the runtime routes by drift apart.
host bash is a security boundary, not a toggle. Declared doesn’t mean allowed — it must pass is_host_bash_allowed(config).
MCP has its own freshness path. get_available_tools() receives app_config, yet MCP extensions are re-read from disk (by mtime) to avoid a stale config across processes.
“Visible” is not “executable.” A deferred MCP tool is still held by ToolNode and can run after promotion; only its schema is withheld from the model. Call it before promotion and middleware rejects the call.
Dedupe has priority. The order is config → built-ins → MCP → ACP, and on a name clash the earlier one wins. Because routing is by name, a surviving duplicate would make it ambiguous which tool a call reaches.
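The mtime-based freshness check mentioned in the MCP item above follows a common pattern, sketched here in generic form. This is not DeerFlow's loader: the class name MtimeCachedJson, the JSON format, and the reads counter are assumptions for illustration only.

```python
import json
import os


class MtimeCachedJson:
    """Re-read a JSON file only when its modification time changes."""

    def __init__(self, path):
        self._path = path
        self._mtime_ns = None
        self._data = None
        self.reads = 0  # counts actual disk reads, for illustration

    def load(self):
        mtime_ns = os.stat(self._path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            # File changed (or first load): refresh the cached copy.
            with open(self._path, encoding="utf-8") as fh:
                self._data = json.load(fh)
            self._mtime_ns = mtime_ns
            self.reads += 1
        return self._data
```

The practical consequence is that an in-memory config object and the MCP view can briefly disagree: another process may have edited the file after the config object was built.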

The tool-assembly stop answers “which tools is this run allowed to see or call,” not “how a given tool runs.” Keep those questions apart and the whole tool system becomes easier to follow. Once the tools are gathered and the graph is assembled, the next behavior-setting variable appears: middleware order. Next stop, we watch order turn into runtime semantics.