← the journey
01

stop 01 · published

Request entry and the agent chain

implementation risk ●●○○○
source anchors · @0fb18e3

Every agent system has an unglamorous job before it gets to be clever: turn user input into a run. A prompt from the frontend needs an identity, a lifecycle, runtime context, and an event stream before the model says anything. This stop follows the DeerFlow Gateway as it accepts that prompt, records it as a run, installs context, and streams execution events back as they happen.

Keep one idea in view: DeerFlow has to be a runtime before it can be an agent. The Gateway is that runtime host. It owns intake, run status, cancellation, context injection, streaming, and completion. The actual reasoning starts later.

flowchart TD
U["Frontend / IM / LangGraph SDK"] --> EP["POST /threads/{id}/runs/stream"]
EP --> SR["stream_run() · start_run()"]
SR --> RM["RunManager.create_or_reject()"]
RM --> CFG["build_run_config() · compat shim"]
CFG --> TASK["create_task(run_agent)"]
TASK -. background .-> RA["run_agent()"]
RA --> RT["Runtime(context=...)"]
RT --> MK["make_lead_agent(config)"]
MK --> ASM["agent.astream(...)"]
ASM --> SB["StreamBridge"]
SR --> SSE["sse_consumer() · SSE events"]
SB --> SSE
One prompt in: how it becomes a background run with a live stream

Three runtime names, three different jobs

The easiest trap here is collapsing these names into one idea:

LangGraph         = the execution kernel agents use (how a graph runs)
LangGraph Server  = the optional official HTTP runtime
DeerFlow Gateway  = the current primary HTTP runtime, LangGraph-API compatible

backend/langgraph.json still registers lead_agent, which keeps the official LangGraph Server, Studio, and CLI path working. But on the main path, the Gateway imports and calls the graph factory directly. There is no official server in between. The usual product path is DeerFlow’s own Gateway runtime, not LangGraph Server.

The four parts that land a request

Open thread_runs.py and services.py and you’ll keep meeting four roles. Don’t memorize the fields — remember which question each one answers.

RunCreateRequest — the shape of a run request. It carries input, config, context, metadata, stream_mode, plus control bits like interrupt_before / interrupt_after, multitask_strategy, and on_disconnect. It is the caller’s declaration of “how this run should run.”

RunManager — the run registry. create_or_reject() decides whether to create this run at all (for example, the same thread is already running, so the request may be rejected according to multitask_strategy), then handles set_status, cancel, and run metadata persistence.

StreamBridge — a decoupling point. The background graph produces events; the HTTP side consumes them; the two do not call each other directly. run_agent() publishes into the bridge, and sse_consumer() reads from it and formats SSE events. If the frontend disconnects, the backend does not collapse with it.

run_agent() — the background worker that actually does the work ( worker.py ). Note the boundary: it is not the agent. It is the worker that drives graph execution.

configurable, or context?

build_run_config() does a translation job: it turns the fields in the HTTP request into the RunnableConfig shape LangGraph understands. And here sits the one piece of compatibility debt most worth remembering:

config["configurable"]   the older runtime-options channel
config["context"]        the newer LangGraph runtime-context channel

In compatibility mode, DeerFlow writes key fields into both channels. Whitelisted fields like model_name, mode, thinking_enabled, reasoning_effort, is_plan_mode, subagent_enabled, agent_name, and is_bootstrap get copied across. Why the apparent waste? Older DeerFlow code reads from configurable; newer ToolRuntime.context consumers read from context. Write only one side and a whole class of consumers sees nothing.

Then run_agent() builds a LangGraph Runtime, packs run-scoped data (thread_id / run_id / user_id / app_config) into runtime.context, and tucks it back into config["configurable"]["__pregel_runtime"] so downstream middleware and tools can all reach it through LangGraph’s runtime API.

Own the host · Gateway

  • owns auth / metadata / rollback / persistence
  • one intake for frontend, IM, and SDK hosts
  • full control over a run’s lifecycle

Cost · compat glue

  • fields must be translated into LangGraph config
  • configurable / context dual-write, historical debt
  • run_agent() carries too much in one function

Where it will trip you up


The entry layer turns user input into a running graph, streams it back, and then gets out of the way. It has caught the request, filed the run, and installed context. Next, the run reaches make_lead_agent(), where the request stops being intake data and becomes a runnable graph.