Every agent system has an unglamorous job before it gets to be clever: turn user input into a run. A prompt from the frontend needs an identity, a lifecycle, runtime context, and an event stream before the model says anything. This stop follows the DeerFlow Gateway as it accepts that prompt, records it as a run, installs context, and streams execution events back as they happen.
Keep one idea in view: DeerFlow has to be a runtime before it can be an agent. The Gateway is that runtime host. It owns intake, run status, cancellation, context injection, streaming, and completion. The actual reasoning starts later.
flowchart TD
U["Frontend / IM / LangGraph SDK"] --> EP["POST /threads/{id}/runs/stream"]
EP --> SR["stream_run() · start_run()"]
SR --> RM["RunManager.create_or_reject()"]
RM --> CFG["build_run_config() · compat shim"]
CFG --> TASK["create_task(run_agent)"]
TASK -. background .-> RA["run_agent()"]
RA --> RT["Runtime(context=...)"]
RT --> MK["make_lead_agent(config)"]
MK --> ASM["agent.astream(...)"]
ASM --> SB["StreamBridge"]
SR --> SSE["sse_consumer() · SSE events"]
SB --> SSE
Three runtime names, three different jobs
The easiest trap here is collapsing these names into one idea:
LangGraph = the execution kernel agents use (how a graph runs)
LangGraph Server = the optional official HTTP runtime
DeerFlow Gateway = the current primary HTTP runtime, LangGraph-API compatible
backend/langgraph.json still registers lead_agent, which keeps the official LangGraph Server, Studio, and CLI path working. But on the main path, the Gateway imports and calls the graph factory directly. There is no official server in between. The usual product path is DeerFlow’s own Gateway runtime, not LangGraph Server.
The four parts that land a request
Open thread_runs.py and services.py and you’ll keep meeting four roles. Don’t memorize the fields — remember which question each one answers.
- RunCreateRequest: the shape of a run request. It carries input, config, context, metadata, stream_mode, plus control bits like interrupt_before / interrupt_after, multitask_strategy, and on_disconnect. It is the caller’s declaration of “how this run should run.”
- RunManager: the run registry. create_or_reject() decides whether to create this run at all (for example, if the same thread is already running, the request may be rejected according to multitask_strategy), then handles set_status, cancel, and run metadata persistence.
- StreamBridge: a decoupling point. The background graph produces events; the HTTP side consumes them; the two do not call each other directly. run_agent() publishes into the bridge, and sse_consumer() reads from it and formats SSE events. If the frontend disconnects, the backend does not collapse with it.
- run_agent(): the background worker that actually does the work (worker.py). Note the boundary: it is not the agent. It is the worker that drives graph execution.
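The four roles can be sketched in a few dozen lines. This is a toy model, not DeerFlow's code: ToyRunManager, ToyStreamBridge, toy_run_agent, and toy_sse_consumer are hypothetical stand-ins, the one-queue-per-run design is an assumption, and "reject" is assumed to be one of the multitask_strategy values. It only shows how a background producer and an SSE-formatting consumer can meet at a bridge without calling each other.

```python
import asyncio
import json
from dataclasses import dataclass, field

_END = object()  # sentinel marking the end of a run's event stream


class ToyStreamBridge:
    """Hypothetical stand-in: one asyncio.Queue per run."""

    def __init__(self) -> None:
        self._queues: dict[str, asyncio.Queue] = {}

    def _queue(self, run_id: str) -> asyncio.Queue:
        return self._queues.setdefault(run_id, asyncio.Queue())

    async def publish(self, run_id: str, event: str, data: dict) -> None:
        await self._queue(run_id).put((event, data))

    async def close(self, run_id: str) -> None:
        await self._queue(run_id).put(_END)

    async def subscribe(self, run_id: str):
        queue = self._queue(run_id)
        while True:
            item = await queue.get()
            if item is _END:
                return
            yield item


@dataclass
class ToyRunManager:
    """Hypothetical registry: at most one active run per thread."""

    active: dict[str, str] = field(default_factory=dict)  # thread_id -> run_id
    counter: int = 0

    def create_or_reject(self, thread_id: str, multitask_strategy: str = "reject") -> str:
        if thread_id in self.active and multitask_strategy == "reject":
            raise RuntimeError(f"thread {thread_id} already has an active run")
        self.counter += 1
        run_id = f"run-{self.counter}"
        self.active[thread_id] = run_id
        return run_id

    def finish(self, thread_id: str) -> None:
        self.active.pop(thread_id, None)


async def toy_run_agent(bridge, manager, thread_id: str, run_id: str, prompt: str) -> None:
    """Background worker: publishes events, never touches the HTTP side."""
    try:
        for word in prompt.split():
            await bridge.publish(run_id, "values", {"token": word})
    finally:
        manager.finish(thread_id)
        await bridge.close(run_id)  # always end the stream, even on failure


async def toy_sse_consumer(bridge, run_id: str):
    """HTTP side: reads from the bridge and formats SSE frames."""
    async for event, data in bridge.subscribe(run_id):
        yield f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def demo() -> list[str]:
    bridge, manager = ToyStreamBridge(), ToyRunManager()
    run_id = manager.create_or_reject("thread-1")
    task = asyncio.create_task(
        toy_run_agent(bridge, manager, "thread-1", run_id, "hello world")
    )
    frames = [frame async for frame in toy_sse_consumer(bridge, run_id)]
    await task
    return frames
```

Calling asyncio.run(demo()) returns the formatted frames in publish order. Because the worker only ever talks to the bridge, a consumer that stops reading does not stop the worker.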
configurable, or context?
build_run_config() does a translation job: it turns the fields in the HTTP request into the RunnableConfig shape LangGraph understands. And here sits the one piece of compatibility debt most worth remembering:
config["configurable"] = the older runtime-options channel
config["context"] = the newer LangGraph runtime-context channel
In compatibility mode, DeerFlow writes key fields into both channels. Whitelisted fields like model_name, mode, thinking_enabled, reasoning_effort, is_plan_mode, subagent_enabled, agent_name, and is_bootstrap get copied across. Why the apparent waste? Older DeerFlow code reads from configurable; newer ToolRuntime.context consumers read from context. Write only one side and a whole class of consumers sees nothing.
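The dual-write can be sketched as a small helper. This is an illustration under stated assumptions, not the real build_run_config(): the function name dual_write_config and the shape of its arguments are hypothetical, and the whitelist holds only the fields named above, which may not be the complete set.

```python
from typing import Any

# Fields named in the text; the real whitelist may contain more.
DUAL_WRITE_KEYS = (
    "model_name",
    "mode",
    "thinking_enabled",
    "reasoning_effort",
    "is_plan_mode",
    "subagent_enabled",
    "agent_name",
    "is_bootstrap",
)


def dual_write_config(thread_id: str, request_context: dict[str, Any]) -> dict[str, Any]:
    """Copy whitelisted fields into both channels (hypothetical helper)."""
    config: dict[str, Any] = {"configurable": {"thread_id": thread_id}, "context": {}}
    for key in DUAL_WRITE_KEYS:
        if key in request_context:
            value = request_context[key]
            config["configurable"][key] = value  # read by older code
            config["context"][key] = value       # read by newer context consumers
    return config
```

A field outside the whitelist is simply dropped in this sketch, which is one way to keep arbitrary request data out of the runtime config.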
Then run_agent() builds a LangGraph Runtime, packs run-scoped data (thread_id / run_id / user_id / app_config) into runtime.context, and tucks it back into config["configurable"]["__pregel_runtime"] so downstream middleware and tools can all reach it through LangGraph’s runtime API.
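A minimal sketch of that packing step, assuming a plain dataclass in place of LangGraph's Runtime so the example stays self-contained. ToyRuntime and install_runtime are hypothetical names; only the context keys and the "__pregel_runtime" slot come from the description above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyRuntime:
    """Hypothetical stand-in for LangGraph's Runtime; only `context` is modeled."""

    context: dict[str, Any]


def install_runtime(
    config: dict[str, Any],
    *,
    thread_id: str,
    run_id: str,
    user_id: str,
    app_config: Any,
) -> dict[str, Any]:
    """Return a copy of config with the run-scoped runtime tucked into configurable."""
    runtime = ToyRuntime(
        context={
            "thread_id": thread_id,
            "run_id": run_id,
            "user_id": user_id,
            "app_config": app_config,
        }
    )
    configurable = dict(config.get("configurable", {}))
    configurable["__pregel_runtime"] = runtime
    return {**config, "configurable": configurable}
```

Returning a new dict rather than mutating the input is a choice made for this sketch; it keeps the caller's request config unchanged across runs.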
Own the host · Gateway
- owns auth / metadata / rollback / persistence
- one intake for frontend, IM, and SDK hosts
- full control over a run’s lifecycle
Cost · compat glue
- fields must be translated into LangGraph config
- configurable / context dual-write, historical debt
- run_agent() carries too much in one function
Where it will trip you up
- configurable and context are historical compatibility debt. A field written to only one side may read empty for old code or for newer ToolRuntime.context consumers. Adding a field? First decide which whitelist it belongs to and whether it needs the dual-write.
- run_agent() does too much in one place. It mixes graph invocation, persistence, stream publishing, rollback, tracing, and cleanup in a single path: powerful, but heavy. Know which part you’re touching before you touch it.
- run_agent() runs as a background task. Disconnect, cancellation, rollback, and final checkpoint serialization all need care, or you may return “half-finished state” as success.
- The Gateway doesn’t mutate ThreadState directly. Run records, thread metadata, and stream events are written by the Gateway; the agent’s graph state is written by graph execution. When state changes, first identify which layer wrote it.
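The background-task caveat is easiest to see in code. The sketch below is an assumption-laden illustration, not DeerFlow's worker: run_with_status is a hypothetical wrapper, and the status strings "running", "interrupted", "error", and "success" are made up for the example. The point is that success is recorded only on the path where the work actually finished.

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_status(
    work: Callable[[], Awaitable[None]],
    statuses: dict[str, str],
    run_id: str,
) -> None:
    """Drive `work` and record a final status that matches what really happened."""
    statuses[run_id] = "running"
    try:
        await work()
    except asyncio.CancelledError:
        # Cancelled mid-run: never report this as success.
        statuses[run_id] = "interrupted"
        raise  # let asyncio see the cancellation
    except Exception:
        statuses[run_id] = "error"
    else:
        statuses[run_id] = "success"
```

Re-raising CancelledError keeps the task's cancelled state visible to whoever awaits it, while the recorded status tells later readers the run did not complete.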
The entry layer turns user input into a running graph, streams it back, and then gets out of the way. It has caught the request, filed the run, and installed context. Next, the run reaches make_lead_agent(), where the request stops being intake data and becomes a runnable graph.