Nanobrowser Source Code: The Multi-Agent Scraping Loop — How Executor Drives Planner + Navigator

Inside Nanobrowser's Executor class: the core loop that orchestrates Planner and Navigator through multi-step page scraping. Periodic planning, single-step execution, failure counting, and task completion validation.

16Yun Engineering Team · Apr 11, 2026 · 4 min read

Introduction: What Happens Inside When You Run a Three-Step Scraping Task

A typical task: "Open a product page → extract the price → go to the next page."

Inside Nanobrowser, this isn't a single linear execution. The Executor class decomposes it into a loop — the Planner periodically checks direction, the Navigator executes one concrete action per iteration. Both agents share context through the MessageManager.

executor.ts (~400 lines) is the core of this mechanism.

The Executor Constructor

executor.ts:44-85

export class Executor {
  constructor(
    task: string, taskId: string,
    browserContext: BrowserContext,
    navigatorLLM: BaseChatModel,
    extraArgs?: Partial<ExecutorExtraArgs>,
  ) {
    const plannerLLM = extraArgs?.plannerLLM ?? navigatorLLM;
    const messageManager = new MessageManager();
 
    this.navigator = new NavigatorAgent(actionRegistry, {
      chatLLM: navigatorLLM, context, prompt: this.navigatorPrompt,
    });
    this.planner = new PlannerAgent({
      chatLLM: plannerLLM, context, prompt: this.plannerPrompt,
    });
  }
}

Key decisions:

  1. Separate LLMs for Planner and Navigator — Planner uses a stronger model (e.g., Claude Sonnet) for reasoning, Navigator uses a faster model (e.g., Haiku) for execution. If plannerLLM is omitted, both share the same model.

  2. NavigatorPrompt parameterized by maxActionsPerStep — controls how many operations the LLM can execute in a single call.

  3. Dynamic ActionRegistry: ActionBuilder.buildDefaultActions() constructs the available operations at runtime, based on the BrowserContext and the LLM. They are not hardcoded.

The Core Scraping Loop

executor.ts:135-230

async execute(): Promise<void> {
  let step = 0;
  let latestPlanOutput = null;
  let navigatorDone = false;
 
  for (step = 0; step < allowedMaxSteps; step++) {
    if (await this.shouldStop()) break;
 
    // Planner runs periodically
    if (this.planner
        && (step % planningInterval === 0 || navigatorDone)) {
      navigatorDone = false;
      latestPlanOutput = await this.runPlanner();
      if (this.checkTaskCompletion(latestPlanOutput)) break;
    }
 
    // Navigator executes one step
    navigatorDone = await this.navigate();
  }
 
  // Status determination
  if (latestPlanOutput?.result?.done) {
    // Task complete
  } else if (step >= allowedMaxSteps) {
    // Max steps reached
  }
}

Loop Structure

[Diagram: Executor loop structure]

Planner Trigger Conditions

step % planningInterval === 0 — by default, Planner runs every N steps (planningInterval from AgentOptions). This avoids calling the most expensive model at every step.

navigatorDone — if an operation is marked "done" by the Navigator, the next iteration triggers Planner to validate. This is the root cause of the Validator hallucination issue covered in the earlier bug article — the "done" check depends on LLM self-assessment.
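The two trigger conditions can be isolated as a small predicate. The sketch below is illustrative only: `shouldRunPlanner` and `plannerSteps` are hypothetical helper names, not functions from executor.ts, and the simulation ignores the early exit on task completion.

```typescript
// Hypothetical helper: mirrors the condition inside the loop shown above.
function shouldRunPlanner(
  step: number,
  planningInterval: number,
  navigatorDone: boolean,
): boolean {
  return step % planningInterval === 0 || navigatorDone;
}

// Simulates the loop and records the steps at which the Planner would run.
// `doneAfter` lists the steps whose Navigator call reports "done".
function plannerSteps(
  totalSteps: number,
  planningInterval: number,
  doneAfter: readonly number[],
): number[] {
  const steps: number[] = [];
  let navigatorDone = false;
  for (let step = 0; step < totalSteps; step++) {
    if (shouldRunPlanner(step, planningInterval, navigatorDone)) {
      navigatorDone = false;
      steps.push(step);
    }
    // Stand-in for `navigatorDone = await this.navigate()`.
    navigatorDone = doneAfter.indexOf(step) !== -1;
  }
  return steps;
}

// planningInterval = 3, Navigator reports "done" at step 4.
const plannerStepsExample = plannerSteps(8, 3, [4]);
// plannerStepsExample is [0, 3, 5, 6]: steps 0, 3, 6 come from the interval,
// step 5 from the "done" signal raised at step 4.
```

With an interval of 3, the Planner runs on a fixed cadence plus once extra whenever the Navigator claims completion.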

Status Determination

| Condition | Status | Event |
| --- | --- | --- |
| plannerOutput.done === true | Complete | TASK_OK |
| Hit maxSteps | Failed | TASK_FAIL + MaxStepsReachedError |
| context.stopped | Cancelled | TASK_CANCEL |
| Other | Paused | TASK_PAUSE |
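The table can be read as a decision function. The following is a minimal sketch, assuming the conditions are checked in the order the rows appear; the type names and `determineStatus` are hypothetical, and only the event names come from the table.

```typescript
// Hypothetical types: names are illustrative, not Nanobrowser's own.
type TaskStatus = "complete" | "failed" | "cancelled" | "paused";
type TaskEvent = "TASK_OK" | "TASK_FAIL" | "TASK_CANCEL" | "TASK_PAUSE";

interface LoopOutcome {
  plannerDone: boolean; // latestPlanOutput?.result?.done
  step: number;         // loop counter after the loop exits
  maxSteps: number;     // allowedMaxSteps
  stopped: boolean;     // context.stopped
}

// Assumption: checks run in the same order as the table rows.
function determineStatus(o: LoopOutcome): { status: TaskStatus; event: TaskEvent } {
  if (o.plannerDone) return { status: "complete", event: "TASK_OK" };
  if (o.step >= o.maxSteps) return { status: "failed", event: "TASK_FAIL" };
  if (o.stopped) return { status: "cancelled", event: "TASK_CANCEL" };
  return { status: "paused", event: "TASK_PAUSE" };
}
```

For example, a loop that exits at step 100 of 100 without a Planner "done" maps to the failed status and the TASK_FAIL event.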

Planner Execution

executor.ts:235-290

private async runPlanner(): Promise<AgentOutput<PlannerOutput> | null> {
  // Add current browser state to memory before planning
  if (this.tasks.length > 1 || this.context.nSteps > 0) {
    await this.navigator.addStateMessageToMemory();
  }
 
  const planOutput = await this.planner.execute();
  if (planOutput.result) {
    this.context.messageManager.addPlan(
      JSON.stringify(planOutput.result), positionForPlan
    );
  }
  return planOutput;
}

Key design:

  1. Browser state added before Planner: navigator.addStateMessageToMemory() writes the current DOM state, URL, and visible elements to the MessageManager. The Planner reads this when making decisions.

  2. Planner output written to message history — the plan (observation, next steps, challenges) is serialized as JSON and stored. Downstream Navigator operations can reference the Planner's reasoning.

  3. Consecutive failure counting — not all errors abort immediately. Only errors that make sense to retry are counted. Auth failures, blocked URLs, and cancellations throw immediately.

Navigator Execution

executor.ts:295-340

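private async navigate(): Promise<boolean> {
  const navOutput = await this.navigator.execute();
  this.context.nSteps++;
  // A failed step surfaces as a thrown error
  if (navOutput.error) throw new Error(navOutput.error);
  // Success resets the consecutive-failure counter
  this.context.consecutiveFailures = 0;
  // true tells the loop to run the Planner on the next iteration
  return navOutput.result?.done === true;
}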

Navigator is simpler than Planner — single-step operation. Success resets the failure counter. Failure increments it. Each Navigator step result accumulates through MessageManager to form the action history.
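The reset-on-success, increment-on-failure behavior can be sketched in isolation. This is an illustration under assumptions, not the code in executor.ts: `StepContext` and `recordStep` are hypothetical, `MaxFailuresReachedError` is a local stand-in for the class of the same name, and the `>=` threshold is a guess at how the limit is compared.

```typescript
// Hypothetical context shape; field names follow the excerpt above.
interface StepContext {
  nSteps: number;
  consecutiveFailures: number;
  maxFailures: number;
}

// Local stand-in for the error class of the same name.
class MaxFailuresReachedError extends Error {}

// Records the outcome of one Navigator step.
// Assumption: the limit is reached when the counter equals maxFailures.
function recordStep(ctx: StepContext, error?: string): void {
  ctx.nSteps++;
  if (error === undefined) {
    ctx.consecutiveFailures = 0; // any success clears the streak
    return;
  }
  ctx.consecutiveFailures++;
  if (ctx.consecutiveFailures >= ctx.maxFailures) {
    throw new MaxFailuresReachedError(
      "Stopped after " + ctx.consecutiveFailures + " consecutive failures: " + error,
    );
  }
}
```

Because the counter tracks a streak rather than a total, two failures followed by one success leave the full failure budget available again.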

Error Classification System

agents/errors.ts defines the complete error hierarchy:

| Error Type | Meaning | Executor Handling |
| --- | --- | --- |
| ChatModelAuthError | Invalid API key | Immediate throw, no retry |
| ChatModelBadRequestError | Bad request parameters | Immediate throw |
| ChatModelForbiddenError | Insufficient API permissions | Immediate throw |
| RequestCancelledError | User cancelled | Immediate throw |
| ExtensionConflictError | Extension conflicts | Immediate throw |
| URLNotAllowedError | Domain not in whitelist | Immediate throw |
| MaxStepsReachedError | Exceeded step limit | Determined after loop |
| MaxFailuresReachedError | Exceeded consecutive failure limit | Thrown in runPlanner/navigate |
| ResponseParseError | LLM output parse failure | Handled internally |
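One way to express the "immediate throw" rows is an `instanceof` check against a list of fatal classes. The sketch below defines local stand-ins that reuse the class names from the table; the `AgentError` base class and `isFatal` are hypothetical, and the real definitions in agents/errors.ts may differ.

```typescript
// Hypothetical base class; restores the prototype chain so that
// instanceof also works when compiling to older targets.
class AgentError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}

// Local stand-ins named after the rows marked "Immediate throw".
class ChatModelAuthError extends AgentError {}
class ChatModelBadRequestError extends AgentError {}
class ChatModelForbiddenError extends AgentError {}
class RequestCancelledError extends AgentError {}
class ExtensionConflictError extends AgentError {}
class URLNotAllowedError extends AgentError {}

const FATAL_ERRORS = [
  ChatModelAuthError,
  ChatModelBadRequestError,
  ChatModelForbiddenError,
  RequestCancelledError,
  ExtensionConflictError,
  URLNotAllowedError,
];

// true: rethrow immediately; false: count as a retryable failure.
function isFatal(error: unknown): boolean {
  return FATAL_ERRORS.some((ErrorClass) => error instanceof ErrorClass);
}
```

A caller would check `isFatal(err)` inside its catch block and only increment the consecutive-failure counter when the result is false.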

MessageManager: The Communication Pipe Between Agents

Messages flow in this order:

[System prompt] → [Initial task] → [State 1] → [Action result 1] → [State 2] → [Plan 1] → [State 3] → ...

This sequence matters — each agent call is based on the full operation history, not just the current page snapshot.
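The ordering can be pictured as an append-only history that every agent call reads in full. This is a toy model, not the real MessageManager: `MiniMessageHistory`, its methods, and the role labels are hypothetical, and the message contents are placeholders.

```typescript
// Hypothetical role labels matching the sequence described above.
type HistoryRole = "system" | "task" | "state" | "action_result" | "plan";

interface HistoryMessage {
  role: HistoryRole;
  content: string;
}

// Toy stand-in for the shared history; the real class does more.
class MiniMessageHistory {
  private messages: HistoryMessage[] = [];

  add(role: HistoryRole, content: string): void {
    this.messages.push({ role, content });
  }

  // Each agent call receives a copy of the whole history, not just the tail.
  snapshot(): HistoryMessage[] {
    return this.messages.slice();
  }
}

const history = new MiniMessageHistory();
history.add("system", "System prompt");
history.add("task", "Open a product page, extract the price, go to the next page.");
history.add("state", "State 1");
history.add("action_result", "Action result 1");
history.add("state", "State 2");
history.add("plan", "Plan 1");
history.add("state", "State 3");

const roleSequence = history.snapshot().map((m) => m.role).join(" > ");
// roleSequence is "system > task > state > action_result > state > plan > state"
```

Returning a copy from `snapshot()` keeps one agent from mutating the history another agent will read.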

Lessons for Scraper Developers

Periodic Planning vs Per-Step Planning

Nanobrowser runs Planner every N steps, not every step. A similar pattern for scraping:

Batch scraping strategy:
  Every 10 pages collected → check progress (like Planner)
  Every 1 page collected → extract data (like Navigator)
  More than 3 consecutive errors → abort batch (like maxFailures)
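The strategy above can be sketched as a small tracker that tells the scraping loop what to do after each page. `BatchTracker` and `BatchAction` are hypothetical names, and the thresholds are the ones from the list (checkpoint every 10 pages, abort on more than 3 consecutive errors).

```typescript
type BatchAction = "continue" | "checkpoint" | "abort";

// Hypothetical tracker: one record() call per attempted page.
class BatchTracker {
  private pagesCollected = 0;
  private consecutiveErrors = 0;
  private readonly checkpointEvery: number;
  private readonly maxConsecutiveErrors: number;

  constructor(checkpointEvery = 10, maxConsecutiveErrors = 3) {
    this.checkpointEvery = checkpointEvery;
    this.maxConsecutiveErrors = maxConsecutiveErrors;
  }

  record(success: boolean): BatchAction {
    if (!success) {
      this.consecutiveErrors++;
      // "More than 3" means the fourth error in a row aborts.
      return this.consecutiveErrors > this.maxConsecutiveErrors ? "abort" : "continue";
    }
    this.consecutiveErrors = 0; // a collected page clears the error streak
    this.pagesCollected++;
    // Progress check on every 10th collected page (the Planner analogue).
    return this.pagesCollected % this.checkpointEvery === 0 ? "checkpoint" : "continue";
  }
}
```

The calling loop would fetch a page, pass the outcome to `record()`, and run its progress check or stop the batch depending on the returned action.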

Error Classification in Production

Not all failures need retries. An expired API key retried 10 times is wasted compute. Classify errors as "recoverable" and "unrecoverable" — unrecoverable errors should stop immediately with appropriate alerts.
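For an HTTP scraper, the same split can be applied to response status codes. The mapping below is an assumption chosen for illustration and is not taken from Nanobrowser: timeouts, rate limits, and server errors are treated as recoverable, and everything else (including 401 and 403, the usual result of an expired or rejected key) as unrecoverable.

```typescript
type ErrorClass = "recoverable" | "unrecoverable";

// Assumed policy: transient conditions are worth retrying.
const RECOVERABLE_STATUSES: readonly number[] = [408, 429, 500, 502, 503, 504];

function classifyHttpStatus(status: number): ErrorClass {
  return RECOVERABLE_STATUSES.indexOf(status) !== -1 ? "recoverable" : "unrecoverable";
}

// Decides whether another attempt is worthwhile.
// `attempt` is 1-based; `maxAttempts` caps retries for recoverable errors.
function shouldRetry(status: number, attempt: number, maxAttempts: number): boolean {
  return classifyHttpStatus(status) === "recoverable" && attempt < maxAttempts;
}
```

With this policy, a 401 stops on the first attempt so an alert can fire, while a 503 is retried until the attempt cap is reached.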

Key Configuration Parameters

| Parameter | Default | Purpose | Scraping Recommendation |
| --- | --- | --- | --- |
| maxSteps | 100 | Max operation steps | Adjust for page complexity |
| planningInterval | 3 | Planner run frequency | Simple extraction: 5, complex: 1-2 |
| maxFailures | 3 | Consecutive failure limit | Set higher if errors are frequent, but ≤ 5 |
| useVision | false | Enable vision model | Enable for complex layouts |
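The recommendations can be captured as overrides on top of the listed defaults. This is a sketch: `ScrapeAgentOptions` and `withOverrides` are hypothetical, only the four parameter names and default values come from the table, and the validation rules encode the table's advice rather than any check Nanobrowser performs.

```typescript
// Hypothetical options shape using the parameter names from the table.
interface ScrapeAgentOptions {
  maxSteps: number;
  planningInterval: number;
  maxFailures: number;
  useVision: boolean;
}

const DEFAULT_OPTIONS: ScrapeAgentOptions = {
  maxSteps: 100,
  planningInterval: 3,
  maxFailures: 3,
  useVision: false,
};

// Merges overrides onto the defaults and applies the table's guidance.
function withOverrides(overrides: Partial<ScrapeAgentOptions>): ScrapeAgentOptions {
  const merged: ScrapeAgentOptions = { ...DEFAULT_OPTIONS, ...overrides };
  if (merged.planningInterval < 1) {
    throw new RangeError("planningInterval must be at least 1");
  }
  if (merged.maxFailures > 5) {
    throw new RangeError("maxFailures above 5 is not recommended");
  }
  return merged;
}

// Simple extraction: plan less often.
const simpleExtraction = withOverrides({ planningInterval: 5 });
// Complex layout: plan every step and enable vision.
const complexLayout = withOverrides({ planningInterval: 1, useVision: true });
```

Keeping the defaults in one object makes each scraping profile a short, reviewable diff against them.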

Summary

The Executor's loop rests on four mechanisms:

  1. Planner periodically guides direction (every N steps, avoids unnecessary reasoning cost)
  2. Navigator executes one step at a time (one LLM call per step)
  3. Failure counting protection (abandons after consecutive failures, avoids wasted retries)
  4. Complete error classification (recoverable vs unrecoverable, handled differently)

These three Nanobrowser source code analysis articles cover the full pipeline from "page content extraction" to "element detection and targeting" to "multi-agent collaboration loop" — the complete internal machinery of an AI browser agent executing data collection tasks.
