Architecture

This page describes the internal architecture of Baleybots and how the key components fit together.

Processable interface

Everything in Baleybots implements the Processable interface:

interface Processable<TInput, TOutput> {
  process(input: TInput, options?: ProcessOptions): Promise<TOutput>;
  getId(): string;
  getBotNames(): string[];
  subscribeToAll(options?): Subscription;
}

A single Baleybot, a pipeline(), a parallel() composition, and a ParallelMerge all implement this interface. This means any composition can be used anywhere a single agent is expected.
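To illustrate why a shared interface makes compositions interchangeable with single agents, here is a minimal sketch. It assumes a trimmed local copy of Processable (without ProcessOptions or subscribeToAll), and the helpers leaf, fanOut, and describe are hypothetical names invented for this example; they are not the library's parallel() or any other Baleybots API.

```typescript
// Trimmed local stand-in for the interface above (assumption: options and
// subscriptions are omitted to keep the sketch self-contained).
interface Processable<TInput, TOutput> {
  process(input: TInput): Promise<TOutput>;
  getId(): string;
  getBotNames(): string[];
}

// Hypothetical leaf agent built from a plain string function.
function leaf(name: string, fn: (s: string) => string): Processable<string, string> {
  return {
    process: async (input) => fn(input),
    getId: () => name,
    getBotNames: () => [name],
  };
}

// Hypothetical fan-out composition: runs every child on the same input and
// joins the results. The composition is itself a Processable.
function fanOut(children: Processable<string, string>[]): Processable<string, string> {
  return {
    process: async (input) => {
      const results = await Promise.all(children.map((c) => c.process(input)));
      return results.join(" | ");
    },
    getId: () => `fanOut(${children.map((c) => c.getId()).join(",")})`,
    getBotNames: () => ([] as string[]).concat(...children.map((c) => c.getBotNames())),
  };
}

// Caller written against the interface only; it cannot tell a leaf from a
// composition.
async function describe(agent: Processable<string, string>, input: string): Promise<string> {
  return `${agent.getId()}: ${await agent.process(input)}`;
}

const upper = leaf("upper", (s) => s.toUpperCase());
const reverse = leaf("reverse", (s) => s.split("").reverse().join(""));
const both = fanOut([upper, reverse]);
```

Because fanOut returns a Processable, its result can be passed to describe or nested inside another fanOut without any special handling.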

Data flow

When you call bot.process(input), the request flows through these stages:

input
  -> Baleybot.process()
  -> streamWithAISDK() (adapters/ai-sdk/stream-helper.ts)
  -> AI SDK provider (OpenAI, Anthropic, etc.)
  -> BaleybotStreamEvent stream (provider-agnostic events)
  -> segment reducer (segments/core-reducer.ts)
  -> StreamSegment[] (UI-canonical representation)
  -> final output (extracted from segments)

streamWithAISDK

All LLM calls route through streamWithAISDK() in adapters/ai-sdk/stream-helper.ts. This function:

  1. Resolves the model provider from config
  2. Translates multimodal inputs via translateToContentParts()
  3. Calls the AI SDK's streamText() or generateObject() function
  4. Transforms the AI SDK stream into BaleybotStreamEvent events

BaleybotStreamEvent

The streaming event format is provider-agnostic. All providers transform their native format into this unified type:

type BaleybotStreamEvent =
  | { type: 'text_delta'; content: string }
  | { type: 'tool_call_stream_start'; id: string; toolName: string }
  | { type: 'tool_execution_output'; toolName: string; result: unknown }
  | { type: 'error'; error: Error }
  // ... more event types
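As a rough sketch of what "transforming a native format into the unified type" can look like, the example below maps a made-up provider chunk format onto a subset of BaleybotStreamEvent using an async generator. FakeProviderChunk, toBaleybotEvents, fakeStream, and collectEvents are hypothetical names for illustration; real provider chunks and the actual adapter code will differ.

```typescript
// Subset of the union shown above, repeated so the sketch compiles on its own.
type BaleybotStreamEvent =
  | { type: "text_delta"; content: string }
  | { type: "tool_call_stream_start"; id: string; toolName: string }
  | { type: "error"; error: Error };

// Hypothetical provider-native chunk shape (assumption, not a real SDK type).
type FakeProviderChunk =
  | { kind: "token"; text: string }
  | { kind: "tool"; callId: string; name: string }
  | { kind: "failure"; message: string };

// Map each native chunk to the unified event type as it arrives.
async function* toBaleybotEvents(
  chunks: AsyncIterable<FakeProviderChunk>,
): AsyncGenerator<BaleybotStreamEvent> {
  for await (const chunk of chunks) {
    switch (chunk.kind) {
      case "token":
        yield { type: "text_delta", content: chunk.text };
        break;
      case "tool":
        yield { type: "tool_call_stream_start", id: chunk.callId, toolName: chunk.name };
        break;
      case "failure":
        yield { type: "error", error: new Error(chunk.message) };
        break;
    }
  }
}

// Small in-memory source standing in for a provider stream.
async function* fakeStream(): AsyncGenerator<FakeProviderChunk> {
  yield { kind: "token", text: "Hel" };
  yield { kind: "token", text: "lo" };
  yield { kind: "tool", callId: "call_1", name: "search" };
}

// Drain the translated stream into an array for inspection.
async function collectEvents(): Promise<BaleybotStreamEvent[]> {
  const events: BaleybotStreamEvent[] = [];
  for await (const event of toBaleybotEvents(fakeStream())) {
    events.push(event);
  }
  return events;
}
```

Downstream code only ever sees the unified union, so swapping the source stream for a different provider would leave consumers unchanged.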

Segment reducer

Stream events are accumulated into StreamSegment[] by the segment reducer. Each event updates the current segment state. For example, text_delta events append to the current TextSegment, while tool_call_stream_start creates a new ToolCallSegment.
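The accumulation described above can be sketched as a pure reducer function. This is only an illustration under stated assumptions: the field names on TextSegment and ToolCallSegment (kind, text, id, toolName) are invented here, since this page names the segment types but not their shape, and reduceSegments is a hypothetical name rather than the function in segments/core-reducer.ts.

```typescript
// Subset of the event union from this page, repeated for self-containment.
type BaleybotStreamEvent =
  | { type: "text_delta"; content: string }
  | { type: "tool_call_stream_start"; id: string; toolName: string }
  | { type: "error"; error: Error };

// Hypothetical segment shapes (assumption: the real fields may differ).
type TextSegment = { kind: "text"; text: string };
type ToolCallSegment = { kind: "tool_call"; id: string; toolName: string };
type StreamSegment = TextSegment | ToolCallSegment;

// Fold one event into the segment list without mutating the input array.
function reduceSegments(
  segments: StreamSegment[],
  event: BaleybotStreamEvent,
): StreamSegment[] {
  switch (event.type) {
    case "text_delta": {
      const last = segments[segments.length - 1];
      // Append to the current text segment if there is one...
      if (last && last.kind === "text") {
        return [...segments.slice(0, -1), { kind: "text", text: last.text + event.content }];
      }
      // ...otherwise start a new text segment.
      return [...segments, { kind: "text", text: event.content }];
    }
    case "tool_call_stream_start":
      // A tool call always opens a new segment.
      return [...segments, { kind: "tool_call", id: event.id, toolName: event.toolName }];
    default:
      // Events this sketch does not model leave the segments unchanged.
      return segments;
  }
}

const sampleEvents: BaleybotStreamEvent[] = [
  { type: "text_delta", content: "Hel" },
  { type: "text_delta", content: "lo" },
  { type: "tool_call_stream_start", id: "call_1", toolName: "search" },
  { type: "text_delta", content: " done" },
];

const sampleSegments = sampleEvents.reduce(reduceSegments, [] as StreamSegment[]);
```

In this sketch the two leading text deltas merge into one text segment, the tool call opens a second segment, and the trailing delta starts a third because the most recent segment is no longer text.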

Pipeline composition

Pipelines chain Processable instances sequentially. Each step receives the output of the previous step as input:

pipeline().step(A).step(B).step(C).build()

input -> A.process() -> B.process() -> C.process() -> output
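The sequential flow in the diagram can be approximated with a toy builder. This is a sketch under assumptions, not the library's pipeline(): ToyPipeline, Step, and the three sample steps are hypothetical, and only the step/build chaining shape is taken from the line above.

```typescript
// Minimal step contract: anything with an async process() method.
interface Step<I, O> {
  process(input: I): Promise<O>;
}

// Toy builder (hypothetical) that tracks the pipeline's input and current
// output types so mismatched steps fail at compile time.
class ToyPipeline<I, O> {
  private readonly run: (input: I) => Promise<O>;

  private constructor(run: (input: I) => Promise<O>) {
    this.run = run;
  }

  // Start with an identity pipeline.
  static start<T>(): ToyPipeline<T, T> {
    return new ToyPipeline(async (input: T) => input);
  }

  // Append a step: its input is the previous step's output.
  step<N>(next: Step<O, N>): ToyPipeline<I, N> {
    const prev = this.run;
    return new ToyPipeline(async (input: I) => next.process(await prev(input)));
  }

  // The built pipeline is itself a Step, so it can be nested in another one.
  build(): Step<I, O> {
    const run = this.run;
    return { process: (input: I) => run(input) };
  }
}

// Sample steps standing in for A, B, and C.
const trim: Step<string, string> = { process: async (s) => s.trim() };
const toLength: Step<string, number> = { process: async (s) => s.length };
const double: Step<number, number> = { process: async (n) => n * 2 };

const toyPipeline = ToyPipeline.start<string>().step(trim).step(toLength).step(double).build();
```

Calling toyPipeline.process("  hello  ") trims the input to "hello", measures its length as 5, and doubles it to 10, mirroring input -> A -> B -> C -> output.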

Pipeline steps can also include conditional branching (when), loops (loop, recursiveLoop), parallel execution (parallel), and routing (route).

Multimodal inputs

Multimodal inputs (images, audio, video, files) are created with the builder functions (text(), image(), audio(), etc.), which produce UnifiedMessageInput objects. Before a request is sent to the provider, translateToContentParts() in adapters/ai-sdk/multimodal-translator.ts translates these objects into AI SDK content parts.