Architecture
This page describes the internal architecture of Baleybots and how the key components fit together.
Processable interface
Everything in Baleybots implements the Processable interface:
interface Processable<TInput, TOutput> {
  process(input: TInput, options?: ProcessOptions): Promise<TOutput>;
  getId(): string;
  getBotNames(): string[];
  subscribeToAll(options?): Subscription;
}
A single Baleybot, a pipeline(), a parallel() composition, and a ParallelMerge all implement this interface. This means any composition can be used anywhere a single agent is expected.
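As a rough illustration of that substitutability, the sketch below declares a simplified local copy of the interface (ProcessOptions and subscribeToAll are left out) and two hypothetical implementations. `UppercaseBot`, `ReverseBot`, `FanOut`, and `runAgent` are invented for this example and are not part of Baleybots; the point is only that a caller typed against Processable accepts either a single agent or a composition.

```typescript
// Simplified local copy of the interface, for illustration only.
// ProcessOptions and subscribeToAll are omitted.
interface Processable<TInput, TOutput> {
  process(input: TInput): Promise<TOutput>;
  getId(): string;
  getBotNames(): string[];
}

// Hypothetical stand-in for a single agent.
class UppercaseBot implements Processable<string, string> {
  async process(input: string): Promise<string> {
    return input.toUpperCase();
  }
  getId(): string {
    return 'uppercase';
  }
  getBotNames(): string[] {
    return ['uppercase'];
  }
}

// A second hypothetical agent.
class ReverseBot implements Processable<string, string> {
  async process(input: string): Promise<string> {
    return input.split('').reverse().join('');
  }
  getId(): string {
    return 'reverse';
  }
  getBotNames(): string[] {
    return ['reverse'];
  }
}

// Hypothetical composition: sends the same input to every member
// concurrently and joins the results. It is itself a Processable.
class FanOut implements Processable<string, string> {
  private readonly members: Processable<string, string>[];

  constructor(members: Processable<string, string>[]) {
    this.members = members;
  }
  async process(input: string): Promise<string> {
    const results = await Promise.all(this.members.map((m) => m.process(input)));
    return results.join(' | ');
  }
  getId(): string {
    return 'fan-out';
  }
  getBotNames(): string[] {
    const names: string[] = [];
    for (const member of this.members) {
      names.push(...member.getBotNames());
    }
    return names;
  }
}

// A caller written against Processable does not care which one it receives.
async function runAgent(agent: Processable<string, string>, input: string): Promise<string> {
  return agent.process(input);
}
```

Calling `runAgent(new UppercaseBot(), 'abc')` resolves to `'ABC'`, and `runAgent(new FanOut([new UppercaseBot(), new ReverseBot()]), 'abc')` resolves to `'ABC | cba'`, with no change to `runAgent` itself.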
Data flow
When you call bot.process(input), the request flows through these stages:
input
-> Baleybot.process()
-> streamWithAISDK() (adapters/ai-sdk/stream-helper.ts)
-> AI SDK provider (OpenAI, Anthropic, etc.)
-> BaleybotStreamEvent stream (provider-agnostic events)
-> segment reducer (segments/core-reducer.ts)
-> StreamSegment[] (UI-canonical representation)
-> final output (extracted from segments)
streamWithAISDK
All LLM calls route through streamWithAISDK() in adapters/ai-sdk/stream-helper.ts. This function:
- Resolves the model provider from config
- Translates multimodal inputs via translateToContentParts()
- Calls the AI SDK streamText() or generateObject() functions
- Transforms the AI SDK stream into BaleybotStreamEvent events
BaleybotStreamEvent
The streaming event format is provider-agnostic. All providers transform their native format into this unified type:
type BaleybotStreamEvent =
  | { type: 'text_delta'; content: string }
  | { type: 'tool_call_stream_start'; id: string; toolName: string }
  | { type: 'tool_execution_output'; toolName: string; result: unknown }
  | { type: 'error'; error: Error }
  // ... more event types
Segment reducer
Stream events are accumulated into StreamSegment[] by the segment reducer. Each event updates the current segment state. For example, text_delta events append to the current TextSegment, while tool_call_stream_start creates a new ToolCallSegment.
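The accumulation described above can be sketched as a pure reducer. This is a minimal illustration, not the code in segments/core-reducer.ts: it handles only two of the event types, and the `TextSegment` and `ToolCallSegment` field names (`kind`, `text`, `id`, `toolName`) as well as the `reduceSegments` function are assumptions made for the example.

```typescript
// Subset of the event union, limited to the two events discussed here.
type BaleybotStreamEvent =
  | { type: 'text_delta'; content: string }
  | { type: 'tool_call_stream_start'; id: string; toolName: string };

// Hypothetical segment shapes; the real StreamSegment types may differ.
type TextSegment = { kind: 'text'; text: string };
type ToolCallSegment = { kind: 'tool_call'; id: string; toolName: string };
type StreamSegment = TextSegment | ToolCallSegment;

// Returns a new segment list with the event applied; the input is not mutated.
function reduceSegments(
  segments: StreamSegment[],
  event: BaleybotStreamEvent,
): StreamSegment[] {
  if (event.type === 'text_delta') {
    const last = segments[segments.length - 1];
    if (last !== undefined && last.kind === 'text') {
      // Append to the text segment that is currently open.
      return [...segments.slice(0, -1), { kind: 'text', text: last.text + event.content }];
    }
    // No open text segment, so start one.
    return [...segments, { kind: 'text', text: event.content }];
  }
  // tool_call_stream_start always opens a new tool call segment.
  return [...segments, { kind: 'tool_call', id: event.id, toolName: event.toolName }];
}

const events: BaleybotStreamEvent[] = [
  { type: 'text_delta', content: 'Hel' },
  { type: 'text_delta', content: 'lo' },
  { type: 'tool_call_stream_start', id: 'call_1', toolName: 'search' },
];

// Two text deltas collapse into one text segment, followed by one tool call segment.
const segments = events.reduce<StreamSegment[]>(reduceSegments, []);
```

Keeping the reducer pure means the same event sequence always yields the same segment list, which makes replaying a recorded stream straightforward.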
Pipeline composition
Pipelines chain Processable instances sequentially. Each step receives the output of the previous step as input:
pipeline().step(A).step(B).step(C).build()
input -> A.process() -> B.process() -> C.process() -> output
Pipeline steps can also include conditional branching (when), loops (loop, recursiveLoop), parallel execution (parallel), and routing (route).
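The sequential case can be sketched without the library. The `Step` interface and the `chain` helper below are hypothetical stand-ins written for this example; they are not the pipeline() builder and ignore branching, loops, and routing. They only show how each step's output becomes the next step's input.

```typescript
// Minimal stand-in for the process() part of Processable.
interface Step<TIn, TOut> {
  process(input: TIn): Promise<TOut>;
}

// Hypothetical helper: runs the steps in order, feeding each result forward.
// The returned object is itself a Step, so chains can be nested.
function chain<T>(steps: Step<T, T>[]): Step<T, T> {
  return {
    async process(input: T): Promise<T> {
      let current = input;
      for (const step of steps) {
        current = await step.process(current);
      }
      return current;
    },
  };
}

// Three toy steps standing in for A, B, and C.
const trimStep: Step<string, string> = { process: async (s) => s.trim() };
const upperStep: Step<string, string> = { process: async (s) => s.toUpperCase() };
const exclaimStep: Step<string, string> = { process: async (s) => s + '!' };

const chained = chain([trimStep, upperStep, exclaimStep]);
// chained.process('  hello ') resolves to 'HELLO!'
```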
Multimodal inputs
Multimodal inputs (images, audio, video, files) are created with the builder functions (text(), image(), audio(), etc.), which produce UnifiedMessageInput objects. Before a request is sent to the provider, translateToContentParts() in adapters/ai-sdk/multimodal-translator.ts translates these objects into AI SDK content parts.