The runLocal() method lets you execute prompts directly against OpenAI, Anthropic, or Google while keeping your prompts in your codebase. You get full observability through Tracia with no added latency.
Why runLocal()?
Some teams prefer managing prompts in their codebase rather than in an external dashboard. This keeps prompts:
- Version-controlled with your application code
- Reviewed through your standard PR process
- Deployed alongside the code that uses them
- Constructed programmatically when needed
runLocal() gives you full Tracia observability while respecting this workflow.
How It Works
When you call runLocal(), the SDK:
- Calls the provider SDK directly - Your request goes straight to OpenAI, Anthropic, or Google using their native SDK. Tracia is not in the request path.
- Sends the trace asynchronously - After the LLM responds, trace data is sent to Tracia in the background. This is non-blocking and adds zero latency to your application.
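Here is a minimal sketch of that flow, assuming a Tracia TypeScript client that exposes runLocal() and a flush() helper. The package name, constructor options, and result field names below are assumptions for illustration, not confirmed API; see the Parameters and Response pages for the real shapes.

```typescript
import { Tracia } from "tracia"; // hypothetical import path

const tracia = new Tracia({ apiKey: process.env.TRACIA_API_KEY });

async function main() {
  // Step 1: the SDK calls the provider (here OpenAI) directly with its
  // native SDK. Tracia is not in the request path, so no latency is added.
  const result = await tracia.runLocal({
    name: "support-reply",   // assumed: a label used to group traces
    provider: "openai",      // LLMProvider
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "You are a concise support agent." },
      { role: "user", content: "How do I reset my password?" },
    ],
  });

  console.log(result.text); // RunLocalResult; field name assumed

  // Step 2: the trace is sent to Tracia asynchronously in the background.
  // In short-lived processes (scripts, serverless), flush pending traces
  // before the process exits.
  await tracia.flush();
}

main();
```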
| | prompts.run() | runLocal() |
|---|---|---|
| Prompts stored in | Tracia dashboard | Your codebase |
| LLM call routed through | Tracia API | Direct to provider SDK |
| Trace creation | Automatic (server-side) | Async, non-blocking |
When to Use runLocal() vs prompts.run()
Use runLocal() when you want to:
- Keep prompts in your codebase, version-controlled with git
- Build prompts programmatically (e.g., assembling messages based on context)
- Prototype quickly without dashboard setup
- Use Tracia purely for observability
Use prompts.run() when you want to:
- Edit prompts without code deployments
- A/B test prompt versions from the dashboard
- Let non-engineers manage prompt content
- Track prompt versions separately from code versions
| Use Case | Recommended Method |
|---|---|
| Prompts managed in Tracia dashboard | prompts.run() |
| Prompts defined in code | runLocal() |
| Prompts reviewed in PRs | runLocal() |
| Quick prototyping | runLocal() |
| A/B testing prompt versions | prompts.run() |
| Programmatically constructed prompts | runLocal() |
| Non-technical prompt editors | prompts.run() |
Quick Examples
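The sketches below show what a call against each provider might look like. The provider identifiers and option names are assumptions based on this page, and the model names are only illustrative; see the Providers and Models pages for the supported values.

```typescript
import { Tracia } from "tracia"; // hypothetical import path, as above

const tracia = new Tracia({ apiKey: process.env.TRACIA_API_KEY });

// Anthropic: same call shape, different provider and model (assumed)
const claude = await tracia.runLocal({
  provider: "anthropic",
  model: "claude-3-5-sonnet-latest",
  messages: [{ role: "user", content: "Summarize this ticket in one sentence." }],
});

// Google: again only the provider and model change (assumed)
const gemini = await tracia.runLocal({
  provider: "google",
  model: "gemini-1.5-flash",
  messages: [{ role: "user", content: "Translate 'hello' into French." }],
});
```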
Available Methods
- Basic Usage: Getting started with each provider
- Streaming: Real-time streaming responses
- Sessions: Automatic trace chaining for multi-turn
- Parameters: Complete RunLocalInput reference
- Response: RunLocalResult fields and usage
- Providers: OpenAI, Anthropic, Google setup
- Models: 94+ supported models by provider
- Variables: Template interpolation syntax
- Tracing: Background traces, flush(), error handling
- Responses API: OpenAI reasoning models (o1, o3-mini)
- Advanced: Error handling, concurrent requests
Types
- LLMProvider
- RunLocalInput
- RunLocalResult
- LocalStream
When stream: true is set, runLocal() returns a LocalStream:
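A hedged sketch of consuming it follows. It assumes LocalStream can be iterated with for await and that each chunk exposes a text delta; neither detail is confirmed here, so check the Streaming page for the actual interface.

```typescript
import { Tracia } from "tracia"; // hypothetical import path

const tracia = new Tracia({ apiKey: process.env.TRACIA_API_KEY });

const stream = await tracia.runLocal({
  provider: "openai",
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Write a haiku about observability." }],
  stream: true, // returns a LocalStream instead of a RunLocalResult
});

// Assumed: LocalStream yields text deltas as they arrive from the provider.
for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? "");
}

// The trace is still sent to Tracia in the background once the stream
// completes; flush before exiting short-lived processes.
await tracia.flush();
```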

