The run_local() method lets you execute prompts directly against OpenAI, Anthropic, Google, or Amazon Bedrock while keeping your prompts in your codebase. You get full observability through Tracia without any added latency.
Why run_local()?
Some teams prefer managing prompts in their codebase rather than in an external dashboard. This keeps prompts:
- Version-controlled with your application code
- Reviewed through your standard PR process
- Deployed alongside the code that uses them
- Constructed programmatically when needed
run_local() gives you full Tracia observability while respecting this workflow.
How It Works
When you call run_local(), the SDK:
- Calls the provider via LiteLLM - Your request goes to OpenAI, Anthropic, Google, or Amazon Bedrock through LiteLLM. Tracia is not in the request path.
- Sends the trace asynchronously - After the LLM responds, trace data is sent to Tracia in the background. This is non-blocking and adds zero latency to your application.
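The two steps above can be sketched as follows. This is an illustrative sketch only: the `Tracia` constructor, import path, and parameter names are assumptions based on this page, not a verified signature; `flush()` is covered in the Tracing reference.

```python
# Sketch of the run_local() flow, assuming a Tracia client exposing
# run_local() and flush(); constructor and parameter names are illustrative.
from tracia import Tracia

client = Tracia(api_key="tr_...")

# 1. The LLM call goes straight to the provider via LiteLLM;
#    Tracia's servers are not in the request path.
result = client.run_local(
    provider="openai",
    model="gpt-4o",  # example model name
    messages=[{"role": "user", "content": "Hello"}],
)
print(result.text)

# 2. The trace is queued and sent to Tracia in the background, adding no
#    latency to the call above. In short-lived processes (scripts, jobs),
#    flush() ensures pending traces are delivered before exit.
client.flush()
```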
| | prompts.run() | run_local() |
|---|---|---|
| Prompts stored in | Tracia dashboard | Your codebase |
| LLM call routed through | Tracia API | Direct to provider via LiteLLM |
| Trace creation | Automatic (server-side) | Async, non-blocking |
When to Use run_local() vs prompts.run()
Use run_local() when you want to:
- Keep prompts in your codebase, version-controlled with git
- Build prompts programmatically (e.g., assembling messages based on context)
- Prototype quickly without dashboard setup
- Use Tracia purely for observability
Use prompts.run() when you want to:
- Edit prompts without code deployments
- A/B test prompt versions from the dashboard
- Let non-engineers manage prompt content
- Track prompt versions separately from code versions
| Use Case | Recommended Method |
|---|---|
| Prompts managed in Tracia dashboard | prompts.run() |
| Prompts defined in code | run_local() |
| Prompts reviewed in PRs | run_local() |
| Quick prototyping | run_local() |
| A/B testing prompt versions | prompts.run() |
| Programmatically constructed prompts | run_local() |
| Non-technical prompt editors | prompts.run() |
Quick Examples
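A minimal synchronous example, showing a programmatically constructed prompt with template variables (see the Variables reference for interpolation syntax). The parameter names (`provider`, `model`, `messages`, `variables`) and the `{{...}}` placeholder style are assumptions drawn from this page, not a verified API.

```python
# Minimal sync sketch; parameter names are illustrative assumptions.
from tracia import Tracia

client = Tracia(api_key="tr_...")

result = client.run_local(
    provider="openai",
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a support assistant."},
        {"role": "user", "content": "Summarize: {{ticket_body}}"},
    ],
    variables={"ticket_body": "App crashes on login since v2.3."},
)

# result is a RunLocalResult — see the Response reference for its fields.
print(result.text)
```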
Async Variants
Use arun_local() for async code:
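An async sketch using arun_local(). The method name comes from this page; the client setup, model name, and parameter names are illustrative assumptions.

```python
# Async sketch, assuming arun_local() mirrors the run_local() signature.
import asyncio

from tracia import Tracia

client = Tracia(api_key="tr_...")

async def main() -> None:
    result = await client.arun_local(
        provider="google",
        model="gemini-2.0-flash",  # example model name
        messages=[{"role": "user", "content": "Ping"}],
    )
    print(result.text)

asyncio.run(main())
```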
Available Methods
Basic Usage
Getting started with each provider
Streaming
Real-time streaming responses
Sessions
Automatic trace chaining for multi-turn
Parameters
Complete run_local() parameter reference
Response
RunLocalResult fields and usage
Providers
OpenAI, Anthropic, Google, Bedrock setup
Models
94+ supported models by provider
Variables
Template interpolation syntax
Tracing
Background traces, flush(), error handling
Advanced
Error handling, concurrent requests
Types
LLMProvider
RunLocalInput
RunLocalResult
LocalStream
When stream=True is set, run_local() returns a LocalStream:
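A streaming sketch under the assumption that iterating a LocalStream yields text deltas as the provider produces them; the exact chunk shape is not specified on this page, so treat the loop body as illustrative.

```python
# Streaming sketch: with stream=True, run_local() returns a LocalStream.
# The assumption that iteration yields plain text chunks is illustrative.
from tracia import Tracia

client = Tracia(api_key="tr_...")

stream = client.run_local(
    provider="openai",
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about logging."}],
    stream=True,
)

# The trace is still sent in the background once the stream completes.
for chunk in stream:
    print(chunk, end="", flush=True)
```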

