The run_local() method lets you execute prompts directly against OpenAI, Anthropic, or Google while keeping your prompts in your codebase. You get full observability through Tracia without any added latency.
Why run_local()?
Some teams prefer managing prompts in their codebase rather than in an external dashboard. This keeps prompts:
- Version-controlled with your application code
- Reviewed through your standard PR process
- Deployed alongside the code that uses them
- Constructed programmatically when needed
run_local() gives you full Tracia observability while respecting this workflow.
How It Works
When you call run_local(), the SDK:
- Calls the provider via LiteLLM - Your request goes to OpenAI, Anthropic, or Google through LiteLLM. Tracia is not in the request path.
- Sends the trace asynchronously - After the LLM responds, trace data is sent to Tracia in the background. This is non-blocking and adds zero latency to your application.
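Because the trace is sent in the background after the LLM responds, a short-lived script can exit before delivery completes. Below is a minimal sketch of flushing pending traces before shutdown, assuming a Tracia client class and that the flush() referenced under Tracing lives on the client; the constructor, parameters, and result fields are illustrative, not a verified signature:

```python
from tracia import Tracia  # assumed client import path

client = Tracia(api_key="tr-...")  # hypothetical constructor

result = client.run_local(
    provider="google",
    model="gemini-1.5-flash",
    messages=[{"role": "user", "content": "Ping"}],
)
print(result.content)  # illustrative field name; see the Response page

# Traces are sent in the background. In a long-running service the sender
# drains on its own; in a short script, flush pending traces before exit
# so none are dropped.
client.flush()
```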
| | prompts.run() | run_local() |
|---|---|---|
| Prompts stored in | Tracia dashboard | Your codebase |
| LLM call routed through | Tracia API | Direct to provider via LiteLLM |
| Trace creation | Automatic (server-side) | Async, non-blocking |
When to Use run_local() vs prompts.run()
Use run_local() when you want to:
- Keep prompts in your codebase, version-controlled with git
- Build prompts programmatically (e.g., assembling messages based on context)
- Prototype quickly without dashboard setup
- Use Tracia purely for observability
Use prompts.run() when you want to:
- Edit prompts without code deployments
- A/B test prompt versions from the dashboard
- Let non-engineers manage prompt content
- Track prompt versions separately from code versions
| Use Case | Recommended Method |
|---|---|
| Prompts managed in Tracia dashboard | prompts.run() |
| Prompts defined in code | run_local() |
| Prompts reviewed in PRs | run_local() |
| Quick prototyping | run_local() |
| A/B testing prompt versions | prompts.run() |
| Programmatically constructed prompts | run_local() |
| Non-technical prompt editors | prompts.run() |
Quick Examples
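A minimal synchronous sketch, assuming a Tracia client class and OpenAI-style message dictionaries; the constructor, parameter names, and result fields below are illustrative rather than the SDK's verified signature:

```python
from tracia import Tracia  # assumed client import path

client = Tracia(api_key="tr-...")  # hypothetical constructor

# run_local() calls the provider directly via LiteLLM; Tracia only receives
# the trace afterwards, in the background.
result = client.run_local(
    provider="openai",  # or "anthropic", "google"
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain tracing in one sentence."},
    ],
)

print(result.content)  # illustrative field name; see the Response page for RunLocalResult
```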
Async Variants
Use arun_local() for async code:
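A hedged async sketch under the same assumptions, with arun_local() presumed to mirror run_local()'s parameters and return an awaitable:

```python
import asyncio

from tracia import Tracia  # assumed client import path

client = Tracia(api_key="tr-...")

async def main() -> None:
    # Awaiting arun_local() keeps the event loop free, so it fits inside
    # async web handlers without blocking.
    result = await client.arun_local(
        provider="anthropic",
        model="claude-3-5-sonnet-20241022",
        messages=[{"role": "user", "content": "Say hello in three words."}],
    )
    print(result.content)  # illustrative field name

asyncio.run(main())
```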
Available Methods
- Basic Usage: Getting started with each provider
- Streaming: Real-time streaming responses
- Sessions: Automatic trace chaining for multi-turn
- Parameters: Complete run_local() parameter reference
- Response: RunLocalResult fields and usage
- Providers: OpenAI, Anthropic, Google setup
- Models: 94+ supported models by provider
- Variables: Template interpolation syntax
- Tracing: Background traces, flush(), error handling
- Advanced: Error handling, concurrent requests
Types
LLMProvider
RunLocalInput
RunLocalResult
LocalStream
When stream=True is set, run_local() returns a LocalStream:
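A hedged streaming sketch under the same assumptions, presuming the returned LocalStream can be iterated for incremental output; the chunk shape is illustrative, so consult the Streaming page for exact fields:

```python
stream = client.run_local(
    provider="openai",
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about tracing."}],
    stream=True,  # returns a LocalStream instead of a RunLocalResult
)

# Iterate the stream to print tokens as they arrive.
for chunk in stream:
    print(chunk, end="", flush=True)
```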

