Documentation Index
Fetch the complete documentation index at: https://docs.tracia.io/llms.txt
Use this file to discover all available pages before exploring further.

By default, run_local() automatically sends spans to Tracia in the background. This gives you observability without blocking your application.
How Tracing Works
- run_local() completes the LLM call
- Returns the result immediately
- Submits the span to Tracia in the background using a thread pool
- Retries failed span submissions automatically
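The flow above can be sketched as follows. This is a hypothetical illustration of the mechanism, not the SDK's actual source: the call completes synchronously, and the span is handed to a background thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

# Background pool: a stand-in for the SDK's internal span-submission queue.
_executor = ThreadPoolExecutor(max_workers=4)

def _submit_span(span: dict) -> None:
    # Real SDK: network call to Tracia here, retried automatically on failure.
    pass

def run_local(prompt: str) -> str:
    result = f"response to {prompt!r}"            # 1. the LLM call completes
    span = {"input": prompt, "output": result}
    _executor.submit(_submit_span, span)          # 3. span submitted in the background
    return result                                 # 2. result returned immediately
```

The caller never waits on the network: `run_local()` returns as soon as the result exists, while delivery happens on the pool's threads.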
Span Metadata
Add metadata to help filter and analyze spans.

Span Fields
| Field | Description |
|---|---|
| span_id | Unique identifier for the span |
| trace_id | Session ID if part of a multi-turn conversation |
| parent_span_id | Parent span ID for chained conversations |
| model | Model used for the request |
| provider | Provider (openai, anthropic, google, amazon_bedrock) |
| input.messages | Messages sent (with variables interpolated) |
| variables | Original variables passed |
| output | Generated response |
| status | SUCCESS or ERROR |
| latency_ms | Request duration |
| input_tokens | Input token count |
| output_tokens | Output token count |
| tags | User-defined tags |
| user_id | End user identifier |
| session_id | Session identifier |
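As a hedged illustration (the actual wire format is not shown in these docs), a span record carrying the fields from the table might be assembled like this:

```python
import uuid

# Illustrative only: a span record using the field names from the table above.
def make_span(model, provider, messages, variables, output, *,
              tags=None, user_id=None, session_id=None):
    return {
        "span_id": str(uuid.uuid4()),     # unique identifier for the span
        "model": model,
        "provider": provider,
        "input": {"messages": messages},  # messages with variables interpolated
        "variables": variables,           # original variables passed
        "output": output,
        "status": "SUCCESS",
        "tags": tags or [],               # user-defined tags for filtering
        "user_id": user_id,               # end user identifier
        "session_id": session_id,         # session identifier
    }

span = make_span(
    "gpt-4o", "openai",
    [{"role": "user", "content": "Hello Ada"}],
    {"name": "Ada"}, "Hi there!",
    tags=["onboarding"], user_id="user_123", session_id="session_456",
)
```

Fields like tags, user_id, and session_id are what make spans filterable later in the dashboard.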
Custom Span ID
Provide your own span ID for correlation with external systems.

Waiting for Spans
Use flush() to wait for all pending spans before shutdown.
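A minimal sketch of the idea, using a thread pool as a stand-in for the SDK's background queue (this flush() is illustrative, not the SDK's actual implementation):

```python
from concurrent.futures import ThreadPoolExecutor, wait

executor = ThreadPoolExecutor(max_workers=4)
# Pretend these are span submissions already queued in the background.
pending = [executor.submit(lambda i=i: {"span_id": f"s{i}"}) for i in range(3)]

def flush(timeout=None):
    """Block until all pending spans are delivered (or the timeout expires)."""
    done, not_done = wait(pending, timeout=timeout)
    return len(not_done) == 0
```

Calling flush() right before process exit guarantees nothing in flight is lost.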
Async Flush
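In async code you would want an awaitable equivalent, so the event loop is not blocked while spans drain. A sketch of that pattern (names here are assumptions, not the SDK's documented API):

```python
import asyncio

async def _submit_span(span):
    await asyncio.sleep(0)               # stand-in for the async network call
    return span["span_id"]

async def main():
    # Pending background submissions, tracked as tasks.
    pending = [asyncio.create_task(_submit_span({"span_id": f"s{i}"}))
               for i in range(3)]
    # "Async flush": await every pending span without blocking the loop.
    return await asyncio.gather(*pending)

asyncio.run(main())
```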
Graceful Shutdown with Context Manager
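The pattern here is a client whose __exit__ calls flush(), so spans are delivered even if the block raises. The class and method bodies below are illustrative, not the SDK's real implementation:

```python
class TraciaClientSketch:
    """Illustrative client: flushes pending spans on context exit."""

    def __init__(self):
        self.pending = []
        self.flushed = False

    def flush(self):
        # Real SDK: wait for background submissions to finish.
        self.pending.clear()
        self.flushed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.flush()      # runs on normal exit and on exceptions alike
        return False      # never swallow the caller's exception

client = TraciaClientSketch()
with client:
    client.pending.append({"span_id": "s1"})
# After the block, every pending span has been flushed.
```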
Error Handling
on_span_error Callback
Handle span submission failures without affecting your main application.

Retry Behavior
Span submissions are automatically retried:
- Up to 2 retry attempts
- Exponential backoff (500ms, 1000ms)
- on_span_error is called only after all retries fail
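The policy above can be sketched as follows; the SDK's internals are not published, so this is only an illustration of the described behavior:

```python
import time

def submit_with_retry(send, span, on_span_error=None, backoffs=(0.5, 1.0)):
    """Try once, then retry len(backoffs) times with exponential backoff."""
    for attempt in range(len(backoffs) + 1):
        try:
            return send(span)
        except Exception as exc:
            if attempt < len(backoffs):
                time.sleep(backoffs[attempt])       # 500 ms, then 1000 ms
            elif on_span_error is not None:
                on_span_error(span, exc)            # only after all retries fail

errors = []

def always_fail(span):
    raise RuntimeError("network down")

submit_with_retry(always_fail, {"span_id": "s1"},
                  on_span_error=lambda span, exc: errors.append(span["span_id"]),
                  backoffs=(0, 0))                  # zero delays for the demo
```

Because the callback fires only on the final failure, transient network blips never reach your error handler.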
Disabling Tracing
Disable tracing for specific requests. This is useful for:
- Development and testing
- Sensitive data that shouldn’t be logged
- High-volume, low-value requests
- Reducing costs on non-critical paths
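A sketch of the per-request switch (the `trace` parameter name is an assumption, not the SDK's documented API):

```python
submitted = []   # stand-in for the background submission queue

def run_local(prompt: str, *, trace: bool = True) -> str:
    result = f"response to {prompt}"
    if trace:
        # Real SDK: hand the span to the background thread pool here.
        submitted.append({"input": prompt, "output": result})
    return result

run_local("log me")                 # traced as usual
run_local("secret", trace=False)    # span submission skipped entirely
```

When tracing is off, the request behaves identically; only the span submission step is skipped.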
Viewing Spans
Access spans in the Tracia dashboard or via the SDK.

Span Storage
Spans include:
- Full input messages (after variable interpolation)
- Original variables (for filtering)
- Complete output text
- Token usage and latency
- LLM configuration (temperature, max_output_tokens, top_p)
Spans are stored securely and retained according to your plan’s data retention policy.

