Runs and Experiments#

Warning

Before running any code, ensure you are logged in to the Afnio backend (afnio login). See Logging in to Afnio Backend for details.

Afnio promotes an explicitly experimental approach to building AI agent architectures. The same iterative workflow used for machine-learning and deep-learning development—design, experiment, evaluate, and refine—is central to producing robust agents. Afnio is optimized for language-centric workflows, where the training set is often orders of magnitude smaller than typical deep-learning datasets; powerful optimizers and textual gradients let you learn from a few high-quality examples and rich semantic feedback rather than relying on massive numeric datasets.

This page explains how Afnio tracks experiments on the Tellurio Studio backend, how Runs are created and finished, and how logs, metrics, and artifacts are associated with an active Run.

Terminology note: “Runs” and “Experiments” are used interchangeably and refer to the same concept: a tracked execution grouping metadata, metrics, costs, and artifacts for a single experiment.


What is a Run?#

A Run represents a single tracked execution of your agent or workflow. It groups inputs, outputs, evaluation metrics, cost information (LM usage), and artifacts (checkpoints, serialized state) created while developing or training an agent.

  • Local code, remote optimization: Agent logic and forward passes execute locally, while the Afnio backend — hosted on Tellurio Studio — executes LM requests, constructs the backward/optimization graph, and runs optimizer/backpropagation. This separation enables secure, scalable textual gradient generation, centralized optimizer execution, and consolidated cost and metric tracking.

  • Active Run: An active Run is required to optimize your agent via backpropagation and to associate logs, metrics, and artifacts with a specific experiment. Without an active Run, Afnio will not create the backward graph on the server, and backpropagation-based operations will fail. See the Active Run and Backpropagation subsection below for details.


Creating a Run#

afnio.tellurio is the client module you use to interact with Tellurio Studio: login, create or retrieve Projects, create Runs, log metrics, and upload artifacts.

Runs are grouped within a Project. See Projects (Tellurio Studio) for details on creating Projects, visibility levels, and membership.

Runs can be created programmatically from Python. The te.init(...) function constructs or retrieves a Project and creates a Run that becomes the active Run for the process. A typical usage pattern looks like this:

import afnio.tellurio as te

run = te.init(
    namespace_slug="username_or_org",
    project_display_name="My Project",
    description="Prompt tuning for sentiment agent",
)

# run forward/backward logic here
print("Running experiment...")

# When you create a Run programmatically, call `run.finish()` to mark it
# COMPLETED on the server and to clear the active Run from the current process.
run.finish()

Output:

INFO     : Project with slug 'my-project' does not exist in namespace 'username_or_org'. Creating it now with RESTRICTED visibility.
INFO     : Run 'hungry_brownie_557' created successfully at: https://platform.tellurio.ai/username_or_org/projects/my-project/runs/hungry-brownie-557/
Running experiment...
INFO     : Run 'hungry_brownie_557' marked as COMPLETED.

The te.init(...) call returns a Run object and (internally) sets it as the active Run for the current process. Many higher-level Afnio utilities will automatically use that active Run to associate logs, metrics, and artifacts.

You can also use the Run as a context manager so that it is automatically finished when the block exits:

with te.init("username_or_org", "my-project") as run:
    # run forward/backward logic here
    print("Running experiment...")

# when the with-block exits, the Run is marked COMPLETED

Output:

INFO     : Project with slug 'my-project' already exists in namespace 'username_or_org'.
INFO     : Run 'focused_halloumi_666' created successfully at: https://platform.tellurio.ai/username_or_org/projects/my-project/runs/focused-halloumi-666/
Running experiment...
INFO     : Run 'focused_halloumi_666' marked as COMPLETED.

Run Lifecycle and run.finish()#

A Run progresses through a small set of lifecycle states (expressed in the API and UI). The table below combines each state description with how that state is set in practice:

| State | Description | How it's set (typical) |
| --- | --- | --- |
| RUNNING | The Run is active and accepting logs, metrics, and artifacts. | Created by te.init(...) or by entering a with te.init(...) block. |
| COMPLETED | The Run finished successfully. | Call run.finish() or exit a with te.init(...) block without an exception. |
| CRASHED | The Run terminated due to an unhandled exception during execution. | Set automatically by the context manager or the atexit exit handler when an exception occurs. |
| FAILED | A safeguard state used when a Run was left unfinished or superseded. | Set by the safeguard when a previous active Run is replaced, or in certain unrecoverable scenarios. |

Why run.finish() matters:

  • Marks the Run as COMPLETED (or another status you pass) in the Tellurio Studio UI via a server PATCH request.

  • Clears the active Run from the local process so subsequent logs/operations are not associated with an ended Run.

  • Unregisters the automatic safeguard that attempts to finish the Run on process exit.

Prefer the context-manager form (with te.init(...) as run:) to ensure Runs are cleanly finished and automatically marked CRASHED if an exception occurs.
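Conceptually, the context manager maps clean exits to COMPLETED and unhandled exceptions to CRASHED. The sketch below models that logic in plain Python; it uses hypothetical names (SketchRun, RunState) and is illustrative only, not Afnio's internal implementation:

```python
from enum import Enum


class RunState(Enum):
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    CRASHED = "CRASHED"
    FAILED = "FAILED"


class SketchRun:
    """Illustrative stand-in for a Run's lifecycle handling."""

    def __init__(self):
        # te.init(...) leaves the Run in the RUNNING state
        self.state = RunState.RUNNING

    def finish(self, state=RunState.COMPLETED):
        # run.finish() records the final state and clears the active Run
        self.state = state

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # A clean exit marks COMPLETED; an unhandled exception marks CRASHED
        self.finish(RunState.CRASHED if exc_type else RunState.COMPLETED)
        return False  # do not swallow the exception


run = SketchRun()
with run:
    pass
print(run.state.name)  # COMPLETED after a clean exit
```

This is why the context-manager form is preferred: the `__exit__` hook fires on both success and failure, so the Run's final state always reflects what actually happened.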


Active Run and Backpropagation#

The backward graph—the structure that describes how textual gradients should flow to learnable Parameters (for example, prompt pieces)—is constructed on the Afnio backend (hosted on Tellurio Studio) only when an active Run exists. In practice, call te.init(...) before any backward pass: without an active Run, the server does not build the backward graph and backpropagation-based operations fail.
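The requirement can be modeled as a simple guard: operations that build the backward graph first check for an active Run and raise otherwise. This is a conceptual sketch with hypothetical names (set_active_run, backward), not Afnio's actual internals:

```python
_active_run = None  # conceptually, te.init(...) sets this for the process


def set_active_run(run):
    """Record the active Run for the current process (illustrative)."""
    global _active_run
    _active_run = run


def backward(loss):
    # Backward-graph construction is gated on an active Run
    if _active_run is None:
        raise RuntimeError(
            "No active Run: call te.init(...) before backpropagation."
        )
    return f"backward graph for {loss!r} attached to {_active_run}"


set_active_run("hungry_brownie_557")
print(backward("eval_feedback"))
```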


Logging and Tracking#

Logs and metrics are the primary way to monitor training and evaluation. The Run object exposes a simple .log() method that records scalar values, structured metrics, and step indices. Logged values are streamed to Tellurio Studio as they are produced, and associated with the active Run, so you can visualize and compare them on the platform.

Tellurio Studio provides real‑time scalar plots, per‑Run overlays for direct comparisons, step‑wise charts, and cost‑breakdown visualizations; use the platform’s compare view to overlay runs and inspect differences interactively.

Example: Logging metrics inside a Run (context manager)

with te.init("username_or_org", "my-project") as run:

    # Log some metrics
    run.log("train_loss", 0.23, step=3)
    run.log("val_accuracy", 0.87, step=3)

Output:

INFO     : Project with slug 'my-project' already exists in namespace 'username_or_org'.
INFO     : Run 'vigilant_pho_308' created successfully at: https://platform.tellurio.ai/username_or_org/projects/my-project/runs/vigilant-pho-308/
INFO     : Logged metric 'train_loss'=0.23 for run 'vigilant_pho_308'.
INFO     : Logged metric 'val_accuracy'=0.87 for run 'vigilant_pho_308'.
INFO     : Run 'vigilant_pho_308' marked as COMPLETED.

What gets tracked:

  • Scalars and metrics: loss, accuracy, custom evaluation scores.

  • Costs: Afnio can record LM usage and cost for each call so you can monitor budget across runs.
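To make cost tracking concrete, here is an illustrative sketch of aggregating per-call LM usage into a per-Run total. The CostLedger class and its fields are hypothetical; in Afnio this accounting happens on the backend, not in your code:

```python
from dataclasses import dataclass, field


@dataclass
class CostLedger:
    """Illustrative per-Run ledger of LM usage; not part of the Afnio API."""

    calls: list = field(default_factory=list)

    def record(self, model, input_tokens, output_tokens, usd):
        # One entry per LM call; the real backend records this server-side
        self.calls.append(
            {"model": model, "in": input_tokens, "out": output_tokens, "usd": usd}
        )

    @property
    def total_usd(self):
        return sum(c["usd"] for c in self.calls)


ledger = CostLedger()
ledger.record("example-model", 1200, 350, 0.012)
ledger.record("example-model", 900, 210, 0.008)
print(f"total cost: ${ledger.total_usd:.3f}")  # total cost: $0.020
```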


Best Practices#

  • Create a Run early: Call te.init(...) before starting training or optimization so that the backward graph is generated and logs, metrics, and artifacts are associated with the correct Run.

  • Use descriptive names: Give runs meaningful name and description values to simplify later analysis.

  • Use the context manager: Prefer with te.init(...) as run: to ensure runs are cleanly finished even if your script errors.


Troubleshooting#

  • If backpropagation does not produce gradients on the server, confirm you have an active Run set (use te.init(...)) and that requires_grad=True is set for the Parameters you expect to optimize.

  • If logs do not appear in Tellurio Studio, check network connectivity and make sure your session credentials and consent (API key sharing) are intact.


Further reading#