# Save, Load and Use Agent

```{warning}
Before running any code, ensure you are logged in to the Afnio backend (`afnio login`). See [Logging in to Afnio Backend](login) for details.
```

Afnio agents (subclasses of `cog.Module`) expose `state_dict()` / `load_state_dict(...)` to persist trained parameters and minimal metadata. Saving and restoring agent state enables reproducible evaluation, resumable training, safe deployment, and sharing of parameters without serializing full agent objects.

---

## Prerequisite Code

Create an LM client and instantiate the example agent below.

```python
import os

import afnio
import afnio.utils.agents as agents
from afnio.models.openai import AsyncOpenAI

os.environ["OPENAI_API_KEY"] = "sk-..."  # Replace with your actual key

fwd_model = AsyncOpenAI()
agent = agents.SentimentAnalyzer()

response = agent(
    fwd_model,
    inputs={"message": "I've been a satisfied client of ProCare for a year"},
    model="gpt-4.1-nano",
    temperature=0.0,
)
print(response.data)
```

_Output:_

```output
{"sentiment":"positive"}
```

---

## Saving and Loading Agent Parameters

Afnio agents store their learned parameters in an internal state dictionary, called `state_dict`. These can be persisted via the `afnio.save` method:

```python
path = "sentiment_analyzer.hf"
afnio.save(agent.state_dict(), path)
print(f"Saved agent state: {path} (exists={os.path.exists(path)})")
```

_Output:_

```output
Saved agent state: sentiment_analyzer.hf (exists=True)
```

To load agent parameters, first create an instance of the same agent, then load the parameters using the `load_state_dict()` method.

```python
new_agent = agents.SentimentAnalyzer()
new_agent.load_state_dict(
    afnio.load(path),
    model_clients={"sentiment_classifier.forward_model_client": AsyncOpenAI()},
)
new_agent.eval()
print(new_agent)
```

_Output:_

```output
SentimentAnalyzer(
  (sentiment_classifier): ChatCompletion()
)
```

```{note}
Be sure to call the `new_agent.eval()` method before running inference to set the relevant layers to evaluation mode. Failing to do this might yield inconsistent inference results.
```

---

## Saving Checkpoints

If you are writing your own logic to store checkpoints instead of using the [`Trainer`](trainer.md), the recommended pattern is to save a lightweight checkpoint dictionary containing the agent's state dict and any metadata you need (epoch, optimizer state, validation metrics).

Save only serializable pieces (agent state, optimizer state, metadata):

```python
# Create forward LM client and example agent
fwd_model = AsyncOpenAI()
agent = agents.SentimentAnalyzer()

# Define optimizer (only to show it can be included in the checkpoint)
# and run a single step so optimizer.state_dict() is populated
optimizer = afnio.optim.TGD(
    agent.parameters(),
    model_client=AsyncOpenAI(),
    momentum=3,
    model="gpt-5",
    temperature=1.0,
    max_completion_tokens=32000,
    reasoning_effort="low",
)
optimizer.step()

# Compose a checkpoint that bundles agent state and optimizer state for resuming training
checkpoint = {
    "epoch": 2,
    "batch": 3,
    "agent_state_dict": agent.state_dict(keep_vars=True),
    "optimizer_state_dict": optimizer.state_dict(),
    "val_accuracy": 0.92,
}

os.makedirs("checkpoints", exist_ok=True)  # Ensure the target directory exists
afnio.save(checkpoint, "checkpoints/checkpoint_epoch2.hf")
```

Notes:

- Use `.hf` (or any extension); Afnio uses a zipped pickle format via `afnio.save` / `afnio.load`.
- Keep checkpoints small by saving `state_dict()` instead of the full agent object.

```{note}
The `.hf` extension is a naming convention inspired by the chemical symbol for [Hafnium (Hf)](https://en.wikipedia.org/wiki/Hafnium). "Afnio" is the Italian word for Hafnium.
```
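To resume training later, load the checkpoint dictionary back and restore each piece. The sketch below assumes the checkpoint file written above, and it also assumes the optimizer exposes a `load_state_dict()` method mirroring the agent's (not shown elsewhere in this guide); adjust the `model_clients` mapping and optimizer arguments to match your setup.

```python
# Minimal resume sketch: restore agent and optimizer state from the checkpoint above.
# Assumption: `optimizer.load_state_dict()` exists and mirrors the agent-side API.
checkpoint = afnio.load("checkpoints/checkpoint_epoch2.hf")

agent = agents.SentimentAnalyzer()
agent.load_state_dict(
    checkpoint["agent_state_dict"],
    model_clients={"sentiment_classifier.forward_model_client": AsyncOpenAI()},
)

optimizer = afnio.optim.TGD(
    agent.parameters(),
    model_client=AsyncOpenAI(),
    momentum=3,
    model="gpt-5",
)
optimizer.load_state_dict(checkpoint["optimizer_state_dict"])  # assumed API

# Pick up the training counters and metadata saved alongside the state dicts
start_epoch = checkpoint["epoch"] + 1
print(f"Resuming at epoch {start_epoch} (last val_accuracy={checkpoint['val_accuracy']})")
```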
"Afnio" is the Italian word for Hafnium. ``` --- ## Saving/Loading to a Buffer You can save to and load from an in-memory buffer — useful for unit tests, CI, IPC/network transfer, or avoiding filesystem I/O when moving checkpoints. ```python import io buf = io.BytesIO() afnio.save(checkpoint, buf) buf.seek(0) ck = afnio.load(buf) ``` --- ## Troubleshooting - Missing model clients: `load_state_dict` will raise if required model clients are not provided. Pass a matching `model_clients` mapping. - To resume training reliably, save and later restore the optimizer state (momentum / per-parameter buffers), any forward/backward/optimizer LM client bindings, and training counters (epoch, global step/batch). --- ## Further Reading - [Trainer](trainer) - [Runs and Experiments](runs_and_experiments)