afnio.cognitive.modules.deterministic_evaluator
afnio.cognitive.modules.deterministic_evaluator.DeterministicEvaluator
Bases: Module
Evaluates predictions deterministically using a user-defined evaluation function.
This module uses the DeterministicEvaluator operation from
afnio.autodiff.evaluator to compute evaluation scores and explanations.
The forward method takes a prediction, a target, an evaluation function
(eval_fn), and a description of its purpose (eval_fn_purpose). It also accepts
a success function (success_fn), which checks whether all evaluations are
successful so that the backward pass can skip unnecessary gradient
computations, and a reduction function (reduction_fn) with its purpose
description (reduction_fn_purpose) to aggregate scores if needed. The method
outputs an evaluation score and an explanation, both as Variable instances.
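The flow described above (score each pair, check for success, optionally aggregate) can be sketched in plain Python. This is only an illustration under stated assumptions, not afnio's implementation: `evaluate_batch` is a hypothetical helper, it works on plain strings rather than Variable instances, it produces no explanation, and the length check is an assumed behavior.

```python
from typing import Any, Callable, List, Optional, Tuple


def evaluate_batch(
    predictions: List[str],
    targets: List[str],
    eval_fn: Callable[[str, str], Any],
    success_fn: Optional[Callable[[List[Any]], bool]] = None,
    reduction_fn: Optional[Callable[[List[Any]], Any]] = None,
) -> Tuple[Any, bool]:
    """Hypothetical helper: score each pair, then optionally aggregate."""
    # Assumed check: predictions and targets must line up one-to-one.
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    # One score per (prediction, target) pair.
    scores = [eval_fn(p, t) for p, t in zip(predictions, targets)]
    # When no success_fn is given, treat the batch as not fully successful.
    all_ok = success_fn(scores) if success_fn is not None else False
    # Without a reduction_fn, return the per-sample scores unchanged.
    result = reduction_fn(scores) if reduction_fn is not None else scores
    return result, all_ok


def exact_match(pred: str, tgt: str) -> int:
    return 1 if pred == tgt else 0


score, all_ok = evaluate_batch(
    ["the color is green", "blue"],
    ["green", "blue"],
    exact_match,
    success_fn=lambda s: all(x == 1 for x in s),
    reduction_fn=sum,
)
print(score, all_ok)  # 1 False
```

Only the second pair matches exactly, so the summed score is 1 and the batch is not fully successful, which is the case where a backward pass still has something to compute.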
Examples:
>>> import afnio
>>> from afnio import cognitive as cog
>>> from afnio import set_backward_model_client
>>> set_backward_model_client("openai/gpt-4o")
>>> class ExactColor(cog.Module):
...     def __init__(self):
...         super().__init__()
...         def exact_match_fn(pred: str, tgt: str) -> int:
...             return 1 if pred == tgt else 0
...         self.exact_match_fn = exact_match_fn
...         self.fn_purpose = "exact match"
...         # success_fn may be None according to the forward signature
...         self.success_fn = None
...         self.reduction_fn = sum
...         self.reduction_fn_purpose = "summation"
...         self.exact_match = cog.DeterministicEvaluator()
...     def forward(self, prediction, target):
...         return self.exact_match(
...             prediction,
...             target,
...             self.exact_match_fn,
...             self.fn_purpose,
...             self.success_fn,
...             self.reduction_fn,
...             self.reduction_fn_purpose,
...         )
>>> prediction = afnio.Variable(
...     data=["the color is green", "blue"],
...     role="color prediction",
...     requires_grad=True
... )
>>> target = ["green", "blue"]
>>> eval = ExactColor()
>>> score, explanation = eval(prediction, target)
>>> print(score.data)
1
>>> print(explanation.data)
'The evaluation function, designed for 'exact match', compared the <DATA> fields of the predicted variable and the target variable across all samples in the batch, generating individual scores for each pair. These scores were then aggregated using the reduction function 'summation', resulting in a final aggregated score: 1.'
>>> explanation.backward()
>>> prediction.grad[0].data
'Reassess the criteria that led to the initial prediction of 'green'.'
Raises:
| Type | Description |
|---|---|
| TypeError | If the types of … |
| ValueError | If the lengths of … |
See Also
afnio.autodiff.evaluator.DeterministicEvaluator
for the underlying operation.
Source code in afnio/cognitive/modules/deterministic_evaluator.py
forward(prediction, target, eval_fn, eval_fn_purpose, success_fn, reduction_fn, reduction_fn_purpose)
Forward pass for the deterministic evaluator function.
Warning
Users should not call this method directly. Instead, they should call the
module instance itself, which will internally invoke this forward method.
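The reason for this warning can be illustrated with a generic sketch of the call-then-forward pattern. `MiniModule` and `Echo` are hypothetical stand-ins written for this illustration; they are not afnio's `Module`, and the bookkeeping shown is an assumption about what a `__call__` wrapper might do.

```python
class MiniModule:
    """Hypothetical stand-in for a module base class (not afnio's Module)."""

    def __init__(self):
        self.call_count = 0

    def __call__(self, *args, **kwargs):
        # Bookkeeping placed here is skipped by a direct forward() call.
        self.call_count += 1
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError


class Echo(MiniModule):
    def forward(self, value):
        return value


echo = Echo()
print(echo("hi"), echo.call_count)          # hi 1
print(echo.forward("hi"), echo.call_count)  # hi 1 (bookkeeping skipped)
```

Calling the instance runs whatever the base class wraps around `forward`; calling `forward` directly returns the same value but bypasses that wrapper.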
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prediction | Variable | The predicted variable to evaluate, which can have scalar or list … | required |
| target | str \| list[str] \| Variable | The target (ground truth) to compare against, which can be a string, a list of strings, or a … | required |
| eval_fn | Callable[[Variable, Union[str, Variable]], list[Any]] | | required |
| eval_fn_purpose | str \| Variable | A brief description of the purpose of … | required |
| success_fn | Callable[[List[Any]], bool] \| None | A user-defined function that takes the list of scores returned by … | required |
| reduction_fn | Callable[[List[Any]], Any] \| None | An optional function to aggregate scores across a batch of predictions and targets. If … | required |
| reduction_fn_purpose | str \| Variable \| None | A brief description of the purpose of … | required |
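As a rough illustration of callables that fit the documented `success_fn` and `reduction_fn` type hints, the sketch below defines two hypothetical helpers, `all_correct` and `accuracy`. It assumes per-sample scores of 0 or 1; neither name comes from afnio.

```python
from statistics import mean
from typing import Any, List


def all_correct(scores: List[Any]) -> bool:
    """Candidate success_fn: True only when every sample scored 1."""
    return all(score == 1 for score in scores)


def accuracy(scores: List[Any]) -> float:
    """Candidate reduction_fn: mean of the 0/1 scores."""
    return mean(scores)


scores = [0, 1, 1, 1]
print(all_correct(scores))  # False
print(accuracy(scores))     # 0.75
```

A purpose string such as "accuracy" would then be a natural value for `reduction_fn_purpose`.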
Returns:
| Name | Type | Description |
|---|---|---|
| score | Variable | A variable containing the evaluation score(s), or their aggregation if … |
| explanation | Variable | A variable containing the explanation(s) of the evaluation, or their aggregation if … |
Raises:
| Type | Description |
|---|---|
| TypeError | If the types of … |
| ValueError | If the lengths of … |
Source code in afnio/cognitive/modules/deterministic_evaluator.py