`afnio.cognitive.modules.exact_match_evaluator`

`afnio.cognitive.modules.exact_match_evaluator.ExactMatchEvaluator`

Bases: Module

Evaluates predictions using an exact match criterion.

This module wraps the `ExactMatchEvaluator` operation from `afnio.autodiff.evaluator`. It is a specialized version of the `DeterministicEvaluator` that compares the `prediction` and `target` with an exact matching function. It returns an evaluation `score` (`1` for an exact match, `0` otherwise) and an `explanation` describing the evaluation result.

Examples:

>>> import afnio
>>> from afnio import cognitive as cog
>>> from afnio import set_backward_model_client
>>> set_backward_model_client("openai/gpt-4o")
>>> class ExactColor(cog.Module):
...     def __init__(self):
...         super().__init__()
...         self.exact_match = cog.ExactMatchEvaluator()
...     def forward(self, prediction, target):
...         return self.exact_match(prediction, target)
>>> prediction = afnio.Variable(
...     data="green",
...     role="color prediction",
...     requires_grad=True
... )
>>> target = "red"
>>> evaluator = ExactColor()
>>> score, explanation = evaluator(prediction, target)
>>> print(score.data)
0
>>> print(explanation.data)
The evaluation function, designed for 'exact match', compared the <DATA> fields of the predicted variable and the target variable, resulting in a score: 0.
>>> explanation.backward()
>>> prediction.grad[0].data
"Reassess the criteria that led to the initial prediction of 'green'."

Raises:

| Type | Description |
| --- | --- |
| `TypeError` | If the types of `prediction`, `target`, `reduction_fn`, or `reduction_fn_purpose` are not as expected. |
| `ValueError` | If the lengths of `prediction.data` and `target` (or `target.data`, when `target` is a `Variable`) do not match when both are lists, or if `reduction_fn_purpose` (or `reduction_fn_purpose.data`) is an empty string. |

See Also

afnio.autodiff.evaluator.ExactMatchEvaluator for the underlying operation.

Source code in afnio/cognitive/modules/exact_match_evaluator.py

class ExactMatchEvaluator(Module):
    """
    Evaluates predictions using an exact match criterion.

    This module leverages the [`ExactMatchEvaluator`][afnio.autodiff.evaluator.ExactMatchEvaluator]
    operation from `afnio.autodiff.evaluator` and is a specialized version of the
    [`DeterministicEvaluator`][afnio.cognitive.modules.deterministic_evaluator.DeterministicEvaluator]
    that uses an exact matching function to compare the `prediction` and `target`.
    It returns an evaluation `score` (`1` for exact match, `0` otherwise)
    and an `explanation` describing the evaluation result.

    Examples:
        >>> import afnio
        >>> from afnio import cognitive as cog
        >>> from afnio import set_backward_model_client
        >>> set_backward_model_client("openai/gpt-4o")
        >>> class ExactColor(cog.Module):
        ...     def __init__(self):
        ...         super().__init__()
        ...         self.exact_match = cog.ExactMatchEvaluator()
        ...     def forward(self, prediction, target):
        ...         return self.exact_match(prediction, target)
        >>> prediction = afnio.Variable(
        ...     data="green",
        ...     role="color prediction",
        ...     requires_grad=True
        ... )
        >>> target = "red"
        >>> evaluator = ExactColor()
        >>> score, explanation = evaluator(prediction, target)
        >>> print(score.data)
        0
        >>> print(explanation.data)
        The evaluation function, designed for 'exact match', compared the <DATA> fields of the predicted variable and the target variable, resulting in a score: 0.
        >>> explanation.backward()
        >>> prediction.grad[0].data
        "Reassess the criteria that led to the initial prediction of 'green'."

    Raises:
        TypeError: If the types of `prediction`, `target`, `reduction_fn`, or
            `reduction_fn_purpose` are not as expected.
        ValueError: If the lengths of `prediction.data` and `target` (or `target.data`,
            when `target` is a `Variable`) do not match when both are lists, or if
            `reduction_fn_purpose` (or `reduction_fn_purpose.data`) is an empty string.

    See Also:
        [`afnio.autodiff.evaluator.ExactMatchEvaluator`][afnio.autodiff.evaluator.ExactMatchEvaluator]
        for the underlying operation.
    """  # noqa: E501

    reduction_fn: Optional[Callable[[List[Any]], Any]]
    reduction_fn_purpose: Optional[Union[str, Variable]]

    def __init__(self):
        super().__init__()

        self.register_function("reduction_fn", None)
        self.register_buffer("reduction_fn_purpose", None)

    def forward(
        self,
        prediction: Variable,
        target: Union[str, List[str], Variable],
        reduction_fn: Optional[Callable[[List[Any]], Any]] = sum,
        reduction_fn_purpose: Optional[Union[str, Variable]] = "summation",
    ) -> Tuple[Variable, Variable]:
        """
        Forward pass for the exact match evaluator function.

        Warning:
            Users should not call this method directly. Instead, they should call the
            module instance itself, which will internally invoke this `forward` method.

        Args:
            prediction: The predicted variable to evaluate, which can have scalar or
                list [`data`][afnio.Variable.data] (supporting both individual and
                batch processing).
            target: The target (ground truth) to compare against, which can be a string,
                a list of strings, or a `Variable`.
            reduction_fn: An optional function to aggregate scores across a batch of
                predictions and targets. If `None`, no aggregation is applied.
            reduction_fn_purpose: A brief description of the purpose of `reduction_fn`,
                used by the autodiff engine to generate explanations. Required if
                `reduction_fn` is provided.

        Returns:
            score: A variable containing the evaluation score(s),
                or their aggregation if `reduction_fn` is provided.
            explanation: A variable containing the explanation(s) of the evaluation,
                or their aggregation if `reduction_fn` is provided.

        Raises:
            TypeError: If the types of `prediction`, `target`, `reduction_fn`,
                or `reduction_fn_purpose` are not as expected.
            ValueError: If the lengths of `prediction.data` and `target` (or
                `target.data`, when `target` is a `Variable`) do not match when
                both are lists, or if `reduction_fn_purpose` (or
                `reduction_fn_purpose.data`) is an empty string.
        """
        self.reduction_fn = reduction_fn
        self.reduction_fn_purpose = (
            None
            if reduction_fn_purpose is None
            else (
                reduction_fn_purpose
                if isinstance(reduction_fn_purpose, Variable)
                else Variable(reduction_fn_purpose)
            )
        )
        return ExactMatchEvaluatorOp.apply(
            prediction, target, self.reduction_fn, self.reduction_fn_purpose
        )

`forward(prediction, target, reduction_fn=sum, reduction_fn_purpose='summation')`

Forward pass for the exact match evaluator function.

Warning

Users should not call this method directly. Instead, call the module instance itself, which internally invokes this `forward` method.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `prediction` | `Variable` | The predicted variable to evaluate, which can have scalar or list `data` (supporting both individual and batch processing). | required |
| `target` | `str \| list[str] \| Variable` | The target (ground truth) to compare against, which can be a string, a list of strings, or a `Variable`. | required |
| `reduction_fn` | `Callable[[List[Any]], Any] \| None` | An optional function to aggregate scores across a batch of predictions and targets. If `None`, no aggregation is applied. | `sum` |
| `reduction_fn_purpose` | `str \| Variable \| None` | A brief description of the purpose of `reduction_fn`, used by the autodiff engine to generate explanations. Required if `reduction_fn` is provided. | `'summation'` |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `score` | `Variable` | A variable containing the evaluation score(s), or their aggregation if `reduction_fn` is provided. |
| `explanation` | `Variable` | A variable containing the explanation(s) of the evaluation, or their aggregation if `reduction_fn` is provided. |
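
The effect of `reduction_fn` on the returned score can be illustrated with a plain-Python sketch. The helper `exact_match_scores` below is hypothetical and is not part of afnio. It only mirrors the documented scoring rule (`1` for an exact match, `0` otherwise), assumes two lists of equal length, and ignores explanations and `Variable` wrapping.

```python
from typing import Any, Callable, List, Optional


def exact_match_scores(
    predictions: List[str],
    targets: List[str],
    reduction_fn: Optional[Callable[[List[Any]], Any]] = sum,
) -> Any:
    """Hypothetical stand-in for the scoring step; not afnio's implementation."""
    # One 0/1 score per (prediction, target) pair.
    scores = [1 if p == t else 0 for p, t in zip(predictions, targets)]
    # With no reduction function the per-item scores are returned unchanged;
    # otherwise the scores are aggregated into a single value.
    return scores if reduction_fn is None else reduction_fn(scores)


preds = ["green", "red", "blue"]
targets = ["green", "red", "red"]

per_item = exact_match_scores(preds, targets, reduction_fn=None)
total = exact_match_scores(preds, targets)  # default reduction: sum
accuracy = exact_match_scores(
    preds, targets, reduction_fn=lambda s: sum(s) / len(s)
)
```

Here `per_item` holds one score per pair, `total` counts the matches, and `accuracy` shows a custom reduction. When a custom `reduction_fn` is passed to the real module, the parameter table above indicates that a matching `reduction_fn_purpose` (for example `"average"`) should be passed with it.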

Raises:

| Type | Description |
| --- | --- |
| `TypeError` | If the types of `prediction`, `target`, `reduction_fn`, or `reduction_fn_purpose` are not as expected. |
| `ValueError` | If the lengths of `prediction.data` and `target` (or `target.data`, when `target` is a `Variable`) do not match when both are lists, or if `reduction_fn_purpose` (or `reduction_fn_purpose.data`) is an empty string. |
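
The error conditions listed above can be sketched as plain-Python checks. `check_inputs` and `outcome` are hypothetical helpers, not afnio functions. The callable check on `reduction_fn` is an assumption about what "not as expected" means, and the sketch works on raw data rather than on `Variable` objects.

```python
from typing import Any, Callable, List, Optional, Union


def check_inputs(
    prediction_data: Union[str, List[str]],
    target: Union[str, List[str]],
    reduction_fn: Optional[Callable[[List[Any]], Any]],
    reduction_fn_purpose: Optional[str],
) -> None:
    """Hypothetical validation mirroring the documented error conditions."""
    # Assumption: a non-None reduction function must be callable.
    if reduction_fn is not None and not callable(reduction_fn):
        raise TypeError("reduction_fn must be callable or None")
    # Documented: list inputs must have the same length.
    if isinstance(prediction_data, list) and isinstance(target, list):
        if len(prediction_data) != len(target):
            raise ValueError("prediction and target lengths differ")
    # Documented: the purpose description must not be an empty string.
    if reduction_fn_purpose == "":
        raise ValueError("reduction_fn_purpose must not be empty")


def outcome(*args: Any) -> str:
    """Return 'ok' or the name of the exception raised by check_inputs."""
    try:
        check_inputs(*args)
        return "ok"
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


results = [
    outcome(["a", "b"], ["a", "b"], sum, "summation"),  # valid batch
    outcome(["a", "b"], ["a"], sum, "summation"),       # length mismatch
    outcome("a", "a", sum, ""),                         # empty purpose
    outcome("a", "a", "not callable", "summation"),     # wrong type
]
```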

Source code in afnio/cognitive/modules/exact_match_evaluator.py

def forward(
    self,
    prediction: Variable,
    target: Union[str, List[str], Variable],
    reduction_fn: Optional[Callable[[List[Any]], Any]] = sum,
    reduction_fn_purpose: Optional[Union[str, Variable]] = "summation",
) -> Tuple[Variable, Variable]:
    """
    Forward pass for the exact match evaluator function.

    Warning:
        Users should not call this method directly. Instead, they should call the
        module instance itself, which will internally invoke this `forward` method.

    Args:
        prediction: The predicted variable to evaluate, which can have scalar or
            list [`data`][afnio.Variable.data] (supporting both individual and
            batch processing).
        target: The target (ground truth) to compare against, which can be a string,
            a list of strings, or a `Variable`.
        reduction_fn: An optional function to aggregate scores across a batch of
            predictions and targets. If `None`, no aggregation is applied.
        reduction_fn_purpose: A brief description of the purpose of `reduction_fn`,
            used by the autodiff engine to generate explanations. Required if
            `reduction_fn` is provided.

    Returns:
        score: A variable containing the evaluation score(s),
            or their aggregation if `reduction_fn` is provided.
        explanation: A variable containing the explanation(s) of the evaluation,
            or their aggregation if `reduction_fn` is provided.

    Raises:
        TypeError: If the types of `prediction`, `target`, `reduction_fn`,
            or `reduction_fn_purpose` are not as expected.
        ValueError: If the lengths of `prediction.data` and `target` (or
            `target.data`, when `target` is a `Variable`) do not match when
            both are lists, or if `reduction_fn_purpose` (or
            `reduction_fn_purpose.data`) is an empty string.
    """
    self.reduction_fn = reduction_fn
    self.reduction_fn_purpose = (
        None
        if reduction_fn_purpose is None
        else (
            reduction_fn_purpose
            if isinstance(reduction_fn_purpose, Variable)
            else Variable(reduction_fn_purpose)
        )
    )
    return ExactMatchEvaluatorOp.apply(
        prediction, target, self.reduction_fn, self.reduction_fn_purpose
    )
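
The nested conditional in `forward` that prepares `reduction_fn_purpose` can be read as a three-way normalization: `None` stays `None`, an existing `Variable` is passed through, and anything else is wrapped in a new `Variable`. The sketch below restates that logic with early returns. The `Variable` dataclass is a minimal stand-in defined only for this example (the real `afnio.Variable` has more fields), and `normalize_purpose` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Variable:
    """Minimal stand-in for afnio.Variable, used only in this sketch."""

    data: str


def normalize_purpose(
    purpose: Optional[Union[str, Variable]],
) -> Optional[Variable]:
    """Equivalent of the nested conditional expression in `forward`."""
    if purpose is None:
        return None  # no description supplied
    if isinstance(purpose, Variable):
        return purpose  # already wrapped, reuse the same object
    return Variable(purpose)  # wrap a plain string


wrapped = normalize_purpose("summation")
existing = Variable("average")
passed_through = normalize_purpose(existing)
missing = normalize_purpose(None)
```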