Global effect API

Summary

All global effect methods have a similar interface and workflow:

- Create an instance of the global effect method you want to use.
- (Optional) Call `.fit()` to customize the method.
- Call `.plot()` to plot the global effect of a feature.
- Call `.eval()` to evaluate the global effect of a feature at a specific grid of points.

Usage

# set up the input
X = ... # input data
predict = ... # model to be explained
jacobian = ... # jacobian of the model
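
For a concrete starting point, the placeholders above could be filled in as follows. This is a minimal sketch with a hypothetical toy model: `predict`, `jacobian`, and the synthetic `X` are illustrative names and values, not part of effector. It only demonstrates the expected shapes, where `predict` maps `(N, D)` to `(N,)` and `jacobian` maps `(N, D)` to `(N, D)`.

```python
import numpy as np

# Synthetic data: N = 1000 instances, D = 3 features (illustrative values only)
rng = np.random.default_rng(seed=0)
X = rng.uniform(-1.0, 1.0, size=(1000, 3))


def predict(x: np.ndarray) -> np.ndarray:
    """Toy model f(x) = x_0^2 + x_0 * x_1 + 3 * x_2; input (N, D), output (N,)."""
    return x[:, 0] ** 2 + x[:, 0] * x[:, 1] + 3.0 * x[:, 2]


def jacobian(x: np.ndarray) -> np.ndarray:
    """Analytical Jacobian of the toy model; input (N, D), output (N, D)."""
    jac = np.empty_like(x)
    jac[:, 0] = 2.0 * x[:, 0] + x[:, 1]  # df/dx_0
    jac[:, 1] = x[:, 0]                  # df/dx_1
    jac[:, 2] = 3.0                      # df/dx_2
    return jac
```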

Create an instance of the global effect method you want to use:


g_method = effector.PDP(data=X, model=predict)

g_method = effector.RHALE(data=X, model=predict, model_jac=jacobian)

g_method = effector.ShapDP(data=X, model=predict)

g_method = effector.ALE(data=X, model=predict)

g_method = effector.DerPDP(data=X, model=predict, model_jac=jacobian)

Customize the global effect method (optional):

.fit(features, **method_specific_args)

This is the place for customization

The .fit() step can be omitted if the default settings are acceptable; in that case, call .plot() or .eval() directly. For more control over the fitting process, pass additional arguments to .fit(). See the Usage section below and the method-specific documentation for details.

Usage

# customize the space partitioning algorithm
axis_partitioner = effector.axis_partitioning.Greedy(
    init_nof_bins = 50, # start from 50 bins (default: 20)
    min_points_per_bin = 10, # minimum number of points per bin (default: 2)
    cat_limit = 20, # maximum number of categories for a feature to be considered categorical (default: 10)
)
g_method.fit(
    features=[0, 1], # list of features to be analyzed
    axis_partitioner=axis_partitioner, # custom axis partitioner
)

Plot the global effect of a feature:

.plot(feature)
Usage
```
feature = ...
g_method.plot(feature, **plot_specific_args)
```
Output

[Figures: example plots for PDP, RHALE, ShapDP, ALE, and DerPDP]

Evaluate the global effect of a feature at a specific grid of points:

.eval(feature, xs)

Usage

# Example input
feature = ... # feature to be analyzed
xs = ... # grid of points to evaluate the global effect, e.g., np.linspace(0, 1, 100)

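# heterogeneity=True is needed to get the (y, het) tuple; by default only y is returned
y, het = g_method.eval(feature, xs, heterogeneity=True)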

API

`effector.global_effect_ale.ALE(data, model, nof_instances=10000, axis_limits=None, feature_names=None, target_name=None)`

Bases: ALEBase

Constructor for the ALE plot.

Definition

ALE is defined as: $$ \hat{f}^{ALE}(x_s) = TODO $$

The heterogeneity is: $$ TODO $$

The std of the bin-effects is: $$ TODO $$

Notes

The required parameters are `data` and `model`. The rest are optional.
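
The definition above is still a placeholder in the source. As an illustration only, the following sketch implements the textbook first-order ALE estimator with equal-width bins: average the model's finite difference across each bin, then accumulate. The function name `ale_fixed_bins` and its handling of empty bins are assumptions made here; effector's own implementation may differ in binning, centering, and heterogeneity.

```python
import numpy as np


def ale_fixed_bins(X, model, feature, nof_bins=20):
    """Return (bin_edges, uncentered ALE values at the bin edges)."""
    xs = X[:, feature]
    edges = np.linspace(xs.min(), xs.max(), nof_bins + 1)
    # Bin index in [0, nof_bins - 1] for every instance
    idx = np.clip(np.digitize(xs, edges[1:-1]), 0, nof_bins - 1)

    bin_effects = np.zeros(nof_bins)
    for k in range(nof_bins):
        in_bin = idx == k
        if not in_bin.any():
            continue  # assumption: an empty bin contributes zero local effect
        lo, hi = X[in_bin].copy(), X[in_bin].copy()
        lo[:, feature] = edges[k]       # move instances to the left bin edge
        hi[:, feature] = edges[k + 1]   # move instances to the right bin edge
        bin_effects[k] = np.mean(model(hi) - model(lo))

    # Accumulate the local effects, starting from zero at the first edge
    ale = np.concatenate([[0.0], np.cumsum(bin_effects)])
    return edges, ale
```

Values between bin edges can be obtained with `np.interp(x, edges, ale)`; subtracting a normalization constant afterwards gives a centered curve.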

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	the design matrix, with shape `(N,D)`	required
`model`	`callable`	the black-box model. Must be a `Callable` with input: `ndarray` of shape `(N, D)`; output: `ndarray` of shape `(N, )`	required
`nof_instances`	`Union[int, str]`	the number of instances to use for the explanation: use an `int` to specify the number of instances; use `"all"` to use all the instances	`10000`
`axis_limits`	`Optional[ndarray]`	the limits of the feature effect plot along each axis: use a `ndarray` of shape `(2, D)` to specify them manually; use `None` to infer them from the data	`None`
`feature_names`	`Optional[List]`	the names of the features: use a `list` of `str` to specify the names manually, for example `["age", "weight", ...]`; use `None` to keep the default names `["x_0", "x_1", ...]`	`None`
`target_name`	`Optional[str]`	the name of the target variable: use a `str` to specify its name manually, for example `"price"`; use `None` to keep the default name `"y"`	`None`

Methods:

Name	Description
`fit`	Fit the ALE plot.
`eval`	Evaluate the (RH)ALE feature effect of feature `feature` at points `xs`.
`plot`	Plot the (RH)ALE feature effect of feature `feature`.

Source code in effector/global_effect_ale.py

def __init__(
    self,
    data: np.ndarray,
    model: callable,
    nof_instances: Union[int, str] = 10_000,
    axis_limits: Optional[np.ndarray] = None,
    feature_names: Optional[List] = None,
    target_name: Optional[str] = None,
):
    """
    Constructor for the ALE plot.

    Definition:
        ALE is defined as:
        $$
        \hat{f}^{ALE}(x_s) = TODO
        $$

        The heterogeneity is:
        $$
        TODO
        $$

        The std of the bin-effects is:
        $$
        TODO
        $$

    Notes:
        - The required parameters are `data` and `model`. The rest are optional.

    Args:
        data: the design matrix

            - shape: `(N,D)`
        model: the black-box model. Must be a `Callable` with:

            - input: `ndarray` of shape `(N, D)`
            - output: `ndarray` of shape `(N, )`

        nof_instances: the number of instances to use for the explanation

            - use an `int`, to specify the number of instances
            - use `"all"`, to use all the instances

        axis_limits: The limits of the feature effect plot along each axis

            - use a `ndarray` of shape `(2, D)`, to specify them manually
            - use `None`, to be inferred from the data

        feature_names: The names of the features

            - use a `list` of `str`, to specify the name manually. For example: `["age", "weight", ...]`
            - use `None`, to keep the default names: `["x_0", "x_1", ...]`

        target_name: The name of the target variable

            - use a `str`, to specify its name manually. For example: `"price"`
            - use `None`, to keep the default name: `"y"`
    """
    self.bin_limits = {}
    self.data_effect_ale = {}
    super(ALE, self).__init__(
        data,
        model,
        None,
        None,
        nof_instances,
        axis_limits,
        feature_names,
        target_name,
        "ALE",
    )

`fit(features='all', binning_method='fixed', centering=True, points_for_centering=30)`

Fit the ALE plot.

Parameters:

Name	Type	Description	Default
`features`	`Union[int, str, list]`	the features to fit. If set to "all", all the features will be fitted.	`'all'`
`binning_method`	`Union[str, Fixed]`	If set to `"fixed"`, the ALE plot is computed with the default values: `20` bins with at least `10` points per bin, and a feature is treated as categorical if it has fewer than `15` unique values. To change these parameters, pass an instance of the class `effector.binning_methods.Fixed` with the desired values, for example `Fixed(nof_bins=20, min_points_per_bin=0, cat_limit=10)`	`'fixed'`
`centering`	`Union[bool, str]`	whether to compute the normalization constant for centering the plot: `False` means no centering; `True` or `zero_integral` centers around the `y` axis; `zero_start` starts the plot from `y=0`	`True`
`points_for_centering`	`int`	the number of points to use for centering the plot. Default is 30.	`30`
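
To make the centering options more tangible, here is a small sketch of one plausible reading of them. It assumes that `zero_integral` subtracts the mean of the curve over an equally spaced grid of `points_for_centering` points and that `zero_start` subtracts the value at the lower axis limit. The helper `center_curve` is hypothetical and is not part of effector, whose internals may differ.

```python
import numpy as np


def center_curve(effect, x_min, x_max, centering="zero_integral", points_for_centering=30):
    """Return a centered version of the callable `effect` (hypothetical helper)."""
    if centering is False:
        return effect  # no centering requested
    if centering is True or centering == "zero_integral":
        # Assumed: subtract the average of the curve over an equally spaced grid
        grid = np.linspace(x_min, x_max, points_for_centering)
        norm_const = np.mean(effect(grid))
    elif centering == "zero_start":
        # Assumed: subtract the value at the start of the axis
        norm_const = effect(np.array([x_min]))[0]
    else:
        raise ValueError(f"Unknown centering option: {centering!r}")
    return lambda xs: effect(xs) - norm_const
```

With few centering points the estimated constant is coarser, which is the trade-off the `points_for_centering` argument controls under this reading.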

Source code in effector/global_effect_ale.py

def fit(
    self,
    features: typing.Union[int, str, list] = "all",
    binning_method: typing.Union[str, ap.Fixed] = "fixed",
    centering: typing.Union[bool, str] = True,
    points_for_centering: int = 30
) -> None:
    """Fit the ALE plot.

    Args:
        features: the features to fit. If set to "all", all the features will be fitted.

        binning_method:

            - If set to `"fixed"`, the ALE plot will be computed with the  default values, which are
            `20` bins with at least `10` points per bin and the feature is considered as categorical if it has
            less than `15` unique values.
            - If you want to change the parameters of the method, you pass an instance of the
            class `effector.binning_methods.Fixed` with the desired parameters.
            For example: `Fixed(nof_bins=20, min_points_per_bin=0, cat_limit=10)`

        centering: whether to compute the normalization constant for centering the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.

        points_for_centering: the number of points to use for centering the plot. Default is 30.
    """
    assert binning_method == "fixed" or isinstance(
        binning_method, ap.Fixed
    ), "ALE can work only with the fixed binning method!"

    self._fit_loop(features, binning_method, centering, points_for_centering)

`eval(feature, xs, heterogeneity=False, centering=True, **kwargs)`

Evaluate the (RH)ALE feature effect of feature `feature` at points `xs`.

Notes

This is a common method inherited by both ALE and RHALE.

Parameters:

Name	Type	Description	Default
`feature`	`int`	index of the feature of interest	required
`xs`	`ndarray`	the points along the s-th axis at which to evaluate the feature effect plot; `np.ndarray` of shape `(T, )`	required
`heterogeneity`	`bool`	whether to return the heterogeneity: `False` returns the mean effect `y` at the given `xs`; `True` returns a tuple `(y, H)` of two `ndarrays`, where `y` is the mean effect and `H` is the heterogeneity evaluated at `xs`	`False`
`centering`	`Union[bool, str]`	whether to center the plot: `False` means no centering; `True` or `zero_integral` centers around the `y` axis; `zero_start` starts the plot from `y=0`	`True`

Returns: the mean effect `y` if `heterogeneity=False` (default), or a tuple `(y, std)` otherwise

Source code in effector/global_effect_ale.py

def eval(
    self,
    feature: int,
    xs: np.ndarray,
    heterogeneity: bool = False,
    centering: typing.Union[bool, str] = True,
    **kwargs
) -> Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]:
    """Evaluate the (RH)ALE feature effect of feature `feature` at points `xs`.

    Notes:
        This is a common method inherited by both ALE and RHALE.

    Args:
        feature: index of feature of interest
        xs: the points along the s-th axis to evaluate the FE plot
          - `np.ndarray` of shape `(T, )`
        heterogeneity: whether to return heterogeneity:

              - `False`, returns the mean effect `y` at the given `xs`
              - `True`, returns a tuple `(y, H)` of two `ndarrays`; `y` is the mean effect and `H` is the
              heterogeneity evaluated at `xs`

        centering: whether to center the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.
    Returns:
        the mean effect `y`, if `heterogeneity=False` (default) or a tuple `(y, std)` otherwise

    """
    centering = helpers.prep_centering(centering)

    if self.requires_refit(feature, centering):
        self.fit(features=feature, centering=centering)

    # Check if the lower bound is less than the upper bound
    assert self.axis_limits[0, feature] < self.axis_limits[1, feature]

    # Evaluate the feature
    yy = self._eval_unnorm(feature, xs, heterogeneity=heterogeneity)
    y, std = yy if heterogeneity else (yy, None)

    # Center if asked
    y = (
        y - self.feature_effect["feature_" + str(feature)]["norm_const"]
        if centering
        else y
    )

    return (y, std) if heterogeneity is not False else y

`plot(feature, heterogeneity=True, centering=True, scale_x=None, scale_y=None, show_avg_output=False, y_limits=None, dy_limits=None, show_only_aggregated=False, show_plot=True)`

Plot the (RH)ALE feature effect of feature `feature`.

Notes

This is a common method inherited by both ALE and RHALE.

Parameters:

Name	Type	Description	Default
`feature`	`int`	the feature to plot	required
`heterogeneity`	`bool`	whether to plot the heterogeneity: `False` plots only the mean effect; `True` also plots the std of the bin-effects as red vertical bars	`True`
`centering`	`Union[bool, str]`	whether to center the plot: `False` means no centering; `True` or `zero_integral` centers around the `y` axis; `zero_start` starts the plot from `y=0`	`True`
`scale_x`	`Optional[dict]`	`None` or a dict with keys `['std', 'mean']`: if `None`, no scaling is applied; if a dict, the x-axis is scaled by the standard deviation and the mean	`None`
`scale_y`	`Optional[dict]`	`None` or a dict with keys `['std', 'mean']`: if `None`, no scaling is applied; if a dict, the y-axis is scaled by the standard deviation and the mean	`None`
`show_avg_output`	`bool`	if `True`, the average output is shown as a horizontal line	`False`
`y_limits`	`Optional[List]`	`None` or a tuple with the limits of the y-axis: if `None`, the limits are set automatically; if a tuple, the limits are set manually	`None`
`dy_limits`	`Optional[List]`	`None` or a tuple with the limits of the dy-axis: if `None`, the limits are set automatically; if a tuple, the limits are set manually	`None`
`show_only_aggregated`	`bool`	if `True`, only the main ALE plot is shown	`False`
`show_plot`	`bool`	if `True`, the plot is shown	`True`
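
The table only states that `scale_x` and `scale_y` are dicts with the keys `'std'` and `'mean'`. A plausible use, assumed here rather than confirmed by the source, is to display a model trained on standardized data in the original units, mapping values back as `value * std + mean`. The data and variable names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
X_raw = rng.normal(loc=50.0, scale=10.0, size=(200, 2))  # raw feature values
y_raw = rng.normal(loc=3.0, scale=0.5, size=200)         # raw target values

# Standardize the features, as is common before training a model
x_mean, x_std = X_raw.mean(axis=0), X_raw.std(axis=0)
X = (X_raw - x_mean) / x_std

feature = 0
scale_x = {"mean": x_mean[feature], "std": x_std[feature]}
scale_y = {"mean": y_raw.mean(), "std": y_raw.std()}

# Assumed mapping back to the original units: value * std + mean
x_original_units = X[:, feature] * scale_x["std"] + scale_x["mean"]
```

The dictionaries would then be passed along as `g_method.plot(feature, scale_x=scale_x, scale_y=scale_y)`.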

Source code in effector/global_effect_ale.py

def plot(
    self,
    feature: int,
    heterogeneity: bool = True,
    centering: Union[bool, str] = True,
    scale_x: Optional[dict] = None,
    scale_y: Optional[dict] = None,
    show_avg_output: bool = False,
    y_limits: Optional[List] = None,
    dy_limits: Optional[List] = None,
    show_only_aggregated: bool = False,
    show_plot: bool = True,
):
    """
    Plot the (RH)ALE feature effect of feature `feature`.

    Notes:
        This is a common method inherited by both ALE and RHALE.

    Parameters:
        feature: the feature to plot
        heterogeneity: whether to plot the heterogeneity

              - `False`, plots only the mean effect
              - `True`, the std of the bin-effects will be plotted using a red vertical bar

        centering: whether to center the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.

        scale_x: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the x-axis will be scaled by the standard deviation and the mean.
        scale_y: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the y-axis will be scaled by the standard deviation and the mean.
        show_avg_output: if True, the average output will be shown as a horizontal line.
        y_limits: None or tuple, the limits of the y-axis

            - If set to None, the limits of the y-axis are set automatically
            - If set to a tuple, the limits are manually set

        dy_limits: None or tuple, the limits of the dy-axis

            - If set to None, the limits of the dy-axis are set automatically
            - If set to a tuple, the limits are manually set

        show_only_aggregated: if True, only the main ale plot will be shown
        show_plot: if True, the plot will be shown
    """
    heterogeneity = helpers.prep_confidence_interval(heterogeneity)
    centering = helpers.prep_centering(centering)

    # hack to fit the feature if not fitted
    self.eval(
        feature, np.array([self.axis_limits[0, feature]]), centering=centering
    )

    if show_avg_output:
        avg_output = helpers.prep_avg_output(
            self.data, self.model, self.avg_output, scale_y
        )
    else:
        avg_output = None

    title = "Accumulated Local Effects (ALE)" if self.method_name == "ale" else "Robust and Heterogeneity-Aware ALE (RHALE)"
    ret = vis.ale_plot(
        self.feature_effect["feature_" + str(feature)],
        self.eval,
        feature,
        centering=centering,
        error=heterogeneity,
        scale_x=scale_x,
        scale_y=scale_y,
        title=title,
        avg_output=avg_output,
        feature_names=self.feature_names,
        target_name=self.target_name,
        y_limits=y_limits,
        dy_limits=dy_limits,
        show_only_aggregated=show_only_aggregated,
        show_plot=show_plot,
    )

    if not show_plot:
        fig, ax = ret
        return fig, ax

`effector.global_effect_ale.RHALE(data, model, model_jac=None, nof_instances=10000, axis_limits=None, data_effect=None, feature_names=None, target_name=None)`

Bases: ALEBase

Constructor for RHALE.

Definition

RHALE is defined as: $$ \hat{f}^{RHALE}(x_s) = TODO $$

The heterogeneity is: $$ TODO $$

The std of the bin-effects is: $$ TODO $$

Notes

The required parameters are `data` and `model`. The rest are optional.

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	the design matrix, with shape `(N,D)`	required
`model`	`callable`	the black-box model. Must be a `Callable` with input: `ndarray` of shape `(N, D)`; output: `ndarray` of shape `(N, )`	required
`model_jac`	`Union[None, callable]`	the Jacobian of the model. Must be a `Callable` with input: `ndarray` of shape `(N, D)`; output: `ndarray` of shape `(N, D)`	`None`
`nof_instances`	`Union[int, str]`	the number of instances to use for the explanation: use an `int` to specify the number of instances; use `"all"` to use all the instances	`10000`
`axis_limits`	`Optional[ndarray]`	the limits of the feature effect plot along each axis: use a `ndarray` of shape `(2, D)` to specify them manually; use `None` to infer them from the data	`None`
`data_effect`	`Optional[ndarray]`	if an `np.ndarray`, the model Jacobian precomputed on `data`; if `None`, the Jacobian is computed using `model_jac`	`None`
`feature_names`	`Optional[list]`	the names of the features: use a `list` of `str` to specify the names manually, for example `["age", "weight", ...]`; use `None` to keep the default names `["x_0", "x_1", ...]`	`None`
`target_name`	`Optional[str]`	the name of the target variable: use a `str` to specify its name manually, for example `"price"`; use `None` to keep the default name `"y"`	`None`
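
When no analytical Jacobian is available, a callable with the `(N, D)` to `(N, D)` contract described for `model_jac` can be approximated numerically. The sketch below uses central finite differences; `numerical_jacobian` is a hypothetical helper written for this page, not an effector function, and it costs `2 * D` model evaluations per call.

```python
import numpy as np


def numerical_jacobian(model, eps=1e-6):
    """Wrap `model` ((N, D) -> (N,)) into a central-difference Jacobian ((N, D) -> (N, D))."""

    def model_jac(x: np.ndarray) -> np.ndarray:
        jac = np.empty_like(x, dtype=float)
        for d in range(x.shape[1]):
            step = np.zeros(x.shape[1])
            step[d] = eps  # perturb only the d-th feature
            jac[:, d] = (model(x + step) - model(x - step)) / (2.0 * eps)
        return jac

    return model_jac
```

The resulting callable could be passed as `model_jac`, or evaluated once on the data and passed as `data_effect`, which the table above describes as the Jacobian precomputed on `data`.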

Methods:

Name	Description
`fit`	Fit the model.
`eval`	Evaluate the (RH)ALE feature effect of feature `feature` at points `xs`.
`plot`	Plot the (RH)ALE feature effect of feature `feature`.

Source code in effector/global_effect_ale.py

def __init__(
    self,
    data: np.ndarray,
    model: callable,
    model_jac: typing.Union[None, callable] = None,
    nof_instances: typing.Union[int, str] = 10_000,
    axis_limits: typing.Optional[np.ndarray] = None,
    data_effect: typing.Optional[np.ndarray] = None,
    feature_names: typing.Optional[list] = None,
    target_name: typing.Optional[str] = None,
):
    """
    Constructor for RHALE.

    Definition:
        RHALE is defined as:
        $$
        \hat{f}^{RHALE}(x_s) = TODO
        $$

        The heterogeneity is:
        $$
        TODO
        $$

        The std of the bin-effects is:
        $$
        TODO
        $$

    Notes:
        The required parameters are `data` and `model`. The rest are optional.

    Args:
        data: the design matrix

            - shape: `(N,D)`
        model: the black-box model. Must be a `Callable` with:

            - input: `ndarray` of shape `(N, D)`
            - output: `ndarray` of shape `(N, )`

        model_jac: the Jacobian of the model. Must be a `Callable` with:

            - input: `ndarray` of shape `(N, D)`
            - output: `ndarray` of shape `(N, D)`

        nof_instances: the number of instances to use for the explanation

            - use an `int`, to specify the number of instances
            - use `"all"`, to use all the instances

        axis_limits: The limits of the feature effect plot along each axis

            - use a `ndarray` of shape `(2, D)`, to specify them manually
            - use `None`, to be inferred from the data

        data_effect:
            - if np.ndarray, the model Jacobian computed on the `data`
            - if None, the Jacobian will be computed using model_jac

        feature_names: The names of the features

            - use a `list` of `str`, to specify the name manually. For example: `["age", "weight", ...]`
            - use `None`, to keep the default names: `["x_0", "x_1", ...]`

        target_name: The name of the target variable

            - use a `str`, to specify it name manually. For example: `"price"`
            - use `None`, to keep the default name: `"y"`
    """
    super(RHALE, self).__init__(
        data,
        model,
        model_jac,
        data_effect,
        nof_instances,
        axis_limits,
        feature_names,
        target_name,
        "RHALE",
    )

`fit(features='all', binning_method='greedy', centering=True, points_for_centering=30)`

Fit the model.

Parameters:

Name	Type	Description	Default
`features`	`(int, str, list)`	the features to fit. If set to "all", all the features will be fitted.	`'all'`
`binning_method`	`str`	the binning method to use: use `"greedy"` for the Greedy binning solution with the default parameters (for custom parameters, pass a `binning_methods.Greedy` object); use `"dynamic"` for the Dynamic Programming binning solution with the default parameters (for custom parameters, pass a `binning_methods.DynamicProgramming` object); use `"fixed"` for the Fixed binning solution with the default parameters (for custom parameters, pass a `binning_methods.Fixed` object)	`'greedy'`
`centering`	`Union[bool, str]`	whether to compute the normalization constant for centering the plot: `False` means no centering; `True` or `zero_integral` centers around the `y` axis; `zero_start` starts the plot from `y=0`	`True`
`points_for_centering`	`int`	the number of points to use for centering the plot. Default is 30.	`30`

Source code in effector/global_effect_ale.py

def fit(
    self,
    features: typing.Union[int, str, list] = "all",
    binning_method: typing.Union[
        str, ap.DynamicProgramming, ap.Greedy, ap.Fixed
    ] = "greedy",
    centering: typing.Union[bool, str] = True,
    points_for_centering: int = 30
) -> None:
    """Fit the model.

    Args:
        features (int, str, list): the features to fit.

            - If set to "all", all the features will be fitted.

        binning_method (str): the binning method to use.

            - Use `"greedy"` for using the Greedy binning solution with the default parameters.
              For custom parameters initialize a `binning_methods.Greedy` object
            - Use `"dynamic"` for using a Dynamic Programming binning solution with the default parameters.
              For custom parameters initialize a `binning_methods.DynamicProgramming` object
            - Use `"fixed"` for using a Fixed binning solution with the default parameters.
              For custom parameters initialize a `binning_methods.Fixed` object

        centering: whether to compute the normalization constant for centering the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis
            - `zero_start` starts the plot from `y=0`

        points_for_centering: the number of points to use for centering the plot. Default is 30.
    """
    assert (
        binning_method in ["greedy", "dynamic", "fixed"]
        or isinstance(binning_method, ap.Greedy)
        or isinstance(binning_method, ap.DynamicProgramming)
        or isinstance(binning_method, ap.Fixed)
    ), "Unknown binning method!"

    self._fit_loop(features, binning_method, centering, points_for_centering)

`eval(feature, xs, heterogeneity=False, centering=True, **kwargs)`

Evaluate the (RH)ALE feature effect of feature `feature` at points `xs`.

Notes

This is a common method inherited by both ALE and RHALE.

Parameters:

Name	Type	Description	Default
`feature`	`int`	index of the feature of interest	required
`xs`	`ndarray`	the points along the s-th axis at which to evaluate the feature effect plot; `np.ndarray` of shape `(T, )`	required
`heterogeneity`	`bool`	whether to return the heterogeneity: `False` returns the mean effect `y` at the given `xs`; `True` returns a tuple `(y, H)` of two `ndarrays`, where `y` is the mean effect and `H` is the heterogeneity evaluated at `xs`	`False`
`centering`	`Union[bool, str]`	whether to center the plot: `False` means no centering; `True` or `zero_integral` centers around the `y` axis; `zero_start` starts the plot from `y=0`	`True`

Returns: the mean effect `y` if `heterogeneity=False` (default), or a tuple `(y, std)` otherwise

Source code in effector/global_effect_ale.py

def eval(
    self,
    feature: int,
    xs: np.ndarray,
    heterogeneity: bool = False,
    centering: typing.Union[bool, str] = True,
    **kwargs
) -> Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]:
    """Evaluate the (RH)ALE feature effect of feature `feature` at points `xs`.

    Notes:
        This is a common method inherited by both ALE and RHALE.

    Args:
        feature: index of feature of interest
        xs: the points along the s-th axis to evaluate the FE plot
          - `np.ndarray` of shape `(T, )`
        heterogeneity: whether to return heterogeneity:

              - `False`, returns the mean effect `y` at the given `xs`
              - `True`, returns a tuple `(y, H)` of two `ndarrays`; `y` is the mean effect and `H` is the
              heterogeneity evaluated at `xs`

        centering: whether to center the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.
    Returns:
        the mean effect `y`, if `heterogeneity=False` (default) or a tuple `(y, std)` otherwise

    """
    centering = helpers.prep_centering(centering)

    if self.requires_refit(feature, centering):
        self.fit(features=feature, centering=centering)

    # Check if the lower bound is less than the upper bound
    assert self.axis_limits[0, feature] < self.axis_limits[1, feature]

    # Evaluate the feature
    yy = self._eval_unnorm(feature, xs, heterogeneity=heterogeneity)
    y, std = yy if heterogeneity else (yy, None)

    # Center if asked
    y = (
        y - self.feature_effect["feature_" + str(feature)]["norm_const"]
        if centering
        else y
    )

    return (y, std) if heterogeneity is not False else y

`plot(feature, heterogeneity=True, centering=True, scale_x=None, scale_y=None, show_avg_output=False, y_limits=None, dy_limits=None, show_only_aggregated=False, show_plot=True)`

Plot the (RH)ALE feature effect of feature `feature`.

Notes

This is a common method inherited by both ALE and RHALE.

Parameters:

Name	Type	Description	Default
`feature`	`int`	the feature to plot	required
`heterogeneity`	`bool`	whether to plot the heterogeneity: `False` plots only the mean effect; `True` also plots the std of the bin-effects as red vertical bars	`True`
`centering`	`Union[bool, str]`	whether to center the plot: `False` means no centering; `True` or `zero_integral` centers around the `y` axis; `zero_start` starts the plot from `y=0`	`True`
`scale_x`	`Optional[dict]`	`None` or a dict with keys `['std', 'mean']`: if `None`, no scaling is applied; if a dict, the x-axis is scaled by the standard deviation and the mean	`None`
`scale_y`	`Optional[dict]`	`None` or a dict with keys `['std', 'mean']`: if `None`, no scaling is applied; if a dict, the y-axis is scaled by the standard deviation and the mean	`None`
`show_avg_output`	`bool`	if `True`, the average output is shown as a horizontal line	`False`
`y_limits`	`Optional[List]`	`None` or a tuple with the limits of the y-axis: if `None`, the limits are set automatically; if a tuple, the limits are set manually	`None`
`dy_limits`	`Optional[List]`	`None` or a tuple with the limits of the dy-axis: if `None`, the limits are set automatically; if a tuple, the limits are set manually	`None`
`show_only_aggregated`	`bool`	if `True`, only the main ALE plot is shown	`False`
`show_plot`	`bool`	if `True`, the plot is shown	`True`

Source code in effector/global_effect_ale.py

def plot(
    self,
    feature: int,
    heterogeneity: bool = True,
    centering: Union[bool, str] = True,
    scale_x: Optional[dict] = None,
    scale_y: Optional[dict] = None,
    show_avg_output: bool = False,
    y_limits: Optional[List] = None,
    dy_limits: Optional[List] = None,
    show_only_aggregated: bool = False,
    show_plot: bool = True,
):
    """
    Plot the (RH)ALE feature effect of feature `feature`.

    Notes:
        This is a common method inherited by both ALE and RHALE.

    Parameters:
        feature: the feature to plot
        heterogeneity: whether to plot the heterogeneity

              - `False`, plots only the mean effect
              - `True`, the std of the bin-effects will be plotted using a red vertical bar

        centering: whether to center the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.

        scale_x: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the x-axis will be scaled by the standard deviation and the mean.
        scale_y: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the y-axis will be scaled by the standard deviation and the mean.
        show_avg_output: if True, the average output will be shown as a horizontal line.
        y_limits: None or tuple, the limits of the y-axis

            - If set to None, the limits of the y-axis are set automatically
            - If set to a tuple, the limits are manually set

        dy_limits: None or tuple, the limits of the dy-axis

            - If set to None, the limits of the dy-axis are set automatically
            - If set to a tuple, the limits are manually set

        show_only_aggregated: if True, only the main ale plot will be shown
        show_plot: if True, the plot will be shown
    """
    heterogeneity = helpers.prep_confidence_interval(heterogeneity)
    centering = helpers.prep_centering(centering)

    # hack to fit the feature if not fitted
    self.eval(
        feature, np.array([self.axis_limits[0, feature]]), centering=centering
    )

    if show_avg_output:
        avg_output = helpers.prep_avg_output(
            self.data, self.model, self.avg_output, scale_y
        )
    else:
        avg_output = None

    title = "Accumulated Local Effects (ALE)" if self.method_name == "ale" else "Robust and Heterogeneity-Aware ALE (RHALE)"
    ret = vis.ale_plot(
        self.feature_effect["feature_" + str(feature)],
        self.eval,
        feature,
        centering=centering,
        error=heterogeneity,
        scale_x=scale_x,
        scale_y=scale_y,
        title=title,
        avg_output=avg_output,
        feature_names=self.feature_names,
        target_name=self.target_name,
        y_limits=y_limits,
        dy_limits=dy_limits,
        show_only_aggregated=show_only_aggregated,
        show_plot=show_plot,
    )

    if not show_plot:
        fig, ax = ret
        return fig, ax

`effector.global_effect_pdp.PDP(data, model, axis_limits=None, nof_instances=10000, feature_names=None, target_name=None)`

Bases: PDPBase

Constructor of the PDP class.

Definition

PDP: $$ PDP(x_s) = {1 \over N} \sum_{i=1}^N f(x_s, \mathbf{x}_c^i) $$

centered-PDP: $$ PDP_c(x_s) = PDP(x_s) - c, \quad c = {1 \over M} \sum_{j=1}^M PDP(x_s^j) $$

ICE: $$ ICE^i(x_s) = f(x_s, \mathbf{x}_c^i), \quad i=1, \dots, N $$

centered-ICE: $$ ICE_c^i(x_s) = ICE^i(x_s) - c_i, \quad c_i = {1 \over M} \sum_{j=1}^M ICE^i(x_s^j) $$

heterogeneity function: $$ h(x_s) = {1 \over N} \sum_{i=1}^N ( ICE_c^i(x_s) - PDP_c(x_s) )^2 $$

The heterogeneity value is: $$ \mathcal{H}(x_s) = {1 \over M} \sum_{j=1}^M h(x_s^j), $$ where the points $x_s^j$, $j=1, \dots, M$, form an equally spaced grid in $[x_s^{\min}, x_s^{\max}]$.
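
The formulas above translate almost line by line into NumPy. The following is a minimal sketch, not effector's implementation: `pdp_with_heterogeneity` is a hypothetical helper, and it assumes that the grid `xs` passed in is also the grid $x_s^j$ used for centering.

```python
import numpy as np


def pdp_with_heterogeneity(X, model, feature, xs):
    """Return (centered PDP, heterogeneity function h), both evaluated on the grid `xs`."""
    N, M = X.shape[0], xs.shape[0]
    ice = np.empty((N, M))
    for j, value in enumerate(xs):
        X_mod = X.copy()
        X_mod[:, feature] = value  # f(x_s^j, x_c^i) for all instances i
        ice[:, j] = model(X_mod)

    pdp = ice.mean(axis=0)                          # PDP(x_s)
    pdp_c = pdp - pdp.mean()                        # centered PDP, c = mean over the grid
    ice_c = ice - ice.mean(axis=1, keepdims=True)   # centered ICE, c_i = mean over the grid
    h = np.mean((ice_c - pdp_c) ** 2, axis=0)       # heterogeneity function h(x_s)
    return pdp_c, h
```

The heterogeneity value is then `h.mean()`. For an additive model every centered ICE curve coincides with the centered PDP, so `h` is zero everywhere; interactions with other features make it positive.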

Notes

The required parameters are `data` and `model`. The rest are optional.

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	the design matrix, of shape `(N, D)`	required
`model`	`Callable`	the black-box model. Must be a `Callable` with input: `ndarray` of shape `(N, D)`; output: `ndarray` of shape `(N,)`	required
`axis_limits`	`Optional[ndarray]`	The limits of the feature effect plot along each axis. Use an `ndarray` of shape `(2, D)` to specify them manually; use `None` to infer them from the data.	`None`
`nof_instances`	`Union[int, str]`	maximum number of instances to be used. Use `"all"` to use all instances; use an `int` to select `nof_instances` instances randomly.	`10000`
`feature_names`	`Optional[List]`	The names of the features. Use a `list` of `str` to specify the names manually, for example `["age", "weight", ...]`; use `None` to keep the default names `["x_0", "x_1", ...]`.	`None`
`target_name`	`Optional[str]`	The name of the target variable. Use a `str` to specify its name manually, for example `"price"`; use `None` to keep the default name `"y"`.	`None`
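As a rough sketch of inputs that satisfy the shapes in the table, the snippet below builds a design matrix, a toy model, and manual axis limits with NumPy only. The model and the feature names are hypothetical. Row 0 of `axis_limits` holds the lower bounds and row 1 the upper bounds, which matches the check `axis_limits[0, feature] < axis_limits[1, feature]` in the `eval` source.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 500, 3
data = rng.normal(size=(N, D))             # design matrix, shape (N, D)

def model(x: np.ndarray) -> np.ndarray:
    """Toy black-box model (hypothetical): maps (N, D) inputs to (N,) outputs."""
    return x[:, 0] ** 2 + np.sin(x[:, 1])

# manual axis limits, shape (2, D): row 0 = lower bounds, row 1 = upper bounds
axis_limits = np.stack([data.min(axis=0), data.max(axis=0)])

feature_names = ["age", "weight", "height"]   # one name per column of data
target_name = "price"
```

These objects would then be passed by keyword to the constructor, e.g. `data=data, model=model, axis_limits=axis_limits`.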

Methods:

Name	Description
`fit`	Fit the Feature effect to the data.
`eval`	Evaluate the effect of the s-th feature at positions `xs`.
`plot`	Plot the feature effect.

Source code in effector/global_effect_pdp.py

def __init__(
    self,
    data: np.ndarray,
    model: Callable,
    axis_limits: Optional[np.ndarray] = None,
    nof_instances: Union[int, str] = 10_000,
    feature_names: Optional[List] = None,
    target_name: Optional[str] = None,
):
    """
    Constructor of the PDP class.

    Definition:
        PDP:
        $$
        PDP(x_s) = {1 \over N} \sum_{i=1}^N f(x_s, \mathbf{x}_c^i)
        $$

        centered-PDP:
        $$
        PDP_c(x_s) = PDP(x_s) - c, \quad c = {1 \over M} \sum_{j=1}^M PDP(x_s^j)
        $$

        ICE:
        $$
        ICE^i(x_s) = f(x_s, \mathbf{x}_c^i), \quad i=1, \dots, N
        $$

        centered-ICE:
        $$
        ICE_c^i(x_s) = ICE^i(x_s) - c_i, \quad c_i = {1 \over M} \sum_{j=1}^M ICE^i(x_s^j)
        $$

        heterogeneity function:
        $$
        h(x_s) = {1 \over N} \sum_{i=1}^N ( ICE_c^i(x_s) - PDP_c(x_s) )^2
        $$

        The heterogeneity value is:
        $$
        \mathcal{H}(x_s) = {1 \over M} \sum_{j=1}^M h(x_s^j),
        $$
        where $x_s^j$ are an equally spaced grid of points in $[x_s^{\min}, x_s^{\max}]$.

    Notes:
        The required parameters are `data` and `model`. The rest are optional.

    Args:
        data: the design matrix

            - shape: `(N,D)`
        model: the black-box model. Must be a `Callable` with:

            - input: `ndarray` of shape `(N, D)`
            - output: `ndarray` of shape `(N,)`

        axis_limits: The limits of the feature effect plot along each axis

            - use a `ndarray` of shape `(2, D)`, to specify them manually
            - use `None`, to be inferred from the data

        nof_instances: maximum number of instances to be used

            - use "all", for using all instances.
            - use an `int`, for selecting `nof_instances` instances randomly.

        feature_names: The names of the features

            - use a `list` of `str`, to specify the name manually. For example: `["age", "weight", ...]`
            - use `None`, to keep the default names: `["x_0", "x_1", ...]`

        target_name: The name of the target variable

            - use a `str`, to specify its name manually. For example: `"price"`
            - use `None`, to keep the default name: `"y"`
    """

    super(PDP, self).__init__(
        data,
        model,
        None,
        axis_limits,
        nof_instances,
        feature_names,
        target_name,
        method_name="PDP",
    )

`fit(features='all', centering=False, points_for_centering=30, use_vectorized=True)`

Fit the Feature effect to the data.

Notes

You can use `.eval` or `.plot` without calling `.fit` explicitly. The only thing that `.fit` does is to compute the normalization constant for centering the PDP and ICE plots. This is done automatically when calling `eval` or `plot`, so there is no need to call `fit` explicitly.

Parameters:

Name	Type	Description	Default
`features`	`Union[int, str, list]`	the features to fit. If set to `"all"`, all the features will be fitted.	`'all'`
`centering`	`Union[bool, str]`	whether to center the plot. `False` means no centering; `True` or `zero_integral` centers the plot so that it averages to zero along the feature axis; `zero_start` starts the plot from `y=0`.	`False`
`points_for_centering`	`int`	number of linspaced points along the feature axis used for centering.	`30`
`use_vectorized`	`bool`	whether to use vectorized operations for the PDP and ICE curves	`True`
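The normalization constants that `fit` prepares can be pictured as follows. This is a sketch under the assumption that `zero_integral` subtracts each curve's mean over the centering grid (as in the centered-ICE definition) and that `zero_start` subtracts each curve's first value; the helper `norm_constants` is hypothetical and is not the library's `_fit_feature`.

```python
import numpy as np

def norm_constants(ice_on_grid, centering):
    """ice_on_grid has shape (points_for_centering, N); returns one constant per instance."""
    if centering == "zero_integral":
        return ice_on_grid.mean(axis=0)       # mean of each curve over the grid
    if centering == "zero_start":
        return ice_on_grid[0]                 # value of each curve at the first grid point
    return np.zeros(ice_on_grid.shape[1])     # no centering

points_for_centering = 30
grid = np.linspace(0.0, 1.0, points_for_centering)
slopes = np.array([1.0, 2.0, 3.0])
offsets = np.array([5.0, -1.0, 0.5])
ice = grid[:, None] * slopes[None, :] + offsets[None, :]   # three linear toy ICE curves

ice_zero_integral = ice - norm_constants(ice, "zero_integral")
ice_zero_start = ice - norm_constants(ice, "zero_start")
```

After `zero_integral` centering every curve has zero mean over the grid; after `zero_start` centering every curve begins at zero.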

Source code in effector/global_effect_pdp.py

def fit(
    self,
    features: Union[int, str, list] = "all",
    centering: Union[bool, str] = False,
    points_for_centering: int = 30,
    use_vectorized: bool = True,
):
    """
    Fit the Feature effect to the data.

    Notes:
        You can use `.eval` or `.plot` without calling `.fit` explicitly.
        The only thing that `.fit` does is to compute the normalization constant for centering the PDP and ICE plots.
        This will be automatically done when calling `eval` or `plot`, so there is no need to call `fit` explicitly.

    Args:
        features: the features to fit.
            - If set to "all", all the features will be fitted.

        centering: whether to center the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.

        points_for_centering: number of linspaced points along the feature axis used for centering.
        use_vectorized: whether to use vectorized operations for the PDP and ICE curves

    """
    centering = helpers.prep_centering(centering)
    features = helpers.prep_features(features, self.dim)

    for s in features:
        self.feature_effect["feature_" + str(s)] = self._fit_feature(
            s, centering, points_for_centering, use_vectorized
        )
        self.is_fitted[s] = True
        self.fit_args["feature_" + str(s)] = {
            "centering": centering,
            "points_for_centering": points_for_centering,
        }

`eval(feature, xs, heterogeneity=False, centering=False, return_all=False, use_vectorized=True)`

Evaluate the effect of the s-th feature at positions `xs`.

Parameters:

Name	Type	Description	Default
`feature`	`int`	index of feature of interest	required
`xs`	`ndarray`	the points along the s-th axis at which to evaluate the feature effect; an `np.ndarray` of shape `(T,)`	required
`heterogeneity`	`bool`	whether to return the heterogeneity measures. If `heterogeneity=False`, the function returns the mean effect at the given `xs`. If `heterogeneity=True`, the function returns `(y, std)`, where `y` is the mean effect and `std` is the spread of the ICE curves around it; the source code shown for this method computes it with `np.var`, i.e. as a variance rather than a standard deviation.	`False`
`centering`	`Union[bool, str]`	whether to center the PDP. If `centering` is `False`, the PDP is not centered. If `centering` is `True` or `zero_integral`, the PDP is centered so that it averages to zero along the feature axis. If `centering` is `zero_start`, the PDP starts from `y=0`.	`False`
`return_all`	`bool`	whether to return PDP and ICE plots evaluated at `xs`. If `return_all=False`, the function returns the mean effect at the given `xs`. If `return_all=True`, the function returns an `ndarray` of shape `(T, N)` with the `N` ICE plots evaluated at `xs`.	`False`
`use_vectorized`	`bool`	whether to use the vectorized version of the computation	`True`

Returns:

Type	Description
`Union[ndarray, Tuple[ndarray, ndarray]]`	the mean effect `y`, if `heterogeneity=False` (default) or a tuple `(y, std)` otherwise
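The three possible return shapes can be illustrated with a small stand-alone function that mirrors the last lines of the `eval` source (mean and `np.var` along `axis=1` of a `(T, N)` array). The function name `summarize` and the toy array are assumptions for the example.

```python
import numpy as np

def summarize(y_ice, heterogeneity=False, return_all=False):
    """y_ice: ICE curves of shape (T, N); mirrors the return logic of eval."""
    if return_all:
        return y_ice                          # all ICE curves, shape (T, N)
    y_mean = y_ice.mean(axis=1)               # mean effect, shape (T,)
    if heterogeneity:
        return y_mean, y_ice.var(axis=1)      # second element is a variance
    return y_mean

y_ice = np.array([[1.0, 3.0], [2.0, 6.0], [0.0, 0.0]])   # T = 3 positions, N = 2 instances
y = summarize(y_ice)
y_het, spread = summarize(y_ice, heterogeneity=True)
everything = summarize(y_ice, return_all=True)
```

Note that `return_all=True` takes precedence over `heterogeneity=True`, as in the source.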

Source code in effector/global_effect_pdp.py

def eval(
    self,
    feature: int,
    xs: np.ndarray,
    heterogeneity: bool = False,
    centering: typing.Union[bool, str] = False,
    return_all: bool = False,
    use_vectorized: bool = True,
) -> typing.Union[np.ndarray, typing.Tuple[np.ndarray, np.ndarray]]:
    """Evaluate the effect of the s-th feature at positions `xs`.

    Args:
        feature: index of feature of interest
        xs: the points along the s-th axis to evaluate the FE plot

          - `np.ndarray` of shape `(T, )`

        heterogeneity: whether to return the heterogeneity measures.

              - if `heterogeneity=False`, the function returns the mean effect at the given `xs`
              - If `heterogeneity=True`, the function returns `(y, std)` where `y` is the mean effect and `std` is the standard deviation of the mean effect

        centering: whether to center the PDP

            - If `centering` is `False`, the PDP is not centered
            - If `centering` is `True` or `zero_integral`, the PDP is centered around the `y` axis.
            - If `centering` is `zero_start`, the PDP starts from `y=0`.

        return_all: whether to return PDP and ICE plots evaluated at `xs`

            - If `return_all=False`, the function returns the mean effect at the given `xs`
            - If `return_all=True`, the function returns a `ndarray` of shape `(T, N)` with the `N` ICE plots evaluated at `xs`

        use_vectorized: whether to use the vectorized version of the computation

    Returns:
        the mean effect `y`, if `heterogeneity=False` (default) or a tuple `(y, std)` otherwise

    """
    centering = helpers.prep_centering(centering)

    if self.requires_refit(feature, centering):
        self.fit(
            features=feature, centering=centering, use_vectorized=use_vectorized
        )

    # Check if the lower bound is less than the upper bound
    assert self.axis_limits[0, feature] < self.axis_limits[1, feature]

    # new implementation
    y_ice = self._predict(self.data, xs, feature, use_vectorized)
    if centering:
        norm_consts = np.expand_dims(
            self.feature_effect["feature_" + str(feature)]["norm_const"], axis=0
        )
        y_ice = y_ice - norm_consts

    y_mean = np.mean(y_ice, axis=1)

    if return_all:
        return y_ice

    if heterogeneity:
        y_var = np.var(y_ice, axis=1)
        return y_mean, y_var
    else:
        return y_mean

`plot(feature, heterogeneity='ice', centering=True, nof_points=30, scale_x=None, scale_y=None, nof_ice='all', show_avg_output=False, y_limits=None, use_vectorized=True, show_plot=True)`

Plot the feature effect.

Parameters:

Name	Type	Description	Default
`feature`	`int`	the feature to plot	required
`heterogeneity`	`Union[bool, str]`	whether to plot the heterogeneity. `False`: plot only the mean effect; `True` or `std`: plot the standard deviation of the ICE curves; `ice`: also plot the ICE curves.	`'ice'`
`centering`	`Union[bool, str]`	whether to center the plot. `False` means no centering; `True` or `zero_integral` centers the plot so that it averages to zero along the feature axis; `zero_start` starts the plot from `y=0`.	`True`
`nof_points`	`int`	the grid size for the PDP plot	`30`
`scale_x`	`Optional[dict]`	`None` or a `dict` with keys `['std', 'mean']`. If set to `None`, no scaling will be applied. If set to a `dict`, the x-axis will be scaled as `x = (x + mean) * std`.	`None`
`scale_y`	`Optional[dict]`	`None` or a `dict` with keys `['std', 'mean']`. If set to `None`, no scaling will be applied. If set to a `dict`, the y-axis will be scaled as `y = (y + mean) * std`.	`None`
`nof_ice`	`Union[int, str]`	number of ICE plots to show on top of the PDP curve	`'all'`
`show_avg_output`	`bool`	whether to show the average output of the model	`False`
`y_limits`	`Optional[List]`	`None` or a tuple with the limits of the y-axis. If set to `None`, the limits of the y-axis are set automatically. If set to a tuple, the limits are set manually.	`None`
`use_vectorized`	`bool`	whether to use the vectorized version of the PDP computation	`True`
`show_plot`	`bool`	whether to show the plot	`True`

Source code in effector/global_effect_pdp.py

def plot(
    self,
    feature: int,
    heterogeneity: Union[bool, str] = "ice",
    centering: Union[bool, str] = True,
    nof_points: int = 30,
    scale_x: Optional[dict] = None,
    scale_y: Optional[dict] = None,
    nof_ice: Union[int, str] = "all",
    show_avg_output: bool = False,
    y_limits: Optional[List] = None,
    use_vectorized: bool = True,
    show_plot: bool = True,
):
    """
    Plot the feature effect.

    Parameters:
        feature: the feature to plot
        heterogeneity: whether to plot the heterogeneity

              - `False`, plot only the mean effect
              - `True` or `std`, plot the standard deviation of the ICE curves
              - `ice`, also plot the ICE curves

        centering: whether to center the plot

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.

        nof_points: the grid size for the PDP plot

        scale_x: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the x-axis will be scaled `x = (x + mean) * std`

        scale_y: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the y-axis will be scaled `y = (y + mean) * std`

        nof_ice: number of ICE plots to show on top of the PDP curve
        show_avg_output: whether to show the average output of the model

        y_limits: None or tuple, the limits of the y-axis

            - If set to None, the limits of the y-axis are set automatically
            - If set to a tuple, the limits are manually set

        use_vectorized: whether to use the vectorized version of the PDP computation
    """
    ret = self._plot(
        feature,
        heterogeneity,
        centering,
        nof_points,
        scale_x,
        scale_y,
        nof_ice,
        show_avg_output,
        y_limits,
        use_vectorized,
        show_plot
    )

    if not show_plot:
        return ret

`effector.global_effect_pdp.DerPDP(data, model, model_jac=None, axis_limits=None, nof_instances=10000, feature_names=None, target_name=None)`

Bases: PDPBase

Constructor of the DerPDP class.

Definition

d-PDP: $$ dPDP(x_s) = {1 \over N} \sum_{i=1}^N {\partial f \over \partial x_s}(x_s, \mathbf{x}_c^i) $$

centered d-PDP: $$ dPDP_c(x_s) = dPDP(x_s) - c, \quad c = {1 \over M} \sum_{j=1}^M dPDP(x_s^j) $$

d-ICE: $$ dICE^i(x_s) = {\partial f \over \partial x_s}(x_s, \mathbf{x}_c^i), \quad i=1, \dots, N $$

centered d-ICE: $$ dICE_c^i(x_s) = dICE^i(x_s) - c_i, \quad c_i = {1 \over M} \sum_{j=1}^M dICE^i(x_s^j) $$

heterogeneity function: $$ h(x_s) = {1 \over N} \sum_{i=1}^N ( dICE_c^i(x_s) - dPDP_c(x_s) )^2 $$

The heterogeneity value is: $$ \mathcal{H}(x_s) = {1 \over M} \sum_{j=1}^M h(x_s^j), $$ where the $x_s^j$ form an equally spaced grid of $M$ points in $[x_s^{\min}, x_s^{\max}]$.
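The derivative-based definitions can be sketched in NumPy in the same way as the PDP ones, with the model Jacobian taking the place of the model output. The toy Jacobian and the helper `d_ice_curves` are assumptions for illustration, not the library's code.

```python
import numpy as np

def model_jac(x):
    """Jacobian of the toy model f(x) = x_0**2 * x_1 (hypothetical), shape (N, D)."""
    return np.stack([2 * x[:, 0] * x[:, 1], x[:, 0] ** 2], axis=1)

def d_ice_curves(model_jac, data, feature, xs):
    """Return d-ICE curves of shape (M, N): partial derivative w.r.t. the feature over xs."""
    curves = np.empty((xs.shape[0], data.shape[0]))
    for t, x in enumerate(xs):
        x_mod = data.copy()
        x_mod[:, feature] = x                      # move every instance to the grid value
        curves[t] = model_jac(x_mod)[:, feature]   # keep only the s-th partial derivative
    return curves

rng = np.random.default_rng(0)
data = rng.uniform(0, 1, size=(100, 2))
xs = np.linspace(0, 1, 30)

d_ice = d_ice_curves(model_jac, data, 0, xs)       # dICE^i(x_s), shape (M, N)
d_pdp = d_ice.mean(axis=1)                         # dPDP(x_s), shape (M,)

d_ice_c = d_ice - d_ice.mean(axis=0, keepdims=True)      # centered d-ICE
d_pdp_c = d_pdp - d_pdp.mean()                           # centered d-PDP
h = ((d_ice_c - d_pdp_c[:, None]) ** 2).mean(axis=1)     # heterogeneity function
```

For this toy model the d-ICE curves are `2 * x_s * x_1^i`, so the heterogeneity grows with the distance of `x_s` from the middle of the grid.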

Notes

The required parameters are `data` and `model`. The rest are optional.
The `model_jac` is the Jacobian of the model. If `None`, the Jacobian will be computed numerically.
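To give an idea of what a numerical Jacobian looks like when no `model_jac` is supplied, here is a central-difference sketch. The scheme and the step size `eps` are assumptions; the library's own numerical differentiation is not shown on this page and may differ.

```python
import numpy as np

def numerical_jacobian(model, x, eps=1e-6):
    """Central-difference estimate of the Jacobian of model at x, shape (N, D)."""
    jac = np.empty_like(x, dtype=float)
    for d in range(x.shape[1]):
        step = np.zeros(x.shape[1])
        step[d] = eps                              # perturb only the d-th feature
        jac[:, d] = (model(x + step) - model(x - step)) / (2 * eps)
    return jac

def model(x):
    """Toy model (hypothetical): f(x) = x_0**2 + 3 * x_1."""
    return x[:, 0] ** 2 + 3 * x[:, 1]

x = np.array([[1.0, 2.0], [-0.5, 0.0]])
jac = numerical_jacobian(model, x)
```

The result has the same `(N, D)` shape that a user-supplied `model_jac` is expected to return. An analytic Jacobian, when available, avoids both the extra model evaluations and the finite-difference error.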

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	the design matrix, of shape `(N, D)`	required
`model`	`Callable`	the black-box model. Must be a `Callable` with input: `ndarray` of shape `(N, D)`; output: `ndarray` of shape `(N, )`	required
`model_jac`	`Optional[Callable]`	the black-box model Jacobian. Must be a `Callable` with input: `ndarray` of shape `(N, D)`; output: `ndarray` of shape `(N, D)`	`None`
`axis_limits`	`Optional[ndarray]`	The limits of the feature effect plot along each axis. Use an `ndarray` of shape `(2, D)` to specify them manually; use `None` to infer them from the data.	`None`
`nof_instances`	`Union[int, str]`	maximum number of instances to be used for PDP. Use `"all"` to use all instances; use an `int` to use `nof_instances` instances.	`10000`
`feature_names`	`Optional[List]`	The names of the features. Use a `list` of `str` to specify the names manually, for example `["age", "weight", ...]`; use `None` to keep the default names `["x_0", "x_1", ...]`.	`None`
`target_name`	`Optional[str]`	The name of the target variable. Use a `str` to specify its name manually, for example `"price"`; use `None` to keep the default name `"y"`.	`None`

Methods:

Name	Description
`fit`	Fit the Feature effect to the data.
`eval`	Evaluate the effect of the s-th feature at positions `xs`.
`plot`	Plot the feature effect.

Source code in effector/global_effect_pdp.py

def __init__(
    self,
    data: np.ndarray,
    model: Callable,
    model_jac: Optional[Callable] = None,
    axis_limits: Optional[np.ndarray] = None,
    nof_instances: Union[int, str] = 10_000,
    feature_names: Optional[List] = None,
    target_name: Optional[str] = None,
):
    """
    Constructor of the DerPDP class.

    Definition:
        d-PDP:
        $$
        dPDP(x_s) = {1 \over N} \sum_{i=1}^N {\partial f \over \partial x_s}(x_s, \mathbf{x}_c^i)
        $$

        centered-PDP:
        $$
        dPDP_c(x_s) = dPDP(x_s) - c, \quad c = {1 \over M} \sum_{j=1}^M dPDP(x_s^j)
        $$

        ICE:
        $$
        dICE^i(x_s) = {\partial f \over \partial x_s}(x_s, \mathbf{x}_c^i), \quad i=1, \dots, N
        $$

        centered-ICE:
        $$
        dICE_c^i(x_s) = dICE^i(x_s) - c_i, \quad c_i = {1 \over M} \sum_{j=1}^M dICE^i(x_s^j)
        $$

        heterogeneity function:
        $$
        h(x_s) = {1 \over N} \sum_{i=1}^N ( dICE_c^i(x_s) - dPDP_c(x_s) )^2
        $$

        The heterogeneity value is:
        $$
        \mathcal{H}(x_s) = {1 \over M} \sum_{j=1}^M h(x_s^j),
        $$
        where $x_s^j$ are an equally spaced grid of points in $[x_s^{\min}, x_s^{\max}]$.

    Notes:
        - The required parameters are `data` and `model`. The rest are optional.
        - The `model_jac` is the Jacobian of the model. If `None`, the Jacobian will be computed numerically.

    Args:
        data: the design matrix

            - shape: `(N,D)`
        model: the black-box model. Must be a `Callable` with:

            - input: `ndarray` of shape `(N, D)`
            - output: `ndarray` of shape `(N, )`

        model_jac: the black-box model Jacobian. Must be a `Callable` with:

            - input: `ndarray` of shape `(N, D)`
            - output: `ndarray` of shape `(N, D)`

        axis_limits: The limits of the feature effect plot along each axis

            - use a `ndarray` of shape `(2, D)`, to specify them manually
            - use `None`, to be inferred from the data

        nof_instances: maximum number of instances to be used for PDP.

            - use "all", for using all instances.
            - use an `int`, for using `nof_instances` instances.

        feature_names: The names of the features

            - use a `list` of `str`, to specify the name manually. For example: `["age", "weight", ...]`
            - use `None`, to keep the default names: `["x_0", "x_1", ...]`

        target_name: The name of the target variable

            - use a `str`, to specify its name manually. For example: `"price"`
            - use `None`, to keep the default name: `"y"`
    """

    super(DerPDP, self).__init__(
        data,
        model,
        model_jac,
        axis_limits,
        nof_instances,
        feature_names,
        target_name,
        method_name="d-PDP",
    )

`fit(features='all', centering=False, points_for_centering=30, use_vectorized=True)`

Fit the Feature effect to the data.

Notes

You can use `.eval` or `.plot` without calling `.fit` explicitly. The only thing that `.fit` does is to compute the normalization constant for centering the PDP and ICE plots. This is done automatically when calling `eval` or `plot`, so there is no need to call `fit` explicitly.

Parameters:

Name	Type	Description	Default
`features`	`Union[int, str, list]`	the features to fit. If set to `"all"`, all the features will be fitted.	`'all'`
`centering`	`Union[bool, str]`	whether to center the plot. `False` means no centering; `True` or `zero_integral` centers the plot so that it averages to zero along the feature axis; `zero_start` starts the plot from `y=0`.	`False`
`points_for_centering`	`int`	number of linspaced points along the feature axis used for centering.	`30`
`use_vectorized`	`bool`	whether to use vectorized operations for the PDP and ICE curves	`True`

Source code in effector/global_effect_pdp.py

def fit(
    self,
    features: Union[int, str, list] = "all",
    centering: Union[bool, str] = False,
    points_for_centering: int = 30,
    use_vectorized: bool = True,
):
    """
    Fit the Feature effect to the data.

    Notes:
        You can use `.eval` or `.plot` without calling `.fit` explicitly.
        The only thing that `.fit` does is to compute the normalization constant for centering the PDP and ICE plots.
        This will be automatically done when calling `eval` or `plot`, so there is no need to call `fit` explicitly.

    Args:
        features: the features to fit.
            - If set to "all", all the features will be fitted.

        centering: whether to center the plot:

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.

        points_for_centering: number of linspaced points along the feature axis used for centering.
        use_vectorized: whether to use vectorized operations for the PDP and ICE curves

    """
    centering = helpers.prep_centering(centering)
    features = helpers.prep_features(features, self.dim)

    for s in features:
        self.feature_effect["feature_" + str(s)] = self._fit_feature(
            s, centering, points_for_centering, use_vectorized
        )
        self.is_fitted[s] = True
        self.fit_args["feature_" + str(s)] = {
            "centering": centering,
            "points_for_centering": points_for_centering,
        }

`eval(feature, xs, heterogeneity=False, centering=False, return_all=False, use_vectorized=True)`

Evaluate the effect of the s-th feature at positions `xs`.

Parameters:

Name	Type	Description	Default
`feature`	`int`	index of feature of interest	required
`xs`	`ndarray`	the points along the s-th axis at which to evaluate the feature effect; an `np.ndarray` of shape `(T,)`	required
`heterogeneity`	`bool`	whether to return the heterogeneity measures. If `heterogeneity=False`, the function returns the mean effect at the given `xs`. If `heterogeneity=True`, the function returns `(y, std)`, where `y` is the mean effect and `std` is the spread of the ICE curves around it; the source code shown for this method computes it with `np.var`, i.e. as a variance rather than a standard deviation.	`False`
`centering`	`Union[bool, str]`	whether to center the PDP. If `centering` is `False`, the PDP is not centered. If `centering` is `True` or `zero_integral`, the PDP is centered so that it averages to zero along the feature axis. If `centering` is `zero_start`, the PDP starts from `y=0`.	`False`
`return_all`	`bool`	whether to return PDP and ICE plots evaluated at `xs`. If `return_all=False`, the function returns the mean effect at the given `xs`. If `return_all=True`, the function returns an `ndarray` of shape `(T, N)` with the `N` ICE plots evaluated at `xs`.	`False`
`use_vectorized`	`bool`	whether to use the vectorized version of the computation	`True`

Returns:

Type	Description
`Union[ndarray, Tuple[ndarray, ndarray]]`	the mean effect `y`, if `heterogeneity=False` (default) or a tuple `(y, std)` otherwise

Source code in effector/global_effect_pdp.py

def eval(
    self,
    feature: int,
    xs: np.ndarray,
    heterogeneity: bool = False,
    centering: typing.Union[bool, str] = False,
    return_all: bool = False,
    use_vectorized: bool = True,
) -> typing.Union[np.ndarray, typing.Tuple[np.ndarray, np.ndarray]]:
    """Evaluate the effect of the s-th feature at positions `xs`.

    Args:
        feature: index of feature of interest
        xs: the points along the s-th axis to evaluate the FE plot

          - `np.ndarray` of shape `(T, )`

        heterogeneity: whether to return the heterogeneity measures.

              - if `heterogeneity=False`, the function returns the mean effect at the given `xs`
              - If `heterogeneity=True`, the function returns `(y, std)` where `y` is the mean effect and `std` is the standard deviation of the mean effect

        centering: whether to center the PDP

            - If `centering` is `False`, the PDP is not centered
            - If `centering` is `True` or `zero_integral`, the PDP is centered around the `y` axis.
            - If `centering` is `zero_start`, the PDP starts from `y=0`.

        return_all: whether to return PDP and ICE plots evaluated at `xs`

            - If `return_all=False`, the function returns the mean effect at the given `xs`
            - If `return_all=True`, the function returns a `ndarray` of shape `(T, N)` with the `N` ICE plots evaluated at `xs`

        use_vectorized: whether to use the vectorized version of the computation

    Returns:
        the mean effect `y`, if `heterogeneity=False` (default) or a tuple `(y, std)` otherwise

    """
    centering = helpers.prep_centering(centering)

    if self.requires_refit(feature, centering):
        self.fit(
            features=feature, centering=centering, use_vectorized=use_vectorized
        )

    # Check if the lower bound is less than the upper bound
    assert self.axis_limits[0, feature] < self.axis_limits[1, feature]

    # new implementation
    y_ice = self._predict(self.data, xs, feature, use_vectorized)
    if centering:
        norm_consts = np.expand_dims(
            self.feature_effect["feature_" + str(feature)]["norm_const"], axis=0
        )
        y_ice = y_ice - norm_consts

    y_mean = np.mean(y_ice, axis=1)

    if return_all:
        return y_ice

    if heterogeneity:
        y_var = np.var(y_ice, axis=1)
        return y_mean, y_var
    else:
        return y_mean

`plot(feature, heterogeneity='ice', centering=False, nof_points=30, scale_x=None, scale_y=None, nof_ice=100, show_avg_output=False, dy_limits=None, use_vectorized=True, show_plot=True)`

Plot the feature effect.

Parameters:

Name	Type	Description	Default
`feature`	`int`	the feature to plot	required
`heterogeneity`	`Union[bool, str]`	whether to plot the heterogeneity. `False`: plot only the mean effect; `True` or `std`: plot the standard deviation of the ICE curves; `ice`: also plot the ICE curves.	`'ice'`
`centering`	`Union[bool, str]`	whether to center the plot. `False` means no centering; `True` or `zero_integral` centers the plot so that it averages to zero along the feature axis; `zero_start` starts the plot from `y=0`.	`False`
`nof_points`	`int`	the grid size for the PDP plot	`30`
`scale_x`	`Optional[dict]`	`None` or a `dict` with keys `['std', 'mean']`. If set to `None`, no scaling will be applied. If set to a `dict`, the x-axis will be scaled as `x = (x + mean) * std`.	`None`
`scale_y`	`Optional[dict]`	`None` or a `dict` with keys `['std', 'mean']`. If set to `None`, no scaling will be applied. If set to a `dict`, the y-axis will be scaled as `y = (y + mean) * std`.	`None`
`nof_ice`	`Union[int, str]`	number of ICE plots to show on top of the d-PDP curve	`100`
`show_avg_output`	`bool`	whether to show the average output of the model	`False`
`dy_limits`	`Optional[List]`	`None` or a tuple with the limits of the y-axis for the derivative PDP. If set to `None`, the limits of the y-axis are set automatically. If set to a tuple, the limits are set manually.	`None`
`use_vectorized`	`bool`	whether to use the vectorized version of the PDP computation	`True`
`show_plot`	`bool`	whether to show the plot	`True`

Source code in effector/global_effect_pdp.py

def plot(
    self,
    feature: int,
    heterogeneity: Union[bool, str] = "ice",
    centering: Union[bool, str] = False,
    nof_points: int = 30,
    scale_x: Optional[dict] = None,
    scale_y: Optional[dict] = None,
    nof_ice: Union[int, str] = 100,
    show_avg_output: bool = False,
    dy_limits: Optional[List] = None,
    use_vectorized: bool = True,
    show_plot: bool = True,
):
    """
    Plot the feature effect.

    Parameters:
        feature: the feature to plot
        heterogeneity: whether to plot the heterogeneity

              - `False`, plot only the mean effect
              - `True` or `std`, plot the standard deviation of the ICE curves
              - `ice`, also plot the ICE curves

        centering: whether to center the plot

            - `False` means no centering
            - `True` or `zero_integral` centers around the `y` axis.
            - `zero_start` starts the plot from `y=0`.

        nof_points: the grid size for the PDP plot

        scale_x: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the x-axis will be scaled `x = (x + mean) * std`

        scale_y: None or Dict with keys ['std', 'mean']

            - If set to None, no scaling will be applied.
            - If set to a dict, the y-axis will be scaled `y = (y + mean) * std`

        nof_ice: number of ICE plots to show on top of the d-PDP curve
        show_avg_output: whether to show the average output of the model

        dy_limits: None or tuple, the limits of the y-axis for the derivative PDP

            - If set to None, the limits of the y-axis are set automatically
            - If set to a tuple, the limits are manually set

        use_vectorized: whether to use the vectorized version of the PDP computation
        show_plot: whether to show the plot
    """
    ret = self._plot(
        feature,
        heterogeneity,
        centering,
        nof_points,
        scale_x,
        scale_y,
        nof_ice,
        show_avg_output,
        dy_limits,
        use_vectorized,
        show_plot,
    )

    if not show_plot:
        fig, ax = ret
        return fig, ax

`effector.global_effect_shap.ShapDP(data, model, axis_limits=None, nof_instances=1000, feature_names=None, target_name=None, shap_values=None, backend='shap')`

Bases: GlobalEffectBase

Constructor of the ShapDP class.

Definition

The value of a coalition $S$ of features is estimated as: $$ \hat{v}(S) = {1 \over N} \sum_{i=1}^N [f(\mathbf{x}_S \cup \mathbf{x}_C^i) - f(\mathbf{x}^i) ] $$

$\hat{v}(S)$ quantifies the contribution when the features in $S$ are set to $\mathbf{x}_S$. For every instance $i$, we compute two outputs:

- $f(\mathbf{x}_S \cup \mathbf{x}_C^i)$: the output of the model when the features in $S$ are set to $\mathbf{x}_S$ and the rest of the features are left as they are
- $f(\mathbf{x}^i)$: the output of the model when the instance is left as is

The average difference (over all instances) between these two outputs is the value of the coalition $S$.

The contribution of a feature $j$ added to a coalition $S$ is estimated as: $$ \hat{\Delta}_{S, j} = \hat{v}(S \cup \{j\}) - \hat{v}(S) $$
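A direct NumPy reading of the coalition value and of the contribution of a feature is sketched below. The helper `coalition_value`, the additive toy model, and the instance `x` are assumptions for illustration; the `shap` and `shapiq` packages use their own estimators.

```python
import numpy as np

def coalition_value(model, data, x, S):
    """Estimate v(S): set the features in S to x[S], keep the others from each instance."""
    x_mod = data.copy()
    if S:
        x_mod[:, list(S)] = x[list(S)]
    return np.mean(model(x_mod) - model(data))

def model(z):
    """Additive toy model (hypothetical): f(z) = 2 * z_0 + z_1."""
    return 2 * z[:, 0] + z[:, 1]

rng = np.random.default_rng(0)
data = rng.normal(size=(300, 2))
x = np.array([1.0, -2.0])                  # the instance being explained

v_empty = coalition_value(model, data, x, ())
v_0 = coalition_value(model, data, x, (0,))
delta_empty_0 = v_0 - v_empty              # contribution of feature 0 joining the empty coalition
```

The empty coalition has value zero by construction, and for this additive model the contribution of feature 0 is `2 * (x[0] - mean(data[:, 0]))` regardless of which coalition it joins.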

The SHAP value of a feature $j$ with value $x_j$ is the average contribution of feature $j$ across all possible coalitions with a weight $w_{S, j}$:

\[ \hat{\phi}_j(x_j) = {1 \over N} \sum_{S \subseteq \{1, \dots, D\} \setminus \{j\}} w_{S, j} \hat{\Delta}_{S, j} \]

where $w_{S, j}$ ensures that the contribution of feature $j$ is the same for all coalitions of the same size. For example, there are $D-1$ ways for $x_j$ to enter a coalition of $|S| = 1$ feature, so $w_{S, j} = {1 \over D (D-1)}$ for each of them. In contrast, there is only one way for $x_j$ to enter a coalition of $|S|=0$ (to be the first specified feature), so $w_{S, j} = {1 \over D}$.
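The two weights quoted above are instances of the standard Shapley weight $|S|!\,(D-|S|-1)!/D!$. The following sketch checks this with exact fractions for a hypothetical $D = 4$; the function name `shapley_weight` is an assumption for the example.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def shapley_weight(size, D):
    """Weight of a coalition with `size` features when there are D features in total."""
    return Fraction(factorial(size) * factorial(D - size - 1), factorial(D))

D = 4
others = range(D - 1)                       # the D-1 features other than j
weights = [
    shapley_weight(len(S), D)
    for k in range(D)                       # coalition sizes 0, ..., D-1
    for S in combinations(others, k)        # every coalition of that size
]
total = sum(weights)
```

Summed over all coalitions that exclude feature $j$, the weights add up to one, so the SHAP value is a weighted average of the contributions.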

The SHAP Dependence Plot (SHAP-DP) is a spline $\hat{f}^{SDP}_j(x_j)$ fit to the dataset $\{(x_j^i, \hat{\phi}_j(x_j^i))\}_{i=1}^N$ using the UnivariateSpline function from scipy.interpolate.
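The spline-fitting step can be sketched with `scipy.interpolate.UnivariateSpline` on synthetic data. The values in `phi_j` are a stand-in for SHAP values (they are not computed by `shap` or `shapiq`), and the default smoothing settings used here are an assumption; the library may configure the spline differently.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x_j = rng.uniform(-1, 1, size=200)        # feature values x_j^i
phi_j = x_j ** 2                          # stand-in for the SHAP values of feature j
phi_j = phi_j - phi_j.mean()              # subtract the average value (centering)

order = np.argsort(x_j)                   # UnivariateSpline expects increasing x
spline = UnivariateSpline(x_j[order], phi_j[order])

xs = np.linspace(-0.9, 0.9, 5)
effect = spline(xs)                       # SHAP-DP evaluated at new positions
```

Once fitted, the spline can be evaluated at any position inside the range of the feature, which is what allows an effect curve to be evaluated on an arbitrary grid `xs`.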

Notes

The required parameters are `data` and `model`. The rest are optional.
SHAP values are computed using either the `shap` package (`backend="shap"`) or the `shapiq` package (`backend="shapiq"`).
SHAP values are centered by default, i.e., the average SHAP value is subtracted from the SHAP values.
More details on the SHAP values can be found in the original paper and in the book Interpreting Machine Learning Models with SHAP.

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	the design matrix shape: `(N,D)`	required
`model`	`Callable`	the black-box model. Must be a `Callable` with: input: `ndarray` of shape `(N, D)` output: `ndarray` of shape `(N,)`	required
`axis_limits`	`Optional[ndarray]`	The limits of the feature effect plot along each axis. Use an `ndarray` of shape `(2, D)` to specify them manually; use `None` to infer them from the data.	`None`
`nof_instances`	`Union[int, str]`	maximum number of instances to be used for SHAP estimation. Use `"all"` to use all instances; use an `int` to use `nof_instances` instances.	`1000`
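`avg_output`		The average output of the model. Use a `float` to specify it manually; use `None` to infer it as `np.mean(model(data))`. This argument is documented in the docstring but does not appear in the `__init__` signature shown in the source below.	not in signature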
`feature_names`	`Optional[List[str]]`	The names of the features. Use a `list` of `str` to specify the names manually, for example `["age", "weight", ...]`; use `None` to keep the default names `["x_0", "x_1", ...]`.	`None`
`target_name`	`Optional[str]`	The name of the target variable. Use a `str` to specify the name manually, for example `"price"`; use `None` to keep the default name `"y"`.	`None`
`shap_values`	`Optional[ndarray]`	The SHAP values of the model. If the SHAP values are already computed, they can be passed here; if `None`, they will be computed using the package selected by `backend`.	`None`
`backend`	`str`	Package to compute SHAP values. Use `"shap"` for the `shap` package (default); use `"shapiq"` for the `shapiq` package.	`'shap'`

Notes

SHAP values are expensive to compute. To speed up the computation consider using a subset of the dataset. The nof_instances parameter controls the number of instances used for computing the SHAP values. The default value is 1_000 instances, which is a good trade-off between speed and accuracy.
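
A minimal sketch of such subsampling, assuming rows are drawn uniformly without replacement. The helper `subsample_indices` and its `seed` argument are hypothetical and are not effector's internal routine.

```python
import random
from typing import List, Union


def subsample_indices(
    n_rows: int, nof_instances: Union[int, str], seed: int = 0
) -> List[int]:
    """Pick at most `nof_instances` row indices; "all" keeps every row."""
    if nof_instances == "all" or nof_instances >= n_rows:
        return list(range(n_rows))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return sorted(rng.sample(range(n_rows), nof_instances))


idx_small = subsample_indices(5_000, 1_000)  # 1_000 of 5_000 rows
idx_all = subsample_indices(500, "all")      # every row is kept
idx_capped = subsample_indices(10, 1_000)    # fewer rows than requested
```

When the dataset has fewer rows than `nof_instances`, the sketch simply keeps all of them, which matches the description of `nof_instances` as a maximum.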

Methods:

Name	Description
`fit`	Fit the SHAP Dependence Plot to the data.
`eval`	Evaluate the effect of the s-th feature at positions `xs`.
`plot`	Plot the SHAP Dependence Plot (SDP) of the s-th feature.

Source code in effector/global_effect_shap.py

def __init__(
    self,
    data: np.ndarray,
    model: Callable,
    axis_limits: Optional[np.ndarray] = None,
    nof_instances: Union[int, str] = 1_000,
    feature_names: Optional[List[str]] = None,
    target_name: Optional[str] = None,
    shap_values: Optional[np.ndarray] = None,
    backend: str = "shap",
):
    """
    Constructor of the ShapDP class.

    ??? note "Definition"

        The value of a coalition of $S$ features is estimated as:
        $$
        \hat{v}(S) = {1 \over N} \sum_{i=1}^N [f(\mathbf{x}_S \cup \mathbf{x}_C^i) - f(\mathbf{x}^i) ]
        $$
        $\hat{v}(S)$ quantifies the contribution when the features in $S$ are set to $\mathbf{x}_S$.
        For all instances, we compute two outputs:

          - $f(\mathbf{x}_S \cup \mathbf{x}_C^i)$ is the output of the model when the features in $S$ are set to $\mathbf{x}_S$ and the rest of the features are left as they are
          - $f(\mathbf{x}^i)$ is the output of the model when the instance is left as is
        The average difference (over all instances) between these two outputs is the value of the coalition $S$.

        The contribution of a feature $j$ added to a coalition $S$ is estimated as:
        $$
        \hat{\Delta}_{S, j} = \hat{v}(S \cup \{j\}) - \hat{v}(S)
        $$

        The SHAP value of a feature $j$ with value $x_j$ is the average contribution of feature $j$ across all possible coalitions with a weight $w_{S, j}$:

        $$
        \hat{\phi}_j(x_j) = {1 \over N} \sum_{S \subseteq \{1, \dots, D\} \setminus \{j\}} w_{S, j} \hat{\Delta}_{S, j}
        $$

        where $w_{S, j}$ ensures that the contribution of feature $j$ is the same for all coalitions of the same size. For example, there are $D-1$ ways for $x_j$ to enter a coalition of $|S| = 1$ feature, so $w_{S, j} = {1 \over D (D-1)}$ for each of them. In contrast, there is only one way for $x_j$ to enter a coalition of $|S|=0$ (to be the first specified feature), so $w_{S, j} = {1 \over D}$.

        The SHAP Dependence Plot (SHAP-DP) is a spline $\hat{f}^{SDP}_j(x_j)$ fit to the dataset $\{(x_j^i, \hat{\phi}_j(x_j^i))\}_{i=1}^N$ using the `UnivariateSpline` function from `scipy.interpolate`.

    ??? note "Notes"

        * The required parameters are `data` and `model`. The rest are optional.
        * SHAP values are computed using either the [`shap`](https://shap.readthedocs.io/en/latest/) package (`backend="shap"`) or the [`shapiq`](https://shapiq.readthedocs.io/en/latest/) package (`backend="shapiq"`).
        * SHAP values are centered by default, i.e., the average SHAP value is subtracted from the SHAP values.
        * More details on the SHAP values can be found in the [original paper](https://arxiv.org/abs/1705.07874) and in the book [Interpreting Machine Learning Models with SHAP](https://christophmolnar.com/books/shap/)

    Args:
        data: the design matrix

            - shape: `(N,D)`
        model: the black-box model. Must be a `Callable` with:

            - input: `ndarray` of shape `(N, D)`
            - output: `ndarray` of shape `(N,)`

        axis_limits: The limits of the feature effect plot along each axis

            - use a `ndarray` of shape `(2, D)`, to specify them manually
            - use `None`, to be inferred from the data

        nof_instances: maximum number of instances to be used for SHAP estimation.

            - use `"all"`, for using all instances.
            - use an `int`, for using `nof_instances` instances.

        avg_output: The average output of the model.

            - use a `float`, to specify it manually
            - use `None`, to be inferred as `np.mean(model(data))`

        feature_names: The names of the features

            - use a `list` of `str`, to specify the name manually. For example: `["age", "weight", ...]`
            - use `None`, to keep the default names: `["x_0", "x_1", ...]`

        target_name: The name of the target variable

            - use a `str`, to specify the name manually. For example: `"price"`
            - use `None`, to keep the default name: `"y"`

        shap_values: The SHAP values of the model

            - if shap values are already computed, they can be passed here
            - if `None`, the SHAP values will be computed using the `shap` package

        backend: Package to compute SHAP values

            - use `"shap"` for the `shap` package (default)
            - use `"shapiq"` for the `shapiq` package

    Notes:
        SHAP values are expensive to compute.
        To speed up the computation consider using a subset of the dataset.
        The `nof_instances` parameter controls the number of instances used for computing the SHAP values.
        The default value is `1_000` instances, which is a good trade-off between speed and accuracy.
    """
    self.shap_values = shap_values
    assert backend in ["shap", "shapiq"]
    self.backend = backend
    super(ShapDP, self).__init__(
        "SHAP DP",
        data,
        model,
        None,
        None,
        nof_instances,
        axis_limits,
        feature_names,
        target_name,
    )

`fit(features='all', centering=True, points_for_centering=30, binning_method='greedy', budget=512, shap_explainer_kwargs=None, shap_explanation_kwargs=None)`

Fit the SHAP Dependence Plot to the data.

Notes

The SHAP Dependence Plot (SDP) $\hat{f}^{SDP}_j(x_j)$ is a spline fit to the dataset $\{(x_j^i, \hat{\phi}_j(x_j^i))\}_{i=1}^N$ using the UnivariateSpline function from scipy.interpolate.

The SHAP standard deviation, $\hat{\sigma}^{SDP}_j(x_j)$, is a spline fit to the absolute value of the residuals, i.e., to the dataset $\{(x_j^i, |\hat{\phi}_j(x_j^i) - \hat{f}^{SDP}_j(x_j^i)|)\}_{i=1}^N$, using the UnivariateSpline function from scipy.interpolate.
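
The two spline fits can be sketched as follows. The data are synthetic stand-ins for one feature and its SHAP values, and the default smoothing settings of `UnivariateSpline` are an assumption; effector may configure the splines differently.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)

# Synthetic stand-ins for x_j and phi_j (not real SHAP output).
x_j = np.sort(rng.uniform(-1.0, 1.0, size=200))  # x must be increasing
phi_j = 2.0 * x_j + rng.normal(scale=0.1, size=200)

# Mean curve: spline fit to the pairs (x_j, phi_j).
spline_mean = UnivariateSpline(x_j, phi_j)

# Heterogeneity curve: spline fit to the absolute residuals.
abs_residuals = np.abs(phi_j - spline_mean(x_j))
spline_std = UnivariateSpline(x_j, abs_residuals)

# Evaluate both curves on a grid, as `.eval()` does for arbitrary `xs`.
grid = np.linspace(-1.0, 1.0, 30)
mean_on_grid = spline_mean(grid)
std_on_grid = spline_std(grid)
```

Both fitted objects are callables, which is why the fitted effect can later be evaluated at any set of points along the feature axis.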

Parameters:

Name	Type	Description	Default
`features`	`Union[int, str, List]`	the features to fit. If set to "all", all the features will be fitted.	`'all'`
`centering`	`Union[bool, str]`	If set to False, no centering will be applied. If set to "zero_integral" or True, the integral of the feature effect will be set to zero. If set to "zero_mean", the mean of the feature effect will be set to zero.	`True`
`points_for_centering`	`Union[int, str]`	number of linspaced points along the feature axis used for centering. If set to `"all"`, all the dataset points will be used.	`30`
`binning_method`	`Union[str, Greedy, Fixed]`	the binning method to be used for fitting a piecewise linear function to the SHAP values. If set to "greedy", the greedy binning method will be used. If set to "fixed", the fixed binning method will be used.	`'greedy'`
`budget`	`int`	Budget to use for the approximation. Increasing the budget improves the approximation at the cost of slower computation; decreasing it speeds up the computation at the cost of approximation error.	`512`
`shap_explainer_kwargs`	`Optional[dict]`	the keyword arguments to be passed to the `shap.Explainer` or `shapiq.Explainer` class, depending on the backend. Defaults: `{"masker": data}` for `shap`; `{"data": data, "index": "SV", "max_order": 1, "approximator": "permutation", "imputer": "marginal"}` for `shapiq`. User-supplied keys override these defaults. The full code that runs behind the scenes is listed in the `fit` docstring in the source below; check it, together with the official documentation of the `shap` and `shapiq` packages, before customizing.	`None`
`shap_explanation_kwargs`	`Optional[dict]`	the keyword arguments to be passed when computing the SHAP values: to the explainer call for `shap`, or to `explain_X` for `shapiq`. Defaults: `{"max_evals": budget}` for `shap`; `{"budget": budget}` for `shapiq`. User-supplied keys override these defaults. The full code that runs behind the scenes is listed in the `fit` docstring in the source below; check it, together with the official documentation of the `shap` and `shapiq` packages, before customizing.	`None`
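
The rule that user-supplied keyword arguments override the defaults can be sketched in isolation. The helper `merge_kwargs` and the key `"extra_option"` are hypothetical and only illustrate the dictionary merge; they are not part of `shap`, `shapiq`, or effector.

```python
from typing import Optional


def merge_kwargs(defaults: dict, user_kwargs: Optional[dict]) -> dict:
    """Return the defaults updated with user-supplied keys (user values win)."""
    user_kwargs = user_kwargs.copy() if user_kwargs else {}
    return {**defaults, **user_kwargs}


budget = 512
shap_defaults = {"max_evals": budget}

# No custom arguments: the defaults are used as they are.
merged_default = merge_kwargs(shap_defaults, None)

# A custom value for an existing key replaces the default;
# "extra_option" is a made-up key that is simply passed through.
merged_custom = merge_kwargs(
    shap_defaults, {"max_evals": 2048, "extra_option": True}
)
```

Because a custom `max_evals` (or `budget` for `shapiq`) replaces the default, passing it through the kwargs takes precedence over the `budget` argument of `.fit()`.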

Source code in effector/global_effect_shap.py

def fit(
    self,
    features: Union[int, str, List] = "all",
    centering: Union[bool, str] = True,
    points_for_centering: Union[int, str] = 30,
    binning_method: Union[str, ap.Greedy, ap.Fixed] = "greedy",
    budget: int = 512,
    shap_explainer_kwargs: Optional[dict] = None,
    shap_explanation_kwargs: Optional[dict] = None,
) -> None:
    """Fit the SHAP Dependence Plot to the data.

    Notes:
        The SHAP Dependence Plot (SDP) $\hat{f}^{SDP}_j(x_j)$ is a spline fit to
        the dataset $\{(x_j^i, \hat{\phi}_j(x_j^i))\}_{i=1}^N$
        using the `UnivariateSpline` function from `scipy.interpolate`.

        The SHAP standard deviation, $\hat{\sigma}^{SDP}_j(x_j)$, is a spline fit
        to the absolute value of the residuals, i.e., to the dataset
        $\{(x_j^i, |\hat{\phi}_j(x_j^i) - \hat{f}^{SDP}_j(x_j^i)|)\}_{i=1}^N$,
        using the `UnivariateSpline` function from `scipy.interpolate`.

    Args:
        features: the features to fit.
            - If set to "all", all the features will be fitted.
        centering: whether (and how) to center the feature effect.
            - If set to False, no centering will be applied.
            - If set to "zero_integral" or True, the integral of the feature effect will be set to zero.
            - If set to "zero_mean", the mean of the feature effect will be set to zero.

        points_for_centering: number of linspaced points along the feature axis used for centering.

            - If set to `all`, all the dataset points will be used.


        binning_method: the binning method to be used for fitting a piecewise linear function to the SHAP values.

            - If set to "greedy", the greedy binning method will be used.
            - If set to "fixed", the fixed binning method will be used.

        budget: Budget to use for the approximation. Defaults to 512.
            - Increasing the budget improves the approximation at the cost of slower computation.
            - Decrease the budget for faster computation at the cost of approximation error.

        shap_explainer_kwargs: the keyword arguments to be passed to the `shap.Explainer` or `shapiq.Explainer` class, depending on the backend.

            ??? note "Code behind the scene"
                Check the code that is running behind the scene before customizing `shap_explainer_kwargs`.

                ```python
                explainer_kwargs = explainer_kwargs.copy() if explainer_kwargs else {}
                explanation_kwargs = explanation_kwargs.copy() if explanation_kwargs else {}
                if self.backend == "shap":
                    explainer_defaults = {"masker": data}
                    explanation_defaults = {"max_evals": budget}
                elif self.backend == "shapiq":
                    explainer_defaults = {
                        "data": data,
                        "index": "SV",
                        "max_order": 1,
                        "approximator": "permutation",
                        "imputer": "marginal",
                    }
                    explanation_defaults = {"budget": budget}
                else:
                    raise ValueError("`backend` should be either 'shap' or 'shapiq'")
                explainer_kwargs = {**explainer_defaults, **explainer_kwargs}  # User args override defaults
                explanation_kwargs = {**explanation_defaults, **explanation_kwargs}  # User args override defaults

                if self.backend == "shap":
                    explainer = shap.Explainer(model, **explainer_kwargs)
                    explanation = explainer(data, **explanation_kwargs)
                    self.shap_values = explanation.values
                elif self.backend == "shapiq":
                    explainer = shapiq.Explainer(model, **explainer_kwargs)
                    explanations = explainer.explain_X(data, **explanation_kwargs)
                    self.shap_values = np.stack([ex.get_n_order_values(1) for ex in explanations])
                else:
                    raise ValueError("`backend` should be either 'shap' or 'shapiq'")
                ```

            ??? warning "Be careful with custom arguments"

                For customizing `shap_explainer_kwargs` and `shap_explanation_kwargs` args,
                check the official documentation of [`shap`](https://shap.readthedocs.io/en/latest/) and [`shapiq`](https://shapiq.readthedocs.io/en/latest/) packages.

        shap_explanation_kwargs: the keyword arguments to be passed to the `shap` or `shapiq` Explainer to compute the SHAP values.

            ??? note "Code behind the scene"

                Check the code that is running behind the scene before customizing `shap_explanation_kwargs`.

                ```python
                explainer_kwargs = explainer_kwargs.copy() if explainer_kwargs else {}
                explanation_kwargs = explanation_kwargs.copy() if explanation_kwargs else {}
                if self.backend == "shap":
                    explainer_defaults = {"masker": data}
                    explanation_defaults = {"max_evals": budget}
                elif self.backend == "shapiq":
                    explainer_defaults = {
                        "data": data,
                        "index": "SV",
                        "max_order": 1,
                        "approximator": "permutation",
                        "imputer": "marginal",
                    }
                    explanation_defaults = {"budget": budget}
                else:
                    raise ValueError("`backend` should be either 'shap' or 'shapiq'")
                explainer_kwargs = {**explainer_defaults, **explainer_kwargs}  # User args override defaults
                explanation_kwargs = {**explanation_defaults, **explanation_kwargs}  # User args override defaults

                if self.backend == "shap":
                    explainer = shap.Explainer(model, **explainer_kwargs)
                    explanation = explainer(data, **explanation_kwargs)
                    self.shap_values = explanation.values
                elif self.backend == "shapiq":
                    explainer = shapiq.Explainer(model, **explainer_kwargs)
                    explanations = explainer.explain_X(data, **explanation_kwargs)
                    self.shap_values = np.stack([ex.get_n_order_values(1) for ex in explanations])
                else:
                    raise ValueError("`backend` should be either 'shap' or 'shapiq'")
                ```

            ??? warning "Be careful with custom arguments"

                For customizing `shap_explainer_kwargs` and `shap_explanation_kwargs` args,
                check the official documentation of [`shap`](https://shap.readthedocs.io/en/latest/) and [`shapiq`](https://shapiq.readthedocs.io/en/latest/) packages.

    """
    centering = helpers.prep_centering(centering)
    features = helpers.prep_features(features, self.dim)

    # new implementation
    for s in features:
        self.feature_effect["feature_" + str(s)] = self._fit_feature(
            s,
            binning_method,
            centering,
            points_for_centering,
            budget,
            shap_explainer_kwargs,
            shap_explanation_kwargs
        )
        self.is_fitted[s] = True
        self.fit_args["feature_" + str(s)] = {
            "centering": centering,
            "points_for_centering": points_for_centering,
        }

`eval(feature, xs, heterogeneity=True, centering=True)`

Evaluate the effect of the s-th feature at positions xs.

Parameters:

Name	Type	Description	Default
`feature`	`int`	index of feature of interest	required
`xs`	`ndarray`	the points along the s-th axis to evaluate the FE plot `np.ndarray` of shape `(T,)`	required
`heterogeneity`	`bool`	whether to return the heterogeneity measures. If `heterogeneity=False`, the function returns the mean effect at the given `xs`; if `heterogeneity=True`, the function returns `(y, std)`, where `y` is the mean effect and `std` is the standard deviation of the mean effect.	`True`
`centering`	`Union[bool, str]`	whether to center the plot. If `centering` is `False`, the SHAP curve is not centered; if `centering` is `True` or `zero_integral`, the SHAP curve is centered around the `y` axis; if `centering` is `zero_start`, the SHAP curve starts from `y=0`.	`True`

Returns:

Type	Description
`Union[ndarray, Tuple[ndarray, ndarray]]`	the mean effect `y` if `heterogeneity=False`, or a tuple `(y, std)` if `heterogeneity=True` (default)
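
The return convention can be sketched with plain Python callables standing in for the fitted splines. The function `eval_effect` and the toy curves are hypothetical; the dictionary keys `spline_mean`, `spline_std` and `norm_const` follow the source listing of `eval` below.

```python
from typing import Any, Dict, List, Tuple, Union


def eval_effect(
    fitted: Dict[str, Any],
    xs: List[float],
    heterogeneity: bool = True,
    centering: bool = True,
) -> Union[List[float], Tuple[List[float], List[float]]]:
    """Evaluate the mean curve, optionally centered, with or without the std curve."""
    y = fitted["spline_mean"](xs)
    if centering is not False:
        y = [v - fitted["norm_const"] for v in y]  # shift by the centering constant
    if heterogeneity:
        return y, fitted["spline_std"](xs)
    return y


# Hypothetical fitted curves standing in for the splines stored by `fit`.
fitted = {
    "spline_mean": lambda xs: [2.0 * x + 1.0 for x in xs],
    "spline_std": lambda xs: [0.5 for _ in xs],
    "norm_const": 1.0,
}

y, std = eval_effect(fitted, [0.0, 1.0, 2.0])  # default: tuple (y, std)
y_only = eval_effect(fitted, [0.0, 1.0, 2.0], heterogeneity=False, centering=False)
```

With the default `heterogeneity=True`, the call must be unpacked into two values; with `heterogeneity=False`, a single sequence is returned.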

Source code in effector/global_effect_shap.py

def eval(
    self,
    feature: int,
    xs: np.ndarray,
    heterogeneity: bool = True,
    centering: typing.Union[bool, str] = True,
) -> typing.Union[np.ndarray, typing.Tuple[np.ndarray, np.ndarray]]:
    """Evaluate the effect of the s-th feature at positions `xs`.

    Args:
        feature: index of feature of interest
        xs: the points along the s-th axis to evaluate the FE plot

          - `np.ndarray` of shape `(T,)`
        heterogeneity: whether to return the heterogeneity measures.

              - if `heterogeneity=False`, the function returns the mean effect at the given `xs`
              - If `heterogeneity=True`, the function returns `(y, std)` where `y` is the mean effect and `std` is the standard deviation of the mean effect

        centering: whether to center the plot

            - If `centering` is `False`, the SHAP curve is not centered
            - If `centering` is `True` or `zero_integral`, the SHAP curve is centered around the `y` axis.
            - If `centering` is `zero_start`, the SHAP curve starts from `y=0`.

    Returns:
        the mean effect `y` if `heterogeneity=False`, or a tuple `(y, std)` if `heterogeneity=True` (default)
    """
    centering = helpers.prep_centering(centering)

    if self.requires_refit(feature, centering):
        self.fit(features=feature, centering=centering)

    # Check if the lower bound is less than the upper bound
    assert self.axis_limits[0, feature] < self.axis_limits[1, feature]

    yy = self.feature_effect["feature_" + str(feature)]["spline_mean"](xs)

    if centering is not False:
        norm_const = self.feature_effect["feature_" + str(feature)]["norm_const"]
        yy = yy - norm_const

    if heterogeneity:
        yy_std = self.feature_effect["feature_" + str(feature)]["spline_std"](xs)
        return yy, yy_std
    else:
        return yy

`plot(feature, heterogeneity='shap_values', centering=True, nof_points=30, scale_x=None, scale_y=None, nof_shap_values='all', show_avg_output=False, y_limits=None, only_shap_values=False, show_plot=True)`

Plot the SHAP Dependence Plot (SDP) of the s-th feature.

Parameters:

Name	Type	Description	Default
`feature`	`int`	index of the plotted feature	required
`heterogeneity`	`Union[bool, str]`	whether to output the heterogeneity of the SHAP values. If `heterogeneity` is `False`, no heterogeneity is plotted; if `heterogeneity` is `True` or `"std"`, the standard deviation of the SHAP values is plotted; if `heterogeneity` is `"shap_values"`, the SHAP values are scattered on top of the SHAP curve.	`'shap_values'`
`centering`	`Union[bool, str]`	whether to center the SDP. If `centering` is `False`, the SHAP curve is not centered; if `centering` is `True` or `zero_integral`, the SHAP curve is centered around the `y` axis; if `centering` is `zero_start`, the SHAP curve starts from `y=0`.	`True`
`nof_points`	`int`	number of points to evaluate the SDP plot	`30`
`scale_x`	`Optional[dict]`	dictionary with keys "mean" and "std" for scaling the x-axis	`None`
`scale_y`	`Optional[dict]`	dictionary with keys "mean" and "std" for scaling the y-axis	`None`
`nof_shap_values`	`Union[int, str]`	number of shap values to show on top of the SHAP curve	`'all'`
`show_avg_output`	`bool`	whether to show the average output of the model	`False`
`y_limits`	`Optional[List]`	limits of the y-axis	`None`
`only_shap_values`	`bool`	whether to plot only the shap values	`False`
`show_plot`	`bool`	whether to show the plot	`True`
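
The `scale_x` and `scale_y` dictionaries are typically useful when the model was trained on standardized data and the plot should show the original units. The sketch below assumes that the plot maps a standardized value `z` back as `z * std + mean`; this mapping is an assumption about the plotting code, and the feature values are made up.

```python
from statistics import mean, pstdev

# Hypothetical raw feature values (e.g. ages) before standardization.
raw_ages = [20.0, 30.0, 40.0, 50.0]

# The dictionary format expected by `scale_x` / `scale_y`.
scale_x = {"mean": mean(raw_ages), "std": pstdev(raw_ages)}

# Standardized values, as the model would see them during training.
standardized = [(a - scale_x["mean"]) / scale_x["std"] for a in raw_ages]

# Assumed inverse mapping applied to the axis when plotting.
restored = [z * scale_x["std"] + scale_x["mean"] for z in standardized]
```

The same dictionary that was used to standardize a feature can then be passed as `scale_x` so that the x-axis is labeled in the original units.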

Source code in effector/global_effect_shap.py

def plot(
    self,
    feature: int,
    heterogeneity: Union[bool, str] = "shap_values",
    centering: Union[bool, str] = True,
    nof_points: int = 30,
    scale_x: Optional[dict] = None,
    scale_y: Optional[dict] = None,
    nof_shap_values: Union[int, str] = "all",
    show_avg_output: bool = False,
    y_limits: Optional[List] = None,
    only_shap_values: bool = False,
    show_plot: bool = True,
) -> Union[Tuple, None]:
    """
    Plot the SHAP Dependence Plot (SDP) of the s-th feature.

    Args:
        feature: index of the plotted feature
        heterogeneity: whether to output the heterogeneity of the SHAP values

            - If `heterogeneity` is `False`, no heterogeneity is plotted
            - If `heterogeneity` is `True` or `"std"`, the standard deviation of the shap values is plotted
            - If `heterogeneity` is `"shap_values"`, the shap values are scattered on top of the SHAP curve

        centering: whether to center the SDP

            - If `centering` is `False`, the SHAP curve is not centered
            - If `centering` is `True` or `zero_integral`, the SHAP curve is centered around the `y` axis.
            - If `centering` is `zero_start`, the SHAP curve starts from `y=0`.

        nof_points: number of points to evaluate the SDP plot
        scale_x: dictionary with keys "mean" and "std" for scaling the x-axis
        scale_y: dictionary with keys "mean" and "std" for scaling the y-axis
        nof_shap_values: number of shap values to show on top of the SHAP curve
        show_avg_output: whether to show the average output of the model
        y_limits: limits of the y-axis
        only_shap_values: whether to plot only the shap values
        show_plot: whether to show the plot
    """
    heterogeneity = helpers.prep_confidence_interval(heterogeneity)

    x = np.linspace(
        self.axis_limits[0, feature], self.axis_limits[1, feature], nof_points
    )

    # get the SHAP curve
    y = self.eval(feature, x, heterogeneity=False, centering=centering)
    # the std curve is computed regardless of the value of `heterogeneity`
    y_std = np.sqrt(self.feature_effect["feature_" + str(feature)]["spline_std"](x))

    # get some SHAP values
    _, ind = helpers.prep_nof_instances(nof_shap_values, self.data.shape[0])
    yy = (
        self.feature_effect["feature_" + str(feature)]["yy"][ind]
        if heterogeneity == "shap_values"
        else None
    )
    if yy is not None and centering is not False:
        yy = yy - self.feature_effect["feature_" + str(feature)]["norm_const"]
    xx = (
        self.feature_effect["feature_" + str(feature)]["xx"][ind]
        if heterogeneity == "shap_values"
        else None
    )

    if show_avg_output:
        avg_output = helpers.prep_avg_output(
            self.data, self.model, self.avg_output, scale_y
        )
    else:
        avg_output = None

    ret = vis.plot_shap(
        x,
        y,
        xx,
        yy,
        y_std,
        feature,
        heterogeneity=heterogeneity,
        scale_x=scale_x,
        scale_y=scale_y,
        avg_output=avg_output,
        feature_names=self.feature_names,
        target_name=self.target_name,
        y_limits=y_limits,
        only_shap_values=only_shap_values,
        show_plot=show_plot,
    )

    return ret