How to create Custom Parameterized Expectations
This guide walks you through creating Parameterized Expectations quickly. This method is only available using the new Modular Expectations API in 0.13.
Prerequisites
A Parameterized Expectation is a capability unlocked by Modular Expectations. Now that Expectations are structured as classes, it is easy to inherit from these classes and build similar Expectations that are adapted to your own needs.
Steps
1. Select an Expectation to inherit from
For the purpose of this exercise, we will implement the Expectations expect_column_mean_to_be_positive and expect_column_values_to_be_two_letter_country_code: realistic Expectations of the data that can easily inherit from expect_column_mean_to_be_between and expect_column_values_to_be_in_set respectively.
2. Select default values for your class
Our first implementation will be expect_column_mean_to_be_positive.
As can be seen in the implementation below, we have chosen to keep our default minimum value at 0, given that we are validating that the column mean is positive. An upper bound of None (here simply left unset) means that no upper bound will be checked, effectively setting the threshold at ∞ and allowing any positive mean.
Notice that we do not need to set default_kwarg_values for all kwargs: it is sufficient to set them only for the ones that should have a default value. To keep our implementation simple, we do not override metric_dependencies or success_keys.
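from great_expectations.expectations.core.expect_column_mean_to_be_between import (
    ExpectColumnMeanToBeBetween,
)

class ExpectColumnMeanToBePositive(ExpectColumnMeanToBeBetween):
    """Expects the mean of values in this column to be positive"""

    # Only the kwargs that need a default are listed here.
    default_kwarg_values = {
        "min_value": 0,
        "strict_min": True,
    }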
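To see what these defaults mean in practice, here is a small library-free sketch of the bound check they imply. The helper mean_within_bounds is hypothetical and is not part of Great Expectations; it only mirrors the semantics described above, assuming that None disables a bound and that strict_min=True excludes the bound itself.

```python
from statistics import mean
from typing import Optional, Sequence


def mean_within_bounds(
    values: Sequence[float],
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
    strict_min: bool = False,
    strict_max: bool = False,
) -> bool:
    """Hypothetical stand-in for the parent Expectation's bound check."""
    column_mean = mean(values)
    if min_value is not None:
        # A strict bound excludes the bound itself.
        if column_mean < min_value or (strict_min and column_mean == min_value):
            return False
    if max_value is not None:
        if column_mean > max_value or (strict_max and column_mean == max_value):
            return False
    return True


# The defaults chosen for expect_column_mean_to_be_positive.
POSITIVE_DEFAULTS = {"min_value": 0, "strict_min": True}

positive_ok = mean_within_bounds([1, 2, 3], **POSITIVE_DEFAULTS)    # mean is 2
zero_mean_ok = mean_within_bounds([-1, 0, 1], **POSITIVE_DEFAULTS)  # mean is 0
print(positive_ok, zero_mean_ok)  # True False
```

With strict_min set to True, a mean of exactly 0 fails the check, which matches the intent of "positive" rather than "non-negative".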
We could also explicitly override our parent methods to modify the behavior of our new Expectation, for example by updating the configuration validation to require the values we set as defaults not be altered.
    # Defined inside the ExpectColumnMeanToBePositive class body.
    def validate_configuration(self, configuration):
        super().validate_configuration(configuration)
        assert "min_value" not in configuration.kwargs, "min_value cannot be altered"
        assert "max_value" not in configuration.kwargs, "max_value cannot be altered"
        assert "strict_min" not in configuration.kwargs, "strict_min cannot be altered"
        assert "strict_max" not in configuration.kwargs, "strict_max cannot be altered"
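The effect of these assertions can be illustrated without Great Expectations installed. In the sketch below, FakeConfiguration is a hypothetical stand-in for the configuration object (only its kwargs attribute is modeled), and check_locked_kwargs is a hypothetical helper that mirrors the four assertions above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

LOCKED_KWARGS = ("min_value", "max_value", "strict_min", "strict_max")


@dataclass
class FakeConfiguration:
    """Hypothetical stand-in: models only the `kwargs` attribute."""

    kwargs: Dict[str, Any] = field(default_factory=dict)


def check_locked_kwargs(configuration: FakeConfiguration) -> None:
    """Raise AssertionError if a locked default is overridden."""
    for name in LOCKED_KWARGS:
        assert name not in configuration.kwargs, f"{name} cannot be altered"


# A configuration that only names the column passes silently.
check_locked_kwargs(FakeConfiguration(kwargs={"column": "price"}))

# A configuration that tries to change min_value is rejected.
try:
    check_locked_kwargs(FakeConfiguration(kwargs={"column": "price", "min_value": 5}))
    rejected = False
    message = ""
except AssertionError as error:
    rejected = True
    message = str(error)

print(rejected, message)
```

Because assert statements are skipped when Python runs with the -O flag, raising an explicit exception may be a more robust choice if the check must always run.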
For another example, let's take a look at expect_column_values_to_be_in_set.
In this case, we will only be changing our value_set:
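from great_expectations.expectations.core.expect_column_values_to_be_in_set import (
    ExpectColumnValuesToBeInSet,
)

class ExpectColumnValuesToBeTwoLetterCountryCode(ExpectColumnValuesToBeInSet):
    default_kwarg_values = {
        "value_set": ["FR", "DE", "CH", "ES", "IT", "BE", "NL", "PL"],
    }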
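Conceptually, the new Expectation checks each value for membership in the fixed value_set. A library-free sketch of that check follows; unexpected_country_codes is a hypothetical helper, not a Great Expectations function, and the comparison is assumed to be an exact, case-sensitive match.

```python
from typing import Iterable, List

# The same codes used as the default value_set above.
COUNTRY_CODES = frozenset(["FR", "DE", "CH", "ES", "IT", "BE", "NL", "PL"])


def unexpected_country_codes(values: Iterable[str]) -> List[str]:
    """Return the values that are not in the allowed set, in input order."""
    return [value for value in values if value not in COUNTRY_CODES]


unexpected = unexpected_country_codes(["FR", "DE", "fr", "US"])
print(unexpected)  # ['fr', 'US']
```

Note that the default value_set lists only eight codes, so any other two-letter code would be reported as unexpected unless the list is extended.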
That's all there is to it - really!
Congratulations! 🎉 You've just built your first Parameterized Custom Expectation! 🎉
3. Contribution (Optional)
If you plan to contribute your Expectation to the public open source project, you should include a library_metadata object. For example:
library_metadata = {"tags": ["basic stats"], "contributors": ["@joegargery"]}
This is particularly important because we want to make sure that you get credit for all your hard work!
Additionally, you will need to implement some basic examples and test cases before your contribution can be accepted. For guidance on examples and testing, see our guide on implementing examples and test cases.
For more information on our code standards and contribution, see our guide on Levels of Maturity for Expectations.