Edit an existing Expectation Suite
Use the information provided here to learn how to edit an Expectation Suite. Editing Expectations does not edit or alter the Batch data.
All the code used in the examples is available in GitHub at this location: how_to_edit_an_expectation_suite.py.
Prerequisites
- A working installation of Great Expectations
- A Filesystem Data Context for your Expectations
- A Data Source from which to request a Batch of data for introspection
- An Expectation Suite
Import the Great Expectations module and instantiate a Data Context
Run the following code to create a new Data Context with the get_context() method:
import great_expectations as gx
context = gx.get_context()
Create a Validator from Data
Run the following code to read sample .csv data stored in the great-expectations/gx_tutorials GitHub repository into a Validator:
validator = context.sources.pandas_default.read_csv(
"https://raw.githubusercontent.com/great-expectations/gx_tutorials/main/data/yellow_tripdata_sample_2019-01.csv"
)
Retrieve an existing Expectation Suite
Run the following code to retrieve an Expectation Suite:
my_suite = context.get_expectation_suite("expectation_suite_name")
Replace expectation_suite_name with the name of your Expectation Suite.
View the Expectations in the Expectation Suite
Run the following code to print the Suite to the console or a Jupyter Notebook with the show_expectations_by_expectation_type() method:
my_suite.show_expectations_by_expectation_type()
The output appears similar to the following example:
[ { 'expect_column_values_to_be_between': { 'auto': True,
'column': 'passenger_count',
'domain': 'column',
'max_value': 6,
'min_value': 1,
'mostly': 1.0,
'strict_max': False,
'strict_min': False}},
{ 'expect_column_values_to_not_be_null': { 'column': 'pickup_datetime',
'domain': 'column'}}]
Instantiate ExpectationConfiguration
From the Expectation Suite, you can create an ExpectationConfiguration object using the output from show_expectations_by_expectation_type(). The following is the example output for the first Expectation in the Expectation Suite.
It runs the expect_column_values_to_be_between Expectation on the passenger_count column and expects the minimum and maximum values to be 1 and 6, respectively.
{
"expect_column_values_to_be_between": {
"auto": True,
"column": "passenger_count",
"domain": "column",
"max_value": 6,
"min_value": 1,
"mostly": 1.0,
"strict_max": False,
"strict_min": False,
}
}
The following is the same configuration expressed as an ExpectationConfiguration object:
from great_expectations.core.expectation_suite import ExpectationConfiguration
config = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={
"auto": True,
"column": "passenger_count",
"domain": "column",
"max_value": 6,
"min_value": 1,
"mostly": 1.0,
"strict_max": False,
"strict_min": False,
},
)
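The mapping from printed output to constructor arguments can be sketched in plain Python. Each entry printed by show_expectations_by_expectation_type() is a one-key dictionary whose key is the Expectation type and whose value holds the kwargs. The helper name split_expectation_entry is hypothetical and is not part of Great Expectations; this is a minimal sketch that only assumes the entry shape shown above.

```python
def split_expectation_entry(entry):
    """Split a one-key {expectation_type: kwargs} dict into its two parts.

    Hypothetical helper for illustration; not a Great Expectations API.
    """
    if len(entry) != 1:
        raise ValueError("expected exactly one expectation type per entry")
    # Unpack the single (key, value) pair of the dictionary.
    ((expectation_type, kwargs),) = entry.items()
    # Copy the kwargs so later edits do not change the original entry.
    return expectation_type, dict(kwargs)


entry = {
    "expect_column_values_to_be_between": {
        "auto": True,
        "column": "passenger_count",
        "domain": "column",
        "max_value": 6,
        "min_value": 1,
        "mostly": 1.0,
        "strict_max": False,
        "strict_min": False,
    }
}

expectation_type, kwargs = split_expectation_entry(entry)
```

The two returned values line up with the expectation_type and kwargs arguments passed to ExpectationConfiguration in the example above.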
Update the Configuration and Expectation Suite
In the following example, the max_value of the Expectation is adjusted from 6 to 4 with a new ExpectationConfiguration:
updated_config = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={
"auto": True,
"column": "passenger_count",
"domain": "column",
"min_value": 1,
"max_value": 4,  # previously 6
"mostly": 1.0,
"strict_max": False,
"strict_min": False,
},
)
To update the Expectation Suite, use the add_expectation() method. For example:
my_suite.add_expectation(updated_config)
The add_expectation() method performs an 'upsert' into the ExpectationSuite: it updates the matching Expectation if one already exists, or adds a new one if it doesn't.
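The upsert behavior can be illustrated with a toy model in plain Python. This is not the library's implementation: plain dictionaries stand in for ExpectationConfiguration objects, and the DOMAIN_KEYS tuple is an assumption about which kwargs identify the domain of a column Expectation. The names domain_of and upsert are hypothetical.

```python
# Toy model only: plain dicts stand in for ExpectationConfiguration objects.
# Assumption: these kwargs identify the "domain" of a column Expectation.
DOMAIN_KEYS = ("column", "row_condition", "condition_parser")


def domain_of(config):
    """Return a hashable key built from the Expectation type and domain kwargs."""
    kwargs = config["kwargs"]
    return (config["expectation_type"],) + tuple(kwargs.get(k) for k in DOMAIN_KEYS)


def upsert(suite, new_config):
    """Replace the config that shares new_config's domain, or append it."""
    for i, existing in enumerate(suite):
        if domain_of(existing) == domain_of(new_config):
            suite[i] = new_config
            return "updated"
    suite.append(new_config)
    return "added"


suite = [
    {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 6},
    }
]

# Same type and column as the existing entry, so the entry is replaced.
first = upsert(
    suite,
    {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 4},
    },
)

# No entry with this type and column exists yet, so it is appended.
second = upsert(
    suite,
    {
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {"column": "pickup_datetime"},
    },
)
```

In this model the first call replaces the existing passenger_count entry and the second call appends a new entry, which mirrors the update-or-add behavior described above.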
To check that the Expectation Suite has been updated, you can run the show_expectations_by_expectation_type() method again, or run find_expectations() and then confirm that the expected Expectation exists in the suite. For example:
config_to_search = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={"column": "passenger_count"},
)
found_expectation = my_suite.find_expectations(config_to_search, match_type="domain")
# This assertion will succeed because the ExpectationConfiguration has been updated.
assert found_expectation == [updated_config]
You'll need to perform the search with a new ExpectationConfiguration, but you don't need to include all the kwargs values.
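Why a partial ExpectationConfiguration is enough for a domain search can be sketched with a toy model. Plain dictionaries stand in for ExpectationConfiguration objects, the find_by_domain helper is hypothetical, and the DOMAIN_KEYS tuple is an assumption about which kwargs count as the domain. The real matching logic in Great Expectations may differ.

```python
# Toy model only: plain dicts stand in for ExpectationConfiguration objects.
# Assumption: these kwargs identify the "domain" of a column Expectation.
DOMAIN_KEYS = ("column", "row_condition", "condition_parser")


def find_by_domain(suite, probe):
    """Return configs whose type and domain kwargs equal those of the probe.

    Non-domain kwargs such as min_value or max_value are ignored.
    """
    return [
        config
        for config in suite
        if config["expectation_type"] == probe["expectation_type"]
        and all(
            config["kwargs"].get(key) == probe["kwargs"].get(key)
            for key in DOMAIN_KEYS
        )
    ]


stored = {
    "expectation_type": "expect_column_values_to_be_between",
    "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 4},
}
other = {
    "expectation_type": "expect_column_values_to_not_be_null",
    "kwargs": {"column": "pickup_datetime"},
}
suite = [stored, other]

# The probe names only the Expectation type and the column.
probe = {
    "expectation_type": "expect_column_values_to_be_between",
    "kwargs": {"column": "passenger_count"},
}
matches = find_by_domain(suite, probe)
```

Because only the type and the domain kwargs are compared, the probe matches the stored configuration even though it omits min_value and max_value.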
Remove the ExpectationConfiguration (Optional)
To remove an ExpectationConfiguration, you can use the remove_expectation() method. Similar to find_expectations(), you call the remove_expectation() method with an ExpectationConfiguration. For example:
config_to_remove = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={"column": "passenger_count"},
)
my_suite.remove_expectation(
config_to_remove, match_type="domain", remove_multiple_matches=False
)
found_expectation = my_suite.find_expectations(config_to_remove, match_type="domain")
# This assertion will succeed because the ExpectationConfiguration has been removed.
assert found_expectation != [updated_config]
my_suite.show_expectations_by_expectation_type()
The output of show_expectations_by_expectation_type() should appear similar to this example:
[
{ 'expect_column_values_to_not_be_null': { 'column': 'pickup_datetime',
'domain': 'column'}}]
Save Expectation Suite changes
After editing an Expectation Suite, you can use the save_expectation_suite() method of your Data Context to save it. For example:
context.save_expectation_suite(my_suite)
To make sure your Expectation Suite changes are reflected in the Validator, use context.get_validator() to overwrite the validator, or create a new one from the updated Data Context.
Related documentation
- To learn more about the functions available with Expectation Suites, see the ExpectationSuite API Documentation.