How to Edit an Expectation Suite
In this guide, you'll learn how to create Expectations and interactively edit the resulting Expectation Suite.
No. The interactive method used to create and edit Expectations does not edit or alter the Batch data.
Prerequisites
- Great Expectations installed in a Python environment
- A Filesystem Data Context for your Expectations
- Created a Datasource from which to request a Batch of data for introspection
If you haven't set up Great Expectations
Steps
1. Import the Great Expectations module and instantiate a Data Context
The simplest way to create a new Data Context is by
using the get_context()
method.
import great_expectations as gx
context = gx.get_context()
2. Create a Validator from Data
Run the following command to connect to
.csv
data stored in the
great_expectations
GitHub repository:
validator = context.sources.pandas_default.read_csv(
"https://raw.githubusercontent.com/great-expectations/gx_tutorials/main/data/yellow_tripdata_sample_2019-01.csv"
)
3. Create Expectations with Validator
Run the following commands to create two Expectations.
The first Expectation uses domain knowledge (the
pickup_datetime
shouldn't be null),
and the second Expectation uses
auto=True
to detect a range of values in
the passenger_count
column.
validator.expect_column_values_to_not_be_null("pickup_datetime")
validator.expect_column_values_to_be_between("passenger_count", auto=True)
Under the hood, the Validator will be creating and updating an Expectation Suite, which we can view next.
4. View the Expectations in the Expectation Suite
There are a number of different ways that this can be
done, with one way being using the
show_expectations_by_expectation_type()
function, which will use prettyprint
to
print the Suite to the console in a way that can be
easily visualized.
First load the ExpectationSuite
from the
Validator
:
my_suite = validator.get_expectation_suite()
Now use the
show_expectations_by_expectation_type()
to print the Suite to console or Jupyter Notebook.
my_suite.show_expectations_by_expectation_type()
Your output will look something similar to this:
[ { 'expect_column_values_to_be_between': { 'auto': True,
'column': 'passenger_count',
'domain': 'column',
'max_value': 6,
'min_value': 1,
'mostly': 1.0,
'strict_max': False,
'strict_min': False}},
{ 'expect_column_values_to_not_be_null': { 'column': 'pickup_datetime',
'domain': 'column'}}]
5. Instantiate ExpectationConfiguration
From the Expectation Suite, you will be able to create
an ExpectationConfiguration object using the output
from
show_expectations_by_expectation_type()
Here is the example output of the first Expectation in
our suite.
It runs the
expect_column_values_to_be_between
Expectation on the passenger_count
column
and expects the min and max values to be
1
and 6
respectively.
{
"expect_column_values_to_be_between": {
"auto": True,
"column": "passenger_count",
"domain": "column",
"max_value": 6,
"min_value": 1,
"mostly": 1.0,
"strict_max": False,
"strict_min": False,
}
}
Here is the same configuration, but this time as a
ExpectationConfiguration
object.
from great_expectations.core.expectation_suite import ExpectationSuite
config = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={
"auto": True,
"column": "passenger_count",
"domain": "column",
"max_value": 6,
"min_value": 1,
"mostly": 1.0,
"strict_max": False,
"strict_min": False,
},
)
6. Update Configuration and ExpectationSuite
Let's say that you are interested in adjusting
the max_value
of the Expectation to be
4
instead of 6
. Then you
could create a new
ExpectationConfiguration
with the new
value:
updated_config = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={
"auto": True,
"column": "passenger_count",
"domain": "column",
"min_value": 1,
"max_value": 4,
#'max_value': 6,
"mostly": 1.0,
"strict_max": False,
"strict_min": False,
},
)
And update the ExpectationSuite by calling
add_expectation()
. The
add_expectation()
function will perform
an 'upsert' into the
ExpectationSuite
, meaning it will update
an existing Expectation if it already exists, or add a
new one if it doesn't.
my_suite.add_expectation(updated_config)
You can check that the ExpectationSuite has been
correctly updated by either running the
show_expectations_by_expectation_type()
function again, or by running
find_expectation()
and confirming that
the expected Expectation exists in the suite. The
search will need to be performed with a new
ExpectationConfiguration
, but will not
need to inclued all of the kwarg
values.
config_to_search = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={"column": "passenger_count"},
)
found_expectation = my_suite.find_expectations(config_to_search, match_type="domain")
# This assertion will succeed because the ExpectationConfiguration has been updated.
assert found_expectation == [updated_config]
7. (Optional) Remove Configuration
If you would like to remove an
ExpectationConfiguration, you can use the
remove_configuration()
function.
Similar to find_expectation()
, the
remove_configuration()
function needs to
be called with an
ExpectationConfiguration
.
config_to_remove = ExpectationConfiguration(
expectation_type="expect_column_values_to_be_between",
kwargs={"column": "passenger_count"},
)
my_suite.remove_expectation(
config_to_remove, match_type="domain", remove_multiple_matches=False
)
found_expectation = my_suite.find_expectations(config_to_remove, match_type="domain")
# This assertion will fail because the ExpectationConfiguration has been removed.
assert found_expectation != [updated_config]
my_suite.show_expectations_by_expectation_type()
The output of
show_expectations_by_expectation_type()
should now look like this:
[
{ 'expect_column_values_to_not_be_null': { 'column': 'pickup_datetime',
'domain': 'column'}}]
8. Save ExpectationSuite
Finally, when you are done editing the
ExpectationSuite
, you can save it to your
Data Context by using the
save_suite()
function.
context.save_expectation_suite(my_suite)
Related Documentation
-
If you would like to learn more about the functions available at the Expectation Suite-level, please refer to our API Documentation for
ExpectationSuite
. -
To view the full script used for example code on this page, see it on GitHub: how_to_edit_an_expectation_suite.py