Version: 0.16.16

How to Edit an Expectation Suite

In this guide, you'll learn how to create Expectations and interactively edit the resulting Expectation Suite.

Does this process edit my data?

No. The interactive method used to create and edit Expectations does not edit or alter the Batch data.

Prerequisites

  • Great Expectations installed in a Python environment
  • A Filesystem Data Context for your Expectations
  • A Datasource from which to request a Batch of data for introspection

Steps

1. Import the Great Expectations module and instantiate a Data Context

The simplest way to create a new Data Context is by using the get_context() method.

import great_expectations as gx

context = gx.get_context()

2. Create a Validator from Data

Run the following command to connect to .csv data stored in the great_expectations GitHub repository:

validator = context.sources.pandas_default.read_csv(
    "https://raw.githubusercontent.com/great-expectations/gx_tutorials/main/data/yellow_tripdata_sample_2019-01.csv"
)

3. Create Expectations with Validator

Run the following commands to create two Expectations. The first Expectation encodes domain knowledge (the pickup_datetime column shouldn't contain null values), and the second uses auto=True to detect a range of values in the passenger_count column.

validator.expect_column_values_to_not_be_null("pickup_datetime")
validator.expect_column_values_to_be_between("passenger_count", auto=True)
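The idea behind auto=True can be pictured with a small pure-Python sketch: derive a range from the values actually observed, then check values against that range. This is only a conceptual illustration. It assumes the range is simply the observed minimum and maximum, the helper names (infer_range, fraction_within) and the sample values are hypothetical, and the estimation logic inside Great Expectations may differ.

```python
from typing import List, Optional, Tuple


def infer_range(values: List[Optional[int]]) -> Tuple[int, int]:
    """Return (min, max) of the non-null values (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot infer a range from an empty column")
    return min(observed), max(observed)


def fraction_within(values: List[Optional[int]], low: int, high: int) -> float:
    """Return the share of non-null values that fall inside [low, high]."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no non-null values to check")
    inside = [v for v in observed if low <= v <= high]
    return len(inside) / len(observed)


# Made-up sample standing in for the passenger_count column.
passenger_count = [1, 2, 1, 6, 3, None, 1, 5]

# Inferring the range from the data itself means every observed value
# falls inside it, which is why an auto-detected Expectation passes on
# the Batch it was created from.
min_value, max_value = infer_range(passenger_count)
share_inside = fraction_within(passenger_count, min_value, max_value)
```

A narrower, hand-picked range (for example 1 to 4) would cover only part of the same sample, which is the situation the editing steps later in this guide deal with.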

Under the hood, the Validator creates and updates an Expectation Suite, which you can view in the next step.

4. View the Expectations in the Expectation Suite

There are several ways to view the Expectations in a Suite. One is the show_expectations_by_expectation_type() function, which pretty-prints the Suite to the console in an easily readable form.

First, load the ExpectationSuite from the Validator:

my_suite = validator.get_expectation_suite()

Now use show_expectations_by_expectation_type() to print the Suite to the console or a Jupyter Notebook.

my_suite.show_expectations_by_expectation_type()

Your output will look similar to this:

[ { 'expect_column_values_to_be_between': { 'auto': True,
                                            'column': 'passenger_count',
                                            'domain': 'column',
                                            'max_value': 6,
                                            'min_value': 1,
                                            'mostly': 1.0,
                                            'strict_max': False,
                                            'strict_min': False}},
  { 'expect_column_values_to_not_be_null': { 'column': 'pickup_datetime',
                                             'domain': 'column'}}]

5. Instantiate ExpectationConfiguration

From the Expectation Suite, you can create an ExpectationConfiguration object using the output of show_expectations_by_expectation_type(). Here is the example output for the first Expectation in the suite.

It runs the expect_column_values_to_be_between Expectation on the passenger_count column and expects the minimum and maximum values to be 1 and 6, respectively.

{
    "expect_column_values_to_be_between": {
        "auto": True,
        "column": "passenger_count",
        "domain": "column",
        "max_value": 6,
        "min_value": 1,
        "mostly": 1.0,
        "strict_max": False,
        "strict_min": False,
    }
}

Here is the same configuration, but this time as an ExpectationConfiguration object.

from great_expectations.core.expectation_configuration import (
    ExpectationConfiguration,
)

# The expectation type is the key of the printed entry; its value becomes kwargs.
config = ExpectationConfiguration(
    expectation_type="expect_column_values_to_be_between",
    kwargs={
        "auto": True,
        "column": "passenger_count",
        "domain": "column",
        "max_value": 6,
        "min_value": 1,
        "mostly": 1.0,
        "strict_max": False,
        "strict_min": False,
    },
)
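The mapping from the printed output to the two constructor arguments is mechanical: each printed entry is a one-key dictionary whose key is the expectation type and whose value is the kwargs. The following sketch shows that split in plain Python, without importing Great Expectations. The helper name split_printed_entry is hypothetical and not part of the library.

```python
from typing import Any, Dict, Tuple


def split_printed_entry(entry: Dict[str, Dict[str, Any]]) -> Tuple[str, Dict[str, Any]]:
    """Split a one-key {expectation_type: kwargs} entry into its two parts."""
    if len(entry) != 1:
        raise ValueError("each printed entry should hold exactly one expectation type")
    # Unpack the single (key, value) pair of the dictionary.
    ((expectation_type, kwargs),) = entry.items()
    # Copy the kwargs so later edits do not mutate the printed entry.
    return expectation_type, dict(kwargs)


printed_entry = {
    "expect_column_values_to_be_between": {
        "auto": True,
        "column": "passenger_count",
        "domain": "column",
        "max_value": 6,
        "min_value": 1,
        "mostly": 1.0,
        "strict_max": False,
        "strict_min": False,
    }
}

expectation_type, kwargs = split_printed_entry(printed_entry)
```

The resulting expectation_type and kwargs are the values you would pass to ExpectationConfiguration(expectation_type=..., kwargs=...).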

6. Update Configuration and ExpectationSuite

Suppose you want to change the max_value of the Expectation from 6 to 4. To do so, create a new ExpectationConfiguration with the new value:

updated_config = ExpectationConfiguration(
    expectation_type="expect_column_values_to_be_between",
    kwargs={
        "auto": True,
        "column": "passenger_count",
        "domain": "column",
        "min_value": 1,
        "max_value": 4,  # previously 6
        "mostly": 1.0,
        "strict_max": False,
        "strict_min": False,
    },
)

Then update the ExpectationSuite by calling add_expectation(). The add_expectation() function performs an "upsert" into the ExpectationSuite: it updates the matching Expectation if one already exists, or adds a new one if it doesn't.

my_suite.add_expectation(updated_config)
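To make the upsert behavior concrete, here is a toy model in plain Python. It is not the library's implementation: it represents a suite as a list of dictionaries and assumes two Expectations are "the same" when they share an expectation type and a column. The names Config, same_domain, and upsert are hypothetical.

```python
from typing import Any, Dict, List

Config = Dict[str, Any]  # {"expectation_type": str, "kwargs": dict}

# Assumption: in this toy model the domain is just the column name.
DOMAIN_KEYS = ("column",)


def same_domain(a: Config, b: Config) -> bool:
    """True when both configs have the same type and the same domain kwargs."""
    return a["expectation_type"] == b["expectation_type"] and all(
        a["kwargs"].get(key) == b["kwargs"].get(key) for key in DOMAIN_KEYS
    )


def upsert(suite: List[Config], new: Config) -> None:
    """Replace the first config with a matching domain, or append if none matches."""
    for index, existing in enumerate(suite):
        if same_domain(existing, new):
            suite[index] = new
            return
    suite.append(new)


suite: List[Config] = [
    {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 6},
    },
    {
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {"column": "pickup_datetime"},
    },
]

# Same type and column as the first entry, so it is updated in place.
upsert(
    suite,
    {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 4},
    },
)
```

After the call, the toy suite still holds two entries, and the first one now carries max_value 4.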

You can check that the ExpectationSuite has been updated correctly either by running show_expectations_by_expectation_type() again, or by running find_expectations() and confirming that the expected Expectation exists in the suite. The search must be performed with a new ExpectationConfiguration, but that configuration does not need to include all of the kwarg values.

config_to_search = ExpectationConfiguration(
    expectation_type="expect_column_values_to_be_between",
    kwargs={"column": "passenger_count"},
)
found_expectation = my_suite.find_expectations(config_to_search, match_type="domain")

# This assertion will succeed because the ExpectationConfiguration has been updated.
assert found_expectation == [updated_config]
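The reason a search configuration can omit most kwargs is that a domain match compares only the part of the configuration that identifies what the Expectation applies to. The sketch below illustrates this with a toy model in plain Python. It assumes the domain consists only of the column name, and the function find_matches with its domain_only flag is hypothetical, not a Great Expectations API.

```python
from typing import Any, Dict, List

Config = Dict[str, Any]  # {"expectation_type": str, "kwargs": dict}

# Assumption: in this toy model the domain is just the column name.
DOMAIN_KEYS = ("column",)


def find_matches(suite: List[Config], query: Config, domain_only: bool = True) -> List[Config]:
    """Return configs of the same type whose compared kwargs equal the query's."""
    matches = []
    for config in suite:
        if config["expectation_type"] != query["expectation_type"]:
            continue
        if domain_only:
            keys = set(DOMAIN_KEYS)
        else:
            # Compare every kwarg present on either side.
            keys = set(config["kwargs"]) | set(query["kwargs"])
        if all(config["kwargs"].get(k) == query["kwargs"].get(k) for k in keys):
            matches.append(config)
    return matches


suite: List[Config] = [
    {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 4},
    },
    {
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {"column": "pickup_datetime"},
    },
]

query = {
    "expectation_type": "expect_column_values_to_be_between",
    "kwargs": {"column": "passenger_count"},
}

by_domain = find_matches(suite, query, domain_only=True)
by_all_kwargs = find_matches(suite, query, domain_only=False)
```

In this model the domain-only search finds the stored Expectation, while the all-kwargs comparison finds nothing because the query lacks min_value and max_value.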

7. (Optional) Remove Configuration

To remove an ExpectationConfiguration, use the remove_expectation() function.

Like find_expectations(), remove_expectation() must be called with an ExpectationConfiguration.

config_to_remove = ExpectationConfiguration(
    expectation_type="expect_column_values_to_be_between",
    kwargs={"column": "passenger_count"},
)
my_suite.remove_expectation(
    config_to_remove, match_type="domain", remove_multiple_matches=False
)

found_expectation = my_suite.find_expectations(config_to_remove, match_type="domain")

# This assertion will succeed because the ExpectationConfiguration has been removed.
assert found_expectation != [updated_config]
my_suite.show_expectations_by_expectation_type()

The output of show_expectations_by_expectation_type() should now look like this:

[ { 'expect_column_values_to_not_be_null': { 'column': 'pickup_datetime',
                                             'domain': 'column'}}]
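The remove_multiple_matches argument matters when more than one Expectation shares the same domain. The toy model below sketches one plausible reading of that flag in plain Python: removal fails when nothing matches, and also fails when several configurations match unless multiple removal is explicitly allowed. The error behavior is an assumption made for illustration, and the names same_domain and remove_matching are hypothetical.

```python
from typing import Any, Dict, List

Config = Dict[str, Any]  # {"expectation_type": str, "kwargs": dict}


def same_domain(a: Config, b: Config) -> bool:
    """Assumption: same expectation type and same column means same domain."""
    return (
        a["expectation_type"] == b["expectation_type"]
        and a["kwargs"].get("column") == b["kwargs"].get("column")
    )


def remove_matching(
    suite: List[Config], query: Config, remove_multiple_matches: bool = False
) -> List[Config]:
    """Remove configs matching the query's domain and return what was removed."""
    matches = [c for c in suite if same_domain(c, query)]
    if not matches:
        raise ValueError("no matching expectation was found")
    if len(matches) > 1 and not remove_multiple_matches:
        raise ValueError("more than one matching expectation was found")
    # Rebuild the list in place, keeping only non-matching configs.
    suite[:] = [c for c in suite if not same_domain(c, query)]
    return matches


def make_suite() -> List[Config]:
    """Build a toy suite with two Expectations on the same column."""
    return [
        {
            "expectation_type": "expect_column_values_to_be_between",
            "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 4},
        },
        {
            "expectation_type": "expect_column_values_to_be_between",
            "kwargs": {"column": "passenger_count", "min_value": 0, "max_value": 9},
        },
        {
            "expectation_type": "expect_column_values_to_not_be_null",
            "kwargs": {"column": "pickup_datetime"},
        },
    ]


query = {
    "expectation_type": "expect_column_values_to_be_between",
    "kwargs": {"column": "passenger_count"},
}

suite = make_suite()
removed = remove_matching(suite, query, remove_multiple_matches=True)
```

With remove_multiple_matches=True both range Expectations are removed from the toy suite, leaving only the not-null Expectation.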

8. Save ExpectationSuite

Finally, when you are done editing the ExpectationSuite, save it to your Data Context by using the save_expectation_suite() function.

context.save_expectation_suite(my_suite)
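With a Filesystem Data Context, a saved suite ends up as a JSON file on disk. The sketch below mimics that round trip with the standard library only, so you can see what "saving" amounts to. The dictionary layout is a simplified assumption (real suite files carry additional fields), and the suite name and file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Simplified, assumed layout of a suite; real files contain more fields.
suite_dict = {
    "expectation_suite_name": "my_suite",  # hypothetical name
    "expectations": [
        {
            "expectation_type": "expect_column_values_to_not_be_null",
            "kwargs": {"column": "pickup_datetime"},
        }
    ],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    suite_path = Path(tmp_dir) / "my_suite.json"  # hypothetical file name
    # Write the suite as human-readable JSON, then read it back.
    suite_path.write_text(json.dumps(suite_dict, indent=2))
    reloaded = json.loads(suite_path.read_text())
```

Because the suite is plain JSON in this model, the reloaded dictionary is equal to the one that was written.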