How to add validation data or suites to a Checkpoint
This guide will help you add validation data or Expectation Suites to an existing Checkpoint. This is useful if you want to aggregate individual validations (across Expectation Suites or Datasources) into a single Checkpoint.
Prerequisites
- Completion of the Quickstart guide.
- A working installation of Great Expectations.
- A Data Context.
- An Expectation Suite.
- A Checkpoint.
Steps
1. Open your existing Checkpoint in a text editor
It will look similar to this:
name: my_checkpoint
config_version: 1
class_name: Checkpoint
run_name_template: "%Y-%m-foo-bar-template-$VAR"
validations:
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.warning
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
      - name: update_data_docs
        action:
          class_name: UpdateDataDocsAction
    evaluation_parameters:
      param1: "$MY_PARAM"
      param2: 1 + "$OLD_PARAM"
    runtime_configuration:
      result_format:
        result_format: BASIC
        partial_unexpected_count: 20
2. Edit the existing Checkpoint configuration to add an Expectation Suite
To add a second Expectation Suite (in this example, users.error) to your Checkpoint configuration, modify the file to add an additional batch_request key and its corresponding information, including evaluation_parameters, action_list, runtime_configuration, and expectation_suite_name. In fact, the simplest way to run a different Expectation Suite on the same Batch of data is to make a copy of the original batch_request entry and then edit the expectation_suite_name value to correspond to a different Expectation Suite. The resulting configuration will look like this:
name: my_checkpoint
config_version: 1
class_name: Checkpoint
run_name_template: "%Y-%m-foo-bar-template-$VAR"
validations:
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.warning
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
      - name: update_data_docs
        action:
          class_name: UpdateDataDocsAction
    evaluation_parameters:
      param1: "$MY_PARAM"
      param2: 1 + "$OLD_PARAM"
    runtime_configuration:
      result_format:
        result_format: BASIC
        partial_unexpected_count: 20
  - batch_request:
      datasource_name: my_datasource
      data_connector_name: my_data_connector
      data_asset_name: users
      data_connector_query:
        index: -1
    expectation_suite_name: users.error
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
      - name: update_data_docs
        action:
          class_name: UpdateDataDocsAction
    evaluation_parameters:
      param1: "$MY_PARAM"
      param2: 1 + "$OLD_PARAM"
    runtime_configuration:
      result_format:
        result_format: BASIC
        partial_unexpected_count: 20
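The copy-and-edit approach from this step can also be sketched programmatically. The following is a minimal Python illustration that treats the Checkpoint configuration as a plain dictionary. It does not call any Great Expectations API, the helper name add_suite_validation is hypothetical, and the configuration is trimmed to the keys relevant here.

```python
import copy


def add_suite_validation(checkpoint_config, source_index, new_suite_name):
    """Return a new config with a copied validation that runs a different suite.

    Hypothetical helper for illustration only; not part of Great Expectations.
    """
    updated = copy.deepcopy(checkpoint_config)
    # Duplicate the chosen validation so the batch_request stays identical.
    new_validation = copy.deepcopy(updated["validations"][source_index])
    # Only the Expectation Suite name changes in the copy.
    new_validation["expectation_suite_name"] = new_suite_name
    updated["validations"].append(new_validation)
    return updated


# Trimmed-down version of the Checkpoint configuration from step 1.
checkpoint_config = {
    "name": "my_checkpoint",
    "config_version": 1,
    "class_name": "Checkpoint",
    "validations": [
        {
            "batch_request": {
                "datasource_name": "my_datasource",
                "data_asset_name": "users",
            },
            "expectation_suite_name": "users.warning",
        }
    ],
}

updated_config = add_suite_validation(checkpoint_config, 0, "users.error")
for validation in updated_config["validations"]:
    print(validation["expectation_suite_name"], validation["batch_request"])
```

Because the helper works on deep copies, the original dictionary is left untouched, which makes it easy to compare the configuration before and after the change.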
3. Edit the existing Checkpoint configuration to add new validation data
In the above example, the entry we added with our Expectation Suite was paired with the same Batch of data as the original Expectation Suite. However, you may also specify different Batch Requests (and thus different Batches of data) when you add an Expectation Suite. The following Checkpoint configuration file demonstrates how easily you can add multiple Validations of Batches of data with different Expectation Suites and specific Actions:
name: my_fancy_checkpoint
config_version: 1
class_name: Checkpoint
run_name_template: "%Y-%m-foo-bar-template-$VAR"
expectation_suite_name: users.delivery
action_list:
  - name: store_validation_result
    action:
      class_name: StoreValidationResultAction
  - name: store_evaluation_params
    action:
      class_name: StoreEvaluationParametersAction
  - name: update_data_docs
    action:
      class_name: UpdateDataDocsAction
validations:
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.warning
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.error
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
      options:
        name: Titanic
    action_list:
      - name: quarantine_failed_data
        action:
          class_name: CreateQuarantineData
      - name: advance_passed_data
        action:
          class_name: CreateQuarantineData
evaluation_parameters:
  param1: "$MY_PARAM"
  param2: 1 + "$OLD_PARAM"
runtime_configuration:
  result_format:
    result_format: BASIC
    partial_unexpected_count: 20
According to this configuration, the locally-specified Expectation Suite users.warning is run against the first batch_request, with the results processed by the Actions specified in the top-level action_list. Similarly, the locally-specified Expectation Suite users.error is run against the second batch_request, with the results also processed by the Actions specified in the top-level action_list. In addition, the top-level Expectation Suite users.delivery is run against the third batch_request (the one whose options specify name: Titanic), which has no expectation_suite_name of its own, with the results processed by the union of the Actions in the locally-specified action_list and in the top-level action_list.
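The way top-level settings combine with each entry under validations can be modeled in a few lines of Python. This is a simplified sketch, not the Great Expectations implementation: the function resolve_validations is hypothetical, Actions are reduced to their names, and merging Actions by name (with a local entry replacing a top-level entry of the same name) is an assumption made for the illustration.

```python
def resolve_validations(checkpoint_config):
    """Illustrative model only: apply top-level defaults to each validation."""
    default_suite = checkpoint_config.get("expectation_suite_name")
    default_actions = checkpoint_config.get("action_list", [])
    resolved = []
    for validation in checkpoint_config["validations"]:
        # A locally specified suite takes precedence over the top-level suite.
        suite = validation.get("expectation_suite_name", default_suite)
        # Union of Actions keyed by name: top-level first, then local ones.
        actions = {action["name"]: action for action in default_actions}
        for action in validation.get("action_list", []):
            actions[action["name"]] = action
        resolved.append(
            {
                "batch_request": validation["batch_request"],
                "expectation_suite_name": suite,
                "action_names": list(actions),
            }
        )
    return resolved


users_request = {"datasource_name": "my_datasource", "data_asset_name": "users"}

# Trimmed-down version of my_fancy_checkpoint (Action classes omitted).
checkpoint_config = {
    "expectation_suite_name": "users.delivery",
    "action_list": [
        {"name": "store_validation_result"},
        {"name": "store_evaluation_params"},
        {"name": "update_data_docs"},
    ],
    "validations": [
        {"batch_request": users_request, "expectation_suite_name": "users.warning"},
        {"batch_request": users_request, "expectation_suite_name": "users.error"},
        {
            "batch_request": {**users_request, "options": {"name": "Titanic"}},
            "action_list": [
                {"name": "quarantine_failed_data"},
                {"name": "advance_passed_data"},
            ],
        },
    ],
}

resolved = resolve_validations(checkpoint_config)
for entry in resolved:
    print(entry["expectation_suite_name"], entry["action_names"])
```

In this model, the first two validations keep their own suites and use only the three top-level Actions, while the third falls back to users.delivery and ends up with all five Actions.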
Please see How to configure a new Checkpoint using test_yaml_config for additional Checkpoint configuration examples (including the convenient templating mechanism).
Additional notes
This is a good way to aggregate Validations in a complex pipeline. You could use this feature to Validate multiple source files before and after their ingestion into your data lake.