Add validation data or Expectation suites to a Checkpoint
Add validation data or Expectation Suites to an existing Checkpoint to aggregate individual validations across Expectation Suites or Datasources into a single Checkpoint. You can also use this process to Validate multiple source files before and after their ingestion into your data lake.
Prerequisites
Open your existing Checkpoint
Open your Checkpoint in a text editor. Your Checkpoint should appear similar to the following example:
name: my_checkpoint
config_version: 1
class_name: Checkpoint
run_name_template: "%Y-%m-foo-bar-template-$VAR"
validations:
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.warning
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
      - name: update_data_docs
        action:
          class_name: UpdateDataDocsAction
    evaluation_parameters:
      param1: "$MY_PARAM"
      param2: 1 + "$OLD_PARAM"
    runtime_configuration:
      result_format:
        result_format: BASIC
        partial_unexpected_count: 20
Add an Expectation Suite to the Checkpoint
To add a second Expectation Suite (in this example, users.error) to your Checkpoint configuration, modify the file to add an additional batch_request key and its corresponding information, including evaluation_parameters, action_list, runtime_configuration, and expectation_suite_name. The simplest way to run a different Expectation Suite on the same Batch of data is to copy the original batch_request entry and then edit the expectation_suite_name value so that it names the other Expectation Suite. The resulting configuration will look like this:
name: my_checkpoint
config_version: 1
class_name: Checkpoint
run_name_template: "%Y-%m-foo-bar-template-$VAR"
validations:
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.warning
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
      - name: update_data_docs
        action:
          class_name: UpdateDataDocsAction
    evaluation_parameters:
      param1: "$MY_PARAM"
      param2: 1 + "$OLD_PARAM"
    runtime_configuration:
      result_format:
        result_format: BASIC
        partial_unexpected_count: 20
  - batch_request:
      datasource_name: my_datasource
      data_connector_name: my_data_connector
      data_asset_name: users
      data_connector_query:
        index: -1
    expectation_suite_name: users.error
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
      - name: update_data_docs
        action:
          class_name: UpdateDataDocsAction
    evaluation_parameters:
      param1: "$MY_PARAM"
      param2: 1 + "$OLD_PARAM"
    runtime_configuration:
      result_format:
        result_format: BASIC
        partial_unexpected_count: 20
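The copy-and-edit step can also be scripted. The following is a minimal sketch, assuming the Checkpoint configuration has already been loaded into a plain Python dictionary (for example, with a YAML parser). The helper name add_suite_to_checkpoint is hypothetical and is not part of the Great Expectations API, and the dictionary is an abbreviated version of the example Checkpoint.

```python
import copy


def add_suite_to_checkpoint(checkpoint_config, new_suite_name, source_index=0):
    """Return a copy of the config with one validation duplicated under a new suite.

    Hypothetical helper for illustration only; not a Great Expectations API.
    """
    updated = copy.deepcopy(checkpoint_config)
    # Duplicate the chosen validation entry so that the batch_request,
    # action_list, and any other settings stay identical.
    new_validation = copy.deepcopy(updated["validations"][source_index])
    # Only the Expectation Suite name changes.
    new_validation["expectation_suite_name"] = new_suite_name
    updated["validations"].append(new_validation)
    return updated


# Abbreviated version of the example Checkpoint, as a plain dictionary.
checkpoint = {
    "name": "my_checkpoint",
    "config_version": 1,
    "class_name": "Checkpoint",
    "validations": [
        {
            "batch_request": {
                "datasource_name": "my_datasource",
                "data_asset_name": "users",
            },
            "expectation_suite_name": "users.warning",
            "action_list": [
                {
                    "name": "store_validation_result",
                    "action": {"class_name": "StoreValidationResultAction"},
                },
            ],
        },
    ],
}

updated_checkpoint = add_suite_to_checkpoint(checkpoint, "users.error")
suite_names = [v["expectation_suite_name"] for v in updated_checkpoint["validations"]]
print(suite_names)
```

Deep-copying the entry keeps the new validation independent of the original, so later edits to one action_list do not leak into the other.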
Add validation data to the Checkpoint
In the previous example, the entry you added with your Expectation Suite was paired with the same Batch of data as the original Expectation Suite. However, you can also specify different Batch Requests (and thus different Batches of data) when you add an Expectation Suite. The following Checkpoint configuration file shows how multiple Validations can pair Batches of data with different Expectation Suites and specific Actions:
name: my_fancy_checkpoint
config_version: 1
class_name: Checkpoint
run_name_template: "%Y-%m-foo-bar-template-$VAR"
expectation_suite_name: users.delivery
action_list:
  - name: store_validation_result
    action:
      class_name: StoreValidationResultAction
  - name: store_evaluation_params
    action:
      class_name: StoreEvaluationParametersAction
  - name: update_data_docs
    action:
      class_name: UpdateDataDocsAction
validations:
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.warning
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
    expectation_suite_name: users.error
  - batch_request:
      datasource_name: my_datasource
      data_asset_name: users
      options:
        name: Titanic
    action_list:
      - name: quarantine_failed_data
        action:
          class_name: CreateQuarantineData
      - name: advance_passed_data
        action:
          class_name: CreateQuarantineData
evaluation_parameters:
  param1: "$MY_PARAM"
  param2: 1 + "$OLD_PARAM"
runtime_configuration:
  result_format:
    result_format: BASIC
    partial_unexpected_count: 20
According to this configuration, the locally-specified Expectation Suite users.warning is run against the first batch_request, with the results processed by the Actions specified in the top-level action_list. Similarly, the locally-specified Expectation Suite users.error is run against the second batch_request, with the results also processed by the Actions specified in the top-level action_list. In addition, the top-level Expectation Suite users.delivery is run against the third batch_request (the one whose options set name to Titanic), with the results processed by the union of the Actions in the locally-specified action_list and the top-level action_list.
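The resolution rules described above can be sketched in plain Python. This is an illustration of the described behavior, not the Great Expectations implementation. The function resolve_validations and the helper make_action are hypothetical names. The sketch also assumes that when a local Action and a top-level Action share a name, the local one replaces the top-level one.

```python
def make_action(name, class_name):
    """Build one action_list entry (hypothetical convenience helper)."""
    return {"name": name, "action": {"class_name": class_name}}


def resolve_validations(checkpoint_config):
    """Expand each validation with the top-level defaults it inherits.

    Illustrative only: mirrors the rules described in the text, not the
    Great Expectations implementation.
    """
    top_suite = checkpoint_config.get("expectation_suite_name")
    top_actions = checkpoint_config.get("action_list", [])
    resolved = []
    for validation in checkpoint_config["validations"]:
        # A locally specified suite wins; otherwise use the top-level suite.
        suite = validation.get("expectation_suite_name", top_suite)
        # Union of Actions keyed by name. Assumption: on a name clash the
        # local Action replaces the top-level one.
        actions = {a["name"]: a for a in top_actions}
        actions.update({a["name"]: a for a in validation.get("action_list", [])})
        resolved.append(
            {
                "batch_request": validation["batch_request"],
                "expectation_suite_name": suite,
                "action_names": list(actions),
            }
        )
    return resolved


users_batch = {"datasource_name": "my_datasource", "data_asset_name": "users"}

# Abbreviated version of my_fancy_checkpoint as a plain dictionary.
fancy_checkpoint = {
    "name": "my_fancy_checkpoint",
    "expectation_suite_name": "users.delivery",
    "action_list": [
        make_action("store_validation_result", "StoreValidationResultAction"),
        make_action("store_evaluation_params", "StoreEvaluationParametersAction"),
        make_action("update_data_docs", "UpdateDataDocsAction"),
    ],
    "validations": [
        {"batch_request": users_batch, "expectation_suite_name": "users.warning"},
        {"batch_request": users_batch, "expectation_suite_name": "users.error"},
        {
            "batch_request": {**users_batch, "options": {"name": "Titanic"}},
            "action_list": [
                make_action("quarantine_failed_data", "CreateQuarantineData"),
                make_action("advance_passed_data", "CreateQuarantineData"),
            ],
        },
    ],
}

resolved = resolve_validations(fancy_checkpoint)
for entry in resolved:
    print(entry["expectation_suite_name"], entry["action_names"])
```

The first two entries keep their own suites and inherit only the three top-level Actions. The third entry falls back to users.delivery and ends up with five Actions, the three top-level ones followed by its two local ones.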
For additional Checkpoint configuration information, see Manage Checkpoints.