How to create and edit Expectations with instant feedback from a sample Batch of data
This guide walks you through creating Expectations (verifiable assertions about data) in Interactive Mode. "Interactive Mode" means that you interact with your data as you work: you have access to a Datasource (a standard API for accessing and interacting with data from a wide variety of source systems) and can specify a Batch (a selection of records from a Data Asset) of data to create Expectations against. Working in Interactive Mode does not edit your data; the data is only used to run your Expectations against.
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- A working installation of Great Expectations
- Configured a Data Context
- Created a Datasource
Steps
1. Use the CLI to begin the interactive process of creating Expectations
From the root folder of your Data Context, enter the command:
> great_expectations suite new
This will bring up the following prompt:
Using v3 (Batch Request) API
How would you like to create your Expectation Suite?
1. Manually, without interacting with a sample Batch of data (default)
2. Interactively, with a sample Batch of data
3. Automatically, using a Data Assistant
:
To start the Interactive Mode workflow, enter 2.
You can skip this prompt by including the --interactive flag in your command line input, like so:
> great_expectations suite new --interactive
2. Specify a Datasource (if multiple are available)
Next, the CLI will determine which Data Asset you intend to run your new Expectations against. If you only have one configured Data Asset, it will do so automatically. If there are multiple options, you will be prompted to choose one. If you do not have a Datasource configured, the CLI will exit after alerting you to the issue. You need to have a Datasource configured to continue the Interactive Mode workflow!
3. Specify the name of your new Expectation Suite
Once your Datasource has been determined, you will be prompted with something similar to the following:
Name the new Expectation Suite [{DEFAULT_NAME_BASED_OFF_OF_DATASOURCE_NAME}]:
You may either enter the name that you wish to use, or press enter to proceed with the default name.
You may skip this prompt by specifying an Expectation Suite name from the command line. To do this, include the CLI's --expectation-suite flag when you enter the command to start the process, like so:
> great_expectations suite new --expectation-suite {EXPECTATION_SUITE_NAME}
If you provide the name of an Expectation Suite that already exists, the CLI will alert you to this and exit.
4. Continue the workflow within a Jupyter Notebook
At this point the CLI will create an empty Expectation Suite according to the specifications you provided, and then open a Jupyter Notebook that contains the rest of the Interactive Mode workflow.
The code provided in the Jupyter Notebook will do two things.
In the first cell, it will perform all the necessary steps to provide you with a validator object that is set up to work with your specified Expectation Suite and a Batch Request for your specified Data Asset.
In the final code cell, it will save any edits you have made to your Expectation Suite and then test the Expectation Suite by running a SimpleCheckpoint. The results of that Checkpoint will then be displayed in Data Docs.
In between those two code cells, you will insert the code to create new Expectations for your Expectation Suite. Simply insert an empty cell and create new Expectations by calling specific Expectation methods on the validator object that was created in the first cell.
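As a rough illustration of what such an inserted cell might contain, the sketch below wraps two Expectation method calls in a function. The function name `add_example_expectations` and the column names `pickup_datetime` and `passenger_count` are hypothetical, chosen only for illustration. In the notebook itself you would typically call the methods directly on the `validator` object from the first cell.

```python
def add_example_expectations(validator):
    """Call a few Expectation methods on a validator object.

    `validator` is assumed to be the object created in the first
    notebook cell. The column names are hypothetical placeholders.
    """
    results = []
    # Assert that a (hypothetical) column contains no missing values.
    results.append(
        validator.expect_column_values_to_not_be_null(column="pickup_datetime")
    )
    # Assert that a (hypothetical) numeric column stays within a range.
    results.append(
        validator.expect_column_values_to_be_between(
            column="passenger_count", min_value=1, max_value=6
        )
    )
    return results
```

Each call is evaluated against the sample Batch and returns its result immediately, so you can adjust arguments such as `min_value` and `max_value` and re-run the cell until the Expectation describes your data the way you intend.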
🚀🚀 Congratulations! 🚀🚀
You have used the Interactive Mode to create and edit a new Expectation Suite!
Optional alternative Interactive Mode workflows
1. (Optional) Edit an existing Expectation Suite in Interactive Mode
If you have an existing Expectation Suite that you wish to edit in Interactive Mode, you will need to use the following command:
> great_expectations suite edit {NAME_OF_YOUR_EXPECTATION_SUITE} --interactive
This will open a Jupyter Notebook that will show the Expectations currently configured for your Expectation Suite. You can then add, delete, or otherwise edit these Expectations. When you are done, you will simply save the Expectation Suite and overwrite the old Expectations with the new ones you have executed.
2. (Optional) Profile your data to generate Expectations, then edit them in Interactive Mode
One of the easiest ways to get started in Interactive Mode is to take advantage of the --profile flag (please see How to create and edit Expectations with a Profiler).
Following this workflow will result in your new Expectation Suite being pre-populated with Expectations based on the Profiler's results. After using the Profiler to create your new Expectations, you can then edit them in Interactive Mode as described above.
Additional tips and tricks
1. Save a Batch Request to reuse when editing an Expectation Suite in Interactive Mode
When in Interactive Mode, the initialization cell of your Jupyter Notebook will contain the batch_request dictionary. You can convert it to JSON and save that JSON in a file for future use. The contents of this file would look something like this:
{
  "datasource_name": "my_datasource",
  "data_connector_name": "my_data_connector",
  "data_asset_name": "my_asset"
}
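One way to do the conversion is sketched here using only Python's standard `json` module. It assumes the `batch_request` in your notebook is a plain dictionary of JSON-serializable values, as in the example above. The helper name `save_batch_request` is hypothetical.

```python
import json
from pathlib import Path


def save_batch_request(batch_request, path):
    """Write a batch_request dictionary to `path` as JSON and return the path."""
    path = Path(path)
    # indent=2 keeps the file readable if you want to edit it by hand later.
    path.write_text(json.dumps(batch_request, indent=2), encoding="utf-8")
    return path
```

In a notebook cell you might then call `save_batch_request(batch_request, "my_saved_batch_request_file.json")`, where the file name is only an example.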
You can then use this saved batch_request (containing any refinements you may have made to it in your notebook) and skip the steps of selecting its components:
> great_expectations suite new --interactive --batch-request my_saved_batch_request_file.json
Unless you specify the name of the Expectation Suite on the command line (using the --expectation-suite MY_SUITE syntax), the command will ask you to name your new Expectation Suite and offer a default name that you can accept or replace with your own.
You can extend the previous example to specify the name of the Expectation Suite on the command line as follows:
> great_expectations suite new --expectation-suite my_suite --interactive --batch-request my_saved_batch_request_file.json
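If you launch the CLI from a script, the flag combinations described in this guide could be assembled along these lines. This is only a sketch: the helper name `build_suite_new_command` is hypothetical, and it uses only the --interactive, --expectation-suite, and --batch-request flags mentioned above. It builds and prints the command without running it.

```python
import shlex


def build_suite_new_command(suite_name=None, batch_request_file=None, interactive=True):
    """Return the argument list for `great_expectations suite new`."""
    args = ["great_expectations", "suite", "new"]
    if suite_name is not None:
        # Naming the suite up front skips the naming prompt.
        args += ["--expectation-suite", suite_name]
    if interactive:
        # Skips the prompt asking how you would like to create the suite.
        args.append("--interactive")
    if batch_request_file is not None:
        # Reuses a saved batch_request instead of selecting its components.
        args += ["--batch-request", batch_request_file]
    return args


command = build_suite_new_command("my_suite", "my_saved_batch_request_file.json")
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run` from the root folder of your Data Context if you want the script to start the workflow itself.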
2. Use the built-in help to review the CLI's suite new optional flags
To check the syntax for optional flags that the great_expectations suite new CLI command accepts, you can always run the following command in the root directory of your project (where the great_expectations init command created the great_expectations subdirectory):
> great_expectations suite new --help