Deploy Great Expectations in hosted environments without a file system
The components in the great_expectations.yml file define the Validation Results Stores, Datasource connections, and Data Docs hosts for a Data Context. In hosted environments such as Databricks, Amazon EMR, and Google Cloud Composer, these components might be inaccessible. This page helps you use Great Expectations in those environments.
Configure your Data Context
To use code to create a Data Context, see How to instantiate an Ephemeral Data Context.
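As a rough illustration of creating a Data Context in code, the following sketch builds an Ephemeral Data Context whose stores are kept in memory, so it does not depend on great_expectations.yml. It assumes a Great Expectations release that exposes EphemeralDataContext, DataContextConfig, and InMemoryStoreBackendDefaults at the import paths shown. The function name build_in_memory_context is hypothetical, and the linked guide remains the reference for your installed version.

```python
def build_in_memory_context():
    """Return a Data Context that keeps all of its stores in memory.

    Sketch only: the import paths below are assumed to match your installed
    Great Expectations version.
    """
    # Imported lazily so this module still loads where the library is absent.
    from great_expectations.data_context import EphemeralDataContext
    from great_expectations.data_context.types.base import (
        DataContextConfig,
        InMemoryStoreBackendDefaults,
    )

    # In-memory store defaults mean nothing is read from or written to disk.
    project_config = DataContextConfig(
        store_backend_defaults=InMemoryStoreBackendDefaults()
    )
    return EphemeralDataContext(project_config=project_config)
```

Calling context = build_in_memory_context() in a notebook or job would then give you a context to which you can add Datasources and Expectation Suites.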
To configure a Data Context for a specific environment, see one of the following resources:
- How to instantiate a Data Context on an EMR Spark cluster
- How to use Great Expectations in Databricks
Create Expectation Suites and add Expectations
To add a Datasource and an Expectation Suite, see How to connect to a PostgreSQL database.
To add Expectations to your Suite individually, use the following code:
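# Assumes `validator` is a Validator you already obtained from your Data Context.
validator.expect_column_values_to_not_be_null("my_column")
# Save the Suite, keeping any Expectations that failed against the current data.
validator.save_expectation_suite(discard_failed_expectations=False)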
To configure your Expectation store to load a Suite at a later time, see one of the following resources:
- How to configure an Expectation store to use Amazon S3
- How to configure an Expectation store to use Azure Blob Storage
- How to configure an Expectation store to use GCS
- How to configure an Expectation store to use a filesystem
- How to configure an Expectation store to use PostgreSQL
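To give a feel for what such a store configuration contains, here is a minimal sketch that builds the entries for an S3-backed Expectation store as a plain Python dictionary. The class names ExpectationsStore and TupleS3StoreBackend and the bucket and prefix keys follow the S3 guide listed above. The helper name s3_expectations_store_config, the default store name, and the bucket and prefix values are hypothetical placeholders.

```python
def s3_expectations_store_config(bucket, prefix, store_name="expectations_S3_store"):
    """Build the configuration entries for an S3-backed Expectations store."""
    if not bucket:
        raise ValueError("bucket must be a non-empty S3 bucket name")
    return {
        # Tells the Data Context which store holds Expectation Suites.
        "expectations_store_name": store_name,
        "stores": {
            store_name: {
                "class_name": "ExpectationsStore",
                "store_backend": {
                    "class_name": "TupleS3StoreBackend",
                    "bucket": bucket,
                    "prefix": prefix,
                },
            }
        },
    }


# Placeholder bucket and prefix; replace with your own values.
config = s3_expectations_store_config("my-gx-bucket", "expectations")
```

In practice these entries would be merged into your Data Context configuration alongside the other stores it needs, rather than used on their own.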
Run validation
To use an Expectation Suite you've created to validate data, see How to validate data without a Checkpoint.
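As a minimal sketch of validating without a Checkpoint, the helper below runs a Validator's Suite directly and reports whether every Expectation passed. It assumes you already hold a Validator with a loaded batch and that your Great Expectations version provides validator.validate() returning a result with a success attribute. The helper name suite_passes is hypothetical.

```python
def suite_passes(validator):
    """Run the validator's Expectation Suite and report overall success."""
    # validate() evaluates every Expectation in the Suite against the
    # validator's active batch and returns a result object.
    result = validator.validate()
    return bool(result.success)
```

A job could call suite_passes(validator) and stop the pipeline when it returns False.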
Use Data Docs
To build and view Data Docs in your environment, see Options for hosting Data Docs.