Configure Validation Result Stores
                          A Validation Results Store is a connector that is used
                          to store and retrieve information about objects
                          generated when data is Validated against an
                          Expectation. By default, Validation Results are stored
                          in JSON format in the
                          uncommitted/validations/ subdirectory of
                          your gx/ folder. Use the information
                          provided here to configure a store for your Validation
                          Results.
                        
Validation Results can include sensitive or regulated data that should not be committed to a source control system.
- Amazon S3
- Microsoft Azure Blob Storage
- Google Cloud Storage (GCS)
- Filesystem
- PostgreSQL
Amazon S3
Use the information provided here to configure a new storage location for Validation Results in Amazon S3.
Prerequisites
- Completion of the Quickstart guide.
- A working installation of Great Expectations.
- A Data Context.
- An Expectation Suite.
- A Checkpoint.
- Permissions to install boto3 in your local environment.
- An S3 bucket and prefix for the Validation Results.
Install boto3 in your local environment
                                Python interacts with AWS through the
                                boto3 library. Great Expectations
                                makes use of this library in the background when
                                working with AWS. Although you won't use
                                boto3 directly, you'll need to
                                install it in your virtual environment.
                              
                                Run one of the following pip commands to install
                                boto3 in your virtual environment:
                              
python -m pip install boto3
or
python3 -m pip install boto3
                                To set up
                                boto3
                                with AWS, and use boto3 within
                                Python, see the
                                Boto3 documentation.
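After installation, a short Python check like the following sketch can confirm that boto3 is importable from the active virtual environment. The helper name is_module_available is illustrative only and is not part of Great Expectations or boto3.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True when the named top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Report whether boto3 is visible to the interpreter that runs this script.
if is_module_available("boto3"):
    print("boto3 is installed in this environment.")
else:
    print("boto3 is missing; run: python -m pip install boto3")
```

Run the script with the same interpreter that runs Great Expectations, because each virtual environment has its own set of installed packages.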
                              
Verify your AWS credentials are properly configured
Run the following command in the AWS CLI to verify that your AWS credentials are properly configured:
aws sts get-caller-identity
                                When your credentials are properly configured,
                                your UserId, Account,
                                and Arn are returned. If your
                                credentials are not configured correctly, an
                                error message appears. If you received an error
                                message, or you couldn't verify your
                                credentials, see
                                Configuring the AWS CLI.
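The same check can be sketched in Python with boto3, which exposes the STS GetCallerIdentity call as get_caller_identity. The function names fetch_caller_identity and summarize_identity are hypothetical helpers written for this illustration. The first one needs boto3, valid credentials, and network access, so it is defined here but not called.

```python
def fetch_caller_identity() -> dict:
    """Ask AWS STS who the current credentials belong to (needs boto3 and network access)."""
    import boto3  # imported lazily so the formatting helper below works without boto3

    return boto3.client("sts").get_caller_identity()


def summarize_identity(response: dict) -> str:
    """Format the three fields that `aws sts get-caller-identity` reports."""
    missing = [key for key in ("UserId", "Account", "Arn") if key not in response]
    if missing:
        raise ValueError(f"Response is missing expected fields: {missing}")
    return (
        f"UserId={response['UserId']} "
        f"Account={response['Account']} "
        f"Arn={response['Arn']}"
    )
```

With working credentials, print(summarize_identity(fetch_caller_identity())) should show the same three values as the CLI command. If the credentials are not configured, the boto3 call raises an exception instead.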
                              
Identify your Data Context Validation Results Store
Your Validation Results Store configuration is in your Data Context.
The following section in your Data Context great_expectations.yml file tells Great Expectations to look for Validation Results in a Store named validations_store. It also creates a ValidationsStore named validations_store that is backed by a filesystem and stores Validation Results under the base_directory uncommitted/validations (the default).
                              
stores:
  validations_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: uncommitted/validations/
validations_store_name: validations_store
Update your configuration file to include a new Store for Validation Results
                                To manually add a Validation Results Store, add
                                the following configuration to the
                                stores section of your
                                great_expectations.yml file:
                              
stores:
  validations_S3_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleS3StoreBackend
      bucket: '<your>'
      prefix: '<your>'  # Bucket and prefix in combination must be unique across all stores
                                As shown in the previous example, you need to
                                change the default
                                store_backend settings to make the
                                Store work with S3. The
                                class_name is set to
                                TupleS3StoreBackend,
                                bucket is the address of your S3
                                bucket, and prefix is the folder in
                                your S3 bucket where Validation Results are
                                located.
                              
                                The following example shows the additional
                                options that are available to customize
                                TupleS3StoreBackend:
                              
class_name: ValidationsStore
store_backend:
  class_name: TupleS3StoreBackend
  bucket: '<your_s3_bucket_name>'
  prefix: '<your_s3_bucket_folder_name>'  # Bucket and prefix in combination must be unique across all stores
  boto3_options:
    endpoint_url: ${S3_ENDPOINT} # Uses the S3_ENDPOINT environment variable to determine which endpoint to use.
    region_name: '<your_aws_region_name>'
                                In the previous example, the Store name is
                                validations_S3_store. If you use a
                                personalized Store name, you must also update
                                the value of the
                                validations_store_name key to match
                                the Store name. For example:
                              
validations_store_name: validations_S3_store
                                When you update the
                                validations_store_name key value,
                                Great Expectations uses the new Store for
                                Validation Results.
                              
                                Add the following code to
                                great_expectations.yml to configure
                                the IAM user:
                              
class_name: ValidationsStore
store_backend:
  class_name: TupleS3StoreBackend
  bucket: '<your_s3_bucket_name>'
  prefix: '<your_s3_bucket_folder_name>' # Bucket and prefix in combination must be unique across all stores
  boto3_options:
    aws_access_key_id: ${AWS_ACCESS_KEY_ID} # Uses the AWS_ACCESS_KEY_ID environment variable to get aws_access_key_id.
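    aws_secret_access_key: ${AWS_SECRET_ACCESS_KEY} # Uses the AWS_SECRET_ACCESS_KEY environment variable to get aws_secret_access_key.
    aws_session_token: ${AWS_SESSION_TOKEN} # Uses the AWS_SESSION_TOKEN environment variable to get aws_session_token.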
                                Add the following code to
                                great_expectations.yml to configure
                                the IAM Assume Role:
                              
class_name: ValidationsStore
store_backend:
  class_name: TupleS3StoreBackend
  bucket: '<your_s3_bucket_name>'
  prefix: '<your_s3_bucket_folder_name>' # Bucket and prefix in combination must be unique across all stores
  boto3_options:
    assume_role_arn: '<your_role_to_assume>'
    region_name: '<your_aws_region_name>'
    assume_role_duration: session_duration_in_seconds
If you are also storing Expectations in S3 (see How to configure an Expectation store to use Amazon S3) or Data Docs in S3 (see How to host and share Data Docs), make sure the prefix values are disjoint and one is not a substring of the other.
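The prefix rule can be checked before you edit the configuration. The following is a minimal sketch, and the prefix names in it are hypothetical examples rather than values that Great Expectations requires.

```python
def prefixes_conflict(prefix_a: str, prefix_b: str) -> bool:
    """Return True when either prefix is a substring of the other."""
    a = prefix_a.strip("/")
    b = prefix_b.strip("/")
    return a in b or b in a


# Hypothetical prefixes for a Validation Results Store and two other stores.
print(prefixes_conflict("validations", "expectations"))         # disjoint
print(prefixes_conflict("validations", "validations_archive"))  # one contains the other
```

Note that an empty prefix is a substring of every other prefix, so this check reports a conflict whenever one of the stores has no prefix at all.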
                                  
Copy existing Validation Results to the S3 bucket (Optional)
If you are converting an existing local Great Expectations deployment to one that works in AWS, you might have Validation Results saved that you want to transfer to your S3 bucket.
                                To copy Validation Results into Amazon S3, use
                                the aws s3 sync command as shown in
                                the following example:
                              
aws s3 sync '<base_directory>' s3://'<your_s3_bucket_name>'/'<your_s3_bucket_folder_name>'
                                The base_directory is set to
                                uncommitted/validations/ by
                                default.
                              
                                In the following example, the Validation Results
                                Validation1 and
                                Validation2 are copied to Amazon S3
                                and a confirmation message is returned:
                              
upload: uncommitted/validations/val1/val1.json to s3://'<your_s3_bucket_name>'/'<your_s3_bucket_folder_name>'/val1.json
upload: uncommitted/validations/val2/val2.json to s3://'<your_s3_bucket_name>'/'<your_s3_bucket_folder_name>'/val2.json
Confirm the configuration
Run a Checkpoint to store results in the new Validation Results Store on S3, and then visualize the results by re-building Data Docs.
Microsoft Azure Blob Storage
Use the information provided here to configure a new storage location for Validation Results in Azure Blob Storage.
Prerequisites
- A Data Context.
- An Expectation Suite.
- A Checkpoint.
- An Azure Storage account and its connection string.
- An Azure Blob container. If you want to host and share Data Docs on Azure Blob Storage, you can set this up first and then use the existing $web container to store your Validation Results.
- A prefix (folder) to store Validation Results. You don't need to create the folder; the prefix is just part of the blob name.
Configure the config_variables.yml file with your Azure Storage credentials
                              
                              
                                GX recommends that you store Azure Storage
                                credentials in the
                                config_variables.yml file, which is
                                located in the uncommitted/ folder
                                by default, and is not part of source control.
                                The following code adds Azure Storage
                                credentials under the key
                                AZURE_STORAGE_CONNECTION_STRING:
                              
AZURE_STORAGE_CONNECTION_STRING: "DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=<YOUR-STORAGE-ACCOUNT-NAME>;AccountKey=<YOUR-STORAGE-ACCOUNT-KEY==>"
To learn more about the additional options for configuring the config_variables.yml file, or additional environment variables, see How to configure credentials.
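Before you save the connection string, it can help to confirm that it is well formed. The sketch below splits a connection string of the form shown above into its key-value pairs. The function parse_connection_string is a hypothetical helper and is not part of Great Expectations or the Azure SDK.

```python
def parse_connection_string(connection_string: str) -> dict:
    """Split 'Key=Value;Key=Value' into a dict, keeping any '=' inside a value."""
    parts = {}
    for segment in connection_string.split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only, because account keys usually end in '='.
        key, separator, value = segment.partition("=")
        if not separator:
            raise ValueError(f"Malformed segment (no '='): {segment!r}")
        parts[key] = value
    return parts


example = (
    "DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;"
    "AccountName=<YOUR-STORAGE-ACCOUNT-NAME>;AccountKey=<YOUR-STORAGE-ACCOUNT-KEY==>"
)
print(sorted(parse_connection_string(example)))
```

Splitting on the first equals sign matters because the account key is base64 text that often ends with one or two equals signs.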
                              
Identify your Validation Results Store
Your Validation Results Store configuration is provided in your Data Context. Open great_expectations.yml and find the following entry:
                              
validations_store_name: validations_store
stores:
  validations_store:
      class_name: ValidationsStore
      store_backend:
          class_name: TupleFilesystemStoreBackend
          base_directory: uncommitted/validations/
                                This configuration tells Great Expectations to
                                look for Validation Results in a Store named
                                validations_store. The default
                                base_directory for
                                validations_store is
                                uncommitted/validations/.
                              
Update your configuration file to include a new Store for Validation Results on Azure Storage account
                                In the following example,
                                validations_store_name is set to
                                validations_AZ_store, but it can be
                                personalized. You also need to change the
                                store_backend settings. The
                                class_name is
                                TupleAzureBlobStoreBackend,
                                container is the name of your blob
                                container where Validation Results are stored,
                                prefix is the folder in the
                                container where Validation Result files are
                                located, and connection_string is set to
                                ${AZURE_STORAGE_CONNECTION_STRING} to reference the corresponding key in the
                                config_variables.yml file.
                              
validations_store_name: validations_AZ_store
stores:
  validations_AZ_store:
      class_name: ValidationsStore
      store_backend:
          class_name: TupleAzureBlobStoreBackend
          container: <blob-container>
          prefix: validations
          connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
If the container for hosting and sharing Data Docs on Azure Blob Storage is named $web, use container: \$web to allow access to the $web container.
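The way a ${NAME} reference resolves against config_variables.yml, and why a literal dollar sign such as the one in $web needs a backslash, can be illustrated with a small substitution routine. This is a sketch under stated assumptions: it is not the Great Expectations implementation, the function name substitute is hypothetical, and the rule that both ${NAME} and $NAME are treated as references is assumed for the purpose of the example.

```python
import re

# Matches an escaped dollar sign, a ${NAME} reference, or a $NAME reference.
_TOKEN = re.compile(r"\\\$|\$\{(\w+)\}|\$(\w+)")


def substitute(value: str, variables: dict) -> str:
    """Replace variable references in value and unescape backslash-dollar."""

    def _resolve(match: re.Match) -> str:
        if match.group(0) == "\\$":
            return "$"  # an escaped dollar sign becomes a literal one
        name = match.group(1) or match.group(2)
        if name not in variables:
            raise KeyError(f"No value configured for variable {name!r}")
        return str(variables[name])

    return _TOKEN.sub(_resolve, value)


variables = {"AZURE_STORAGE_CONNECTION_STRING": "<your_connection_string>"}
print(substitute("${AZURE_STORAGE_CONNECTION_STRING}", variables))
print(substitute(r"\$web", variables))
```

In this sketch an unescaped $web would be read as a reference to a variable named web and fail, which is the kind of problem the backslash avoids.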
                                  
Additional authentication and configuration options are available. See Host and Share Data Docs on Azure Blob Storage.
Copy existing Validation Results JSON files to the Azure blob (Optional)
                                You can use the
                                az storage blob upload command to
                                copy Validation Results into Azure Blob Storage.
                                The following command copies one Validation
                                Result from a local folder to the Azure blob:
                              
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=<YOUR-STORAGE-ACCOUNT-NAME>;AccountKey=<YOUR-STORAGE-ACCOUNT-KEY==>"
az storage blob upload -f <local/path/to/validation.json> -c <GREAT-EXPECTATION-DEDICATED-AZURE-BLOB-CONTAINER-NAME> -n <PREFIX>/<validation.json>
Example with a Validation Result related to the exp1 Expectation Suite:
az storage blob upload -f gx/uncommitted/validations/exp1/20210306T104406.877327Z/20210306T104406.877327Z/8313fb37ca59375eb843adf388d4f882.json -c <blob-container> -n validations/exp1/20210306T104406.877327Z/20210306T104406.877327Z/8313fb37ca59375eb843adf388d4f882.json
Finished[#############################################################]  100.0000%
{
"etag": "\"0x8D8E09F894650C7\"",
"lastModified": "2021-03-06T12:58:28+00:00"
}
To learn more about other methods that are available to copy Validation Result JSON files into Azure Blob Storage, see Quickstart: Upload, download, and list blobs with the Azure portal.
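When you upload several files, the blob name passed to the -n option follows a pattern: the Store prefix, followed by the file path relative to the local validations folder. The following sketch derives that name. The function blob_name_for is a hypothetical helper, and the paths are the ones used in the example above.

```python
from pathlib import PurePosixPath


def blob_name_for(local_path: str, validations_dir: str, prefix: str) -> str:
    """Build the blob name: the prefix plus the path relative to the validations folder."""
    relative = PurePosixPath(local_path).relative_to(PurePosixPath(validations_dir))
    return str(PurePosixPath(prefix) / relative)


local_file = (
    "gx/uncommitted/validations/exp1/20210306T104406.877327Z/"
    "20210306T104406.877327Z/8313fb37ca59375eb843adf388d4f882.json"
)
print(blob_name_for(local_file, "gx/uncommitted/validations", "validations"))
```

Keeping the relative path intact preserves the suite name and run identifiers in the blob name, as in the upload example.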
Reference the new configuration
                                To make Great Expectations look for Validation
                                Results on the Azure store, set the
                                validations_store_name variable to
                                the name of your Azure Validations Store. In the
                                previous example this was
                                validations_AZ_store.
                              
Confirm that the Validation Results Store has been correctly configured
Run a Checkpoint to store results in the new Validation Results Store on Azure Blob and then visualize the results by re-building Data Docs.
GCS
Use the information provided here to configure a new storage location for Validation Results in GCS.
To view all the code used in this topic, see how_to_configure_a_validation_result_store_in_gcs.py.
Prerequisites
- A Data Context.
- An Expectation Suite.
- A Checkpoint.
- A GCP service account with credentials that allow access to GCP resources such as Storage Objects.
- A GCP project, GCS bucket, and prefix to store Validation Results.
Configure your GCP credentials
Confirm that your environment is configured with the appropriate authentication credentials needed to connect to the GCS bucket where Validation Results will be stored. This includes the following:
- A GCP service account.
- Setting the GOOGLE_APPLICATION_CREDENTIALS environment variable.
- Verifying authentication by running a Google Cloud Storage client library script.
For more information about validating your GCP authentication credentials, see Authenticate to Cloud services using client libraries.
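A quick local check of the environment variable can catch the most common setup mistake, a path that does not point to a file. The sketch below only inspects the variable and the filesystem, and it does not contact Google Cloud. The function credentials_file_from_env is a hypothetical helper.

```python
import os


def credentials_file_from_env(environ=None) -> str:
    """Return the path in GOOGLE_APPLICATION_CREDENTIALS after checking that it exists."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {path}"
        )
    return path
```

Calling credentials_file_from_env() with no arguments reads the real environment. Passing a dictionary makes the check easy to exercise without changing your shell. A passing check shows only that the file exists, not that the service account has access to the bucket.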
Identify your Data Context Validation Results Store
The configuration for your Validation Results Store is available in your Data Context. Open great_expectations.yml and find the following entry:
                              
stores:
  validations_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: uncommitted/validations/
validations_store_name: validations_store
                                This configuration tells Great Expectations to
                                look for Validation Results in the
                                validations_store Store. The
                                default base_directory for
                                validations_store is
                                uncommitted/validations/.
                              
Update your configuration file to include a new Store for Validation Results
                                In the following example,
                                validations_store_name is set to
                                validations_GCS_store, but it can
                                be personalized. You also need to change the
                                store_backend settings. The
                                class_name is
                                TupleGCSStoreBackend,
                                project is your GCP project,
                                bucket is the address of your GCS
                                bucket, and prefix is the folder on
                                GCS where Validation Result files are stored.
                              
stores:
  validations_GCS_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleGCSStoreBackend
      project: <your>
      bucket: <your>
      prefix: <your>
validations_store_name: validations_GCS_store
                                    If you are also storing
                                    Expectations in GCS
                                    or
                                    DataDocs in GCS, make sure that the
                                    prefix values are disjoint and
                                    one is not a substring of the other.
                                  
Copy existing Validation Results to the GCS bucket (Optional)
Use the gsutil cp command to copy Validation Results into GCS. For example, the following commands copy the Validation Results validation_1 and validation_2 into a GCS bucket:
                              
gsutil cp uncommitted/validations/my_expectation_suite/validation_1.json gs://<your>/<your>/validation_1.json
gsutil cp uncommitted/validations/my_expectation_suite/validation_2.json gs://<your>/<your>/validation_2.json
The following confirmation message is returned:
Operation completed over 2 objects
Additional methods for copying Validation Results into GCS are available. See Upload objects from a filesystem.
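If you script the copy, the destination for each file follows the pattern in the gsutil example: the bucket, then the prefix, then the file name. The sketch below builds that URI. The function gcs_destination is a hypothetical helper, and my-bucket and validations are placeholder values, not names from your project.

```python
from pathlib import PurePosixPath


def gcs_destination(bucket: str, prefix: str, local_path: str) -> str:
    """Build gs://<bucket>/<prefix>/<file name> for one local Validation Result file."""
    file_name = PurePosixPath(local_path).name
    # Skip the prefix segment when it is empty so the URI has no double slash.
    key = "/".join(part for part in (prefix.strip("/"), file_name) if part)
    return f"gs://{bucket}/{key}"


print(
    gcs_destination(
        "my-bucket",
        "validations",
        "uncommitted/validations/my_expectation_suite/validation_1.json",
    )
)
```

This mirrors the example commands, which place each file directly under the prefix.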
Reference the new configuration
                                To make Great Expectations look for Validation
                                Results on the GCS store, set the
                                validations_store_name variable to
                                the name of your GCS Validations Store. In the
                                previous example this was
                                validations_GCS_store.
                              
Confirm that the Validation Results Store has been correctly configured
Run a Checkpoint to store results in the new Validation Results Store on GCS, and then visualize the results by re-building Data Docs.
Filesystem
Use the information provided here to configure a new storage location for Validation Results in your filesystem. You'll learn how to use an Action to update Data Docs sites with new Validation Results from Checkpoint runs.
Prerequisites
- A Data Context.
- An Expectation Suite.
- A Checkpoint.
- A new storage location to store Validation Results. This can be a local path, or a path to a secure network filesystem.
Create a new folder for Validation Results
Run the following commands to create a new folder for your Validation Results and move your existing Validation Results to the new folder:
# in the gx/ folder
mkdir uncommitted/shared_validations
mv uncommitted/validations/npi_validations/ uncommitted/shared_validations/
In this example, the name of the Validation Result is npi_validations and the path to the new storage location is uncommitted/shared_validations/.
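The same move can be scripted in Python, which is convenient when the step is part of a larger migration. This is a minimal sketch that assumes the folder layout used in this example. The function move_validation_results is a hypothetical helper.

```python
import shutil
from pathlib import Path


def move_validation_results(gx_root: Path, suite_dir: str, new_store: str) -> Path:
    """Move uncommitted/validations/<suite_dir> into uncommitted/<new_store>/."""
    source = gx_root / "uncommitted" / "validations" / suite_dir
    target_parent = gx_root / "uncommitted" / new_store
    # Create the new storage folder if it does not exist yet.
    target_parent.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(source), str(target_parent / suite_dir)))
```

For the folders in this example, call move_validation_results(Path("gx"), "npi_validations", "shared_validations") from the directory that contains gx/. The function returns the new location of the moved folder.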
                              
Identify your Data Context Validation Results Store
The configuration for your Validation Results Store is available in your Data Context. Open great_expectations.yml and find the following entry:
                              
validations_store_name: validations_store
stores:
   validations_store:
       class_name: ValidationsStore
       store_backend:
           class_name: TupleFilesystemStoreBackend
           base_directory: uncommitted/validations/
                                This configuration tells Great Expectations to
                                look for Validation Results in the
                                validations_store Store. The
                                default base_directory for
                                validations_store is
                                uncommitted/validations/.
                              
Update your configuration file to include a new Store for Validation Results
                                In the following example,
                                validations_store_name is set to
                                shared_validations_filesystem_store, but it can be personalized. Also,
                                base_directory is set to
                                uncommitted/shared_validations/,
                                but you can set it to another path that is
                                accessible by Great Expectations.
                              
validations_store_name: shared_validations_filesystem_store
stores:
   shared_validations_filesystem_store:
       class_name: ValidationsStore
       store_backend:
           class_name: TupleFilesystemStoreBackend
           base_directory: uncommitted/shared_validations/
Confirm that the Validation Results Store has been correctly configured
Run a Checkpoint to store results in the new Validation Results Store in your new location, and then visualize the results by re-building Data Docs.
PostgreSQL
Use the information provided here to configure Great Expectations to store Validation Results in a PostgreSQL database.
Prerequisites
- A Data Context.
- An Expectation Suite.
- A Checkpoint.
- A PostgreSQL database with appropriate credentials.
Configure the config_variables.yml file with your database credentials
                              
                              
GX recommends storing database credentials in the config_variables.yml file, which is located in the uncommitted/ folder by default, and is not part of source control.
                              
- To add database credentials, open config_variables.yml and add the following entry under the db_creds key:
db_creds:
  drivername: postgresql
  host: '<your_host_name>'
  port: '<your_port>'
  username: '<your_username>'
  password: '<your_password>'
  database: '<your_database_name>'
To configure the config_variables.yml file, or additional environment variables, see How to configure credentials.
- Optional. To use a specific schema as the backend, specify schema as an additional keyword argument. For example:
db_creds:
  drivername: postgresql
  host: '<your_host_name>'
  port: '<your_port>'
  username: '<your_username>'
  password: '<your_password>'
  database: '<your_database_name>'
  schema: '<your_schema_name>'
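The db_creds keys correspond to the parts of a standard database URL, and seeing them combined can help when you debug a connection. The sketch below assembles such a URL with the standard library only. It illustrates how the fields fit together and is not how Great Expectations builds its connection. The function build_database_url and the sample values are hypothetical.

```python
from urllib.parse import quote_plus


def build_database_url(creds: dict) -> str:
    """Combine db_creds-style fields into drivername://user:password@host:port/database."""
    # Percent-encode the user name and password so characters like '@' stay unambiguous.
    user = quote_plus(creds["username"])
    password = quote_plus(creds["password"])
    return (
        f"{creds['drivername']}://{user}:{password}"
        f"@{creds['host']}:{creds['port']}/{creds['database']}"
    )


sample = {
    "drivername": "postgresql",
    "host": "localhost",
    "port": "5432",
    "username": "gx_user",
    "password": "p@ss",
    "database": "gx",
}
print(build_database_url(sample))
```

The optional schema key is not part of the URL, so it is left out of this sketch.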
Identify your Data Context Validation Results Store
The configuration for your Validation Results Store is available in your Data Context. Open great_expectations.yml and find the following entry:
                              
validations_store_name: validations_store
stores:
  validations_store:
      class_name: ValidationsStore
      store_backend:
          class_name: TupleFilesystemStoreBackend
          base_directory: uncommitted/validations/
                                This configuration tells Great Expectations to
                                look for Validation Results in the
                                validations_store Store. The
                                default base_directory for
                                validations_store is
                                uncommitted/validations/.
                              
Update your configuration file to include a new Validation Results Store
                                Add the following entry to your
                                great_expectations.yml:
                              
validations_store_name: validations_postgres_store
stores:
  validations_postgres_store:
      class_name: ValidationsStore
      store_backend:
          class_name: DatabaseStoreBackend
          credentials: ${db_creds}
                                In the previous example,
                                validations_store_name is set to
                                validations_postgres_store, but it
                                can be personalized. Also,
                                class_name is set to
                                DatabaseStoreBackend, and
                                credentials is set to
                                ${db_creds}, which references the
                                corresponding key in the
                                config_variables.yml file.
                              
Confirm the addition of the new Validation Results Store
In the previous example, a validations_store on the local filesystem and a validations_postgres_store are configured. Great Expectations looks for Validation Results in PostgreSQL when the validations_store_name variable is set to validations_postgres_store. You can remove the validations_store entry if you no longer need it. Run the following command to list your Stores and confirm the validations_postgres_store configuration:
                              
great_expectations store list
- name: validations_store
class_name: ValidationsStore
store_backend:
  class_name: TupleFilesystemStoreBackend
  base_directory: uncommitted/validations/
- name: validations_postgres_store
class_name: ValidationsStore
store_backend:
  class_name: DatabaseStoreBackend
  credentials:
      database: '<your_db_name>'
      drivername: postgresql
      host: '<your_host_name>'
      password: ******
      port: '<your_port>'
      username: '<your_username>'
Confirm the Validation Results Store is configured correctly
Run a Checkpoint to store results in the new Validation Results Store in PostgreSQL, and then visualize the results by re-building Data Docs.
                                Great Expectations creates a new table in your
                                database named
                                ge_validations_store, and populates
                                the fields with information from the Validation
                                Results.