Version: 0.14.13

How to create and edit Expectations in bulk

The JsonSchemaProfiler helps you quickly create Expectation Suites (collections of verifiable assertions about data) from jsonschema files.

Prerequisites: This how-to guide assumes you have:
  • Completed the Getting Started Tutorial
  • A working installation of Great Expectations
  • A valid jsonschema file whose top-level object is of type object
Danger: This implementation does not traverse any levels of nesting.
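Because nesting is not traversed, it can help to check a schema before profiling it. The following is a minimal sketch, not part of Great Expectations: the helper name `find_nested_properties` and the example schema are hypothetical, and the check only looks at the top-level `properties` for keywords that commonly indicate nested structure.

```python
def find_nested_properties(schema):
    """Return the names of top-level properties that describe nested data."""
    nested = []
    for name, spec in schema.get("properties", {}).items():
        if not isinstance(spec, dict):
            # jsonschema also allows boolean subschemas; they carry no nesting
            continue
        prop_type = spec.get("type")
        types = prop_type if isinstance(prop_type, list) else [prop_type]
        if (
            "object" in types
            or "array" in types
            or "properties" in spec
            or "items" in spec
        ):
            nested.append(name)
    return nested


# Hypothetical schema used only for illustration
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer", "minimum": 1},
        "name": {"type": "string", "maxLength": 50},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        },
    },
}

# The guide requires the top-level object to be of type "object"
assert schema.get("type") == "object"

print(find_nested_properties(schema))  # ['address']
```

Any property this reports contains structure that, per the warning above, will not be traversed.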

Steps

1. Set a filename and a suite name

jsonschema_file = "YOUR_JSON_SCHEMA_FILE.json"
suite_name = "YOUR_SUITE_NAME"

2. Load a DataContext

context = ge.data_context.DataContext()

3. Load the jsonschema file

with open(jsonschema_file, "r") as f:
    schema = json.load(f)
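If you want the loading step to fail early when the file does not meet the prerequisite of a top-level object, a small wrapper can help. This is a sketch using only the standard library: `load_object_schema` is a hypothetical helper, not a Great Expectations function, and the temporary file exists only to make the example runnable.

```python
import json
import os
import tempfile


def load_object_schema(path):
    """Load a jsonschema file and confirm its top level is of type "object"."""
    with open(path, "r") as f:
        schema = json.load(f)
    if not isinstance(schema, dict) or schema.get("type") != "object":
        raise ValueError(f"{path}: top-level schema must have type 'object'")
    return schema


# Write a small hypothetical schema to a temporary file, then load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "example_schema.json")
    with open(example_path, "w") as f:
        json.dump({"type": "object", "properties": {"id": {"type": "integer"}}}, f)
    schema = load_object_schema(example_path)

print(sorted(schema["properties"]))  # ['id']
```

The returned dictionary has the same shape as the `schema` variable used in the remaining steps.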

4. Instantiate the profiler

profiler = JsonSchemaProfiler()

5. Create the suite

suite = profiler.profile(schema, suite_name)

6. Save the suite

context.save_expectation_suite(suite)

7. Optionally, generate Data Docs and review the results there.

Data Docs (human-readable documentation generated from Great Expectations metadata, detailing Expectations, Validation Results, etc.) provide a concise and useful way to review the Expectation Suite that has been created.

In Python, this is done by calling the build_data_docs() method of your Data Context (the primary entry point for a Great Expectations deployment, with configurations and methods for all supporting components).

context.build_data_docs()

You can also review and update the Expectations created by the Profiler (which generates Metrics and candidate Expectations from data) to get to the Expectation Suite you want using:

great_expectations suite edit

Additional notes

Note that JsonSchemaProfiler generates Expectation Suites using column map Expectations (verifiable assertions about data), which assume a tabular data structure, because Great Expectations does not currently support nested data structures.
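To make the tabular assumption concrete, one possible approach (an assumption of this example, not something JsonSchemaProfiler does for you) is to flatten nested records into dotted column names before validating them. The helper `flatten_record` and the sample record are hypothetical.

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the column prefix along
            flat.update(flatten_record(value, column, sep))
        else:
            flat[column] = value
    return flat


# Hypothetical nested record
record = {"id": 1, "address": {"city": "Oslo", "geo": {"lat": 59.9}}}

print(flatten_record(record))
# {'id': 1, 'address.city': 'Oslo', 'address.geo.lat': 59.9}
```

Each key of the flattened record corresponds to one column, which is the shape that column map Expectations operate on.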

The full example script is here:

import json
import great_expectations as ge
from great_expectations.profile.json_schema_profiler import JsonSchemaProfiler

jsonschema_file = "YOUR_JSON_SCHEMA_FILE.json"
suite_name = "YOUR_SUITE_NAME"

# Load the Data Context for the current project
context = ge.data_context.DataContext()

# Read and parse the jsonschema file
with open(jsonschema_file, "r") as f:
    raw_json = f.read()
    schema = json.loads(raw_json)

print("Generating suite...")
profiler = JsonSchemaProfiler()
suite = profiler.profile(schema, suite_name)
context.save_expectation_suite(suite)