How to instantiate an Ephemeral Data Context
An Ephemeral Data Context is a temporary, in-memory Data Context. It is ideal for data exploration and initial analysis when you do not want to save anything to an existing project, or when you need to work in a hosted environment such as an EMR Spark Cluster.
Prerequisites
- A Great Expectations instance. See Setup: Overview.
Steps
1. Import necessary classes for instantiating an Ephemeral Data Context
To create our Data Context, we will create a configuration that uses in-memory Metadata Stores. This will require two classes from the Great Expectations module: the DataContextConfig class and the InMemoryStoreBackendDefaults class. These can be imported with the code:
from great_expectations.data_context.types.base import (
    DataContextConfig,
    InMemoryStoreBackendDefaults,
)
We will also need to import the EphemeralDataContext class that we will be creating an instance of:
from great_expectations.data_context import EphemeralDataContext
2. Create the Data Context configuration
To create a Data Context configuration that specifies the use of in-memory Metadata Stores, we will pass in an instance of the InMemoryStoreBackendDefaults class as a parameter when initializing an instance of the DataContextConfig class:
project_config = DataContextConfig(
    store_backend_defaults=InMemoryStoreBackendDefaults()
)
3. Instantiate an Ephemeral Data Context
To create our Ephemeral Data Context instance, we initialize the EphemeralDataContext class while passing in the DataContextConfig instance we previously created as the value of the project_config parameter:
context = EphemeralDataContext(project_config=project_config)
We now have an Ephemeral Data Context to use for the rest of this Python session.
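The three steps above can be gathered into one small helper, which may be convenient in notebooks or test setup code. This is only a sketch: the function name build_ephemeral_context is hypothetical and not part of Great Expectations, and it assumes the installed GX version exposes the same imports shown in the steps above.

```python
def build_ephemeral_context():
    """Return a new Ephemeral Data Context backed by in-memory stores.

    Hypothetical helper: it only wraps the three steps of this guide.
    """
    # Step 1: import the required classes. The imports are deferred to call
    # time so the helper can be defined before Great Expectations is loaded.
    from great_expectations.data_context import EphemeralDataContext
    from great_expectations.data_context.types.base import (
        DataContextConfig,
        InMemoryStoreBackendDefaults,
    )

    # Step 2: build a configuration that uses in-memory Metadata Stores.
    project_config = DataContextConfig(
        store_backend_defaults=InMemoryStoreBackendDefaults()
    )

    # Step 3: instantiate the Ephemeral Data Context from that configuration.
    return EphemeralDataContext(project_config=project_config)
```

Calling context = build_ephemeral_context() then yields a context equivalent to the one created step by step above. Nothing is written to disk, so each call starts from an empty context.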
An Ephemeral Data Context is an in-memory Data Context that is not intended to persist beyond the current Python session. However, if you decide that you would like to save its contents for future use, you can do so by converting it to a Filesystem Data Context:
context = context.convert_to_file_context()
This method will initialize a Filesystem Data Context in the current working directory of the Python process that contains the Ephemeral Data Context. For a more detailed explanation of this method, please see our guide on how to convert an Ephemeral Data Context to a Filesystem Data Context.
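Because the conversion targets the current working directory, it can help to switch directories temporarily when the project should live elsewhere. The following is a minimal sketch under that assumption: convert_in_directory and target_dir are hypothetical names, and the only GX call used is the convert_to_file_context() method shown above.

```python
import os
from pathlib import Path


def convert_in_directory(context, target_dir):
    """Run context.convert_to_file_context() with target_dir as the cwd.

    Hypothetical helper; the original working directory is restored
    afterwards, even if the conversion raises an exception.
    """
    target = Path(target_dir)
    # Create the destination if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)

    previous = Path.cwd()
    os.chdir(target)
    try:
        # The conversion initializes the Filesystem Data Context in the
        # current working directory, which is now target_dir.
        return context.convert_to_file_context()
    finally:
        # Always return to where the process started.
        os.chdir(previous)
```

A call such as context = convert_in_directory(context, "my_gx_project") would then place the new project under that folder, where my_gx_project is just an example path. Changing the working directory affects the whole process, so this approach is best avoided in multi-threaded code.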
Next steps
Connecting GX to source data systems
Now that you have an Ephemeral Data Context, you will want to connect GX to your data. For this, please see the appropriate guides from the following:
Connecting GX to filesystem source data
Local Filesystems
- How to quickly connect to a single file using Pandas
- How to connect to one or more files using Pandas
- How to connect to one or more files using Spark
Google Cloud Storage
Azure Blob Storage
- How to connect to data on Azure Blob Storage using Pandas
- How to connect to data on Azure Blob Storage using Spark
Amazon Web Services
Connecting GX to in-memory source data
Connecting GX to SQL source data
General SQL Datasources
Specific SQL dialects
Preserving the contents of an Ephemeral Data Context
An Ephemeral Data Context is a temporary, in-memory object. It will not persist beyond the current Python session. If you decide that you would like to keep the contents of your Ephemeral Data Context for future use, please see: