How to convert an Ephemeral Data Context to a Filesystem Data Context
An Ephemeral Data Context is a temporary, in-memory Data Context that will not persist beyond the current Python session. If you decide to save the contents of an Ephemeral Data Context for future use, you can do so by converting it to a Filesystem Data Context.
Prerequisites
- A working installation of Great Expectations
- An Ephemeral Data Context instance
If you still need to set up and install GX...
If you still need to create a Data Context...
The get_context() method will return an Ephemeral Data Context if your system is not set up to work with GX Cloud and a Filesystem Data Context cannot be found. For more information, see:
You can also explicitly instantiate an Ephemeral Data Context, which is useful when your system is set up to work with GX Cloud or you already have a previously initialized Filesystem Data Context. For more information, see:
If you aren't certain that your Data Context is Ephemeral...
You can check whether you are working with an Ephemeral Data Context with the following code (in this example, we assume your Data Context is stored in the variable context):
from great_expectations.data_context import EphemeralDataContext
# ...
if isinstance(context, EphemeralDataContext):
    print("It's Ephemeral!")
Steps
1. Verify that your current working directory does not already contain a GX Filesystem Data Context
The method for converting an Ephemeral Data Context to a Filesystem Data Context initializes the new Filesystem Data Context in the current working directory of the Python process that is being executed. If a Filesystem Data Context already exists at that location, the process will fail.
You can determine if your current working directory already has a Filesystem Data Context by looking for a great_expectations.yml file. The presence of that file indicates that a Filesystem Data Context has already been initialized in the corresponding directory.
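The check described above can be scripted. The following is a minimal sketch using only the Python standard library; find_existing_file_context is a hypothetical helper, not part of GX. It also looks inside two subdirectory names (great_expectations and gx), which is an assumption about where a project may keep its configuration file, so adjust the list to match your setup.

```python
from pathlib import Path

CONFIG_FILENAME = "great_expectations.yml"

# Assumption: besides the directory itself, also look in these subdirectories.
# The names are guesses for illustration; edit them to match your project.
CANDIDATE_SUBDIRS = ("", "great_expectations", "gx")


def find_existing_file_context(directory=None):
    """Return the path of an existing great_expectations.yml, or None.

    Hypothetical helper (not part of GX). Defaults to the current
    working directory when no directory is given.
    """
    base = Path(directory) if directory is not None else Path.cwd()
    for subdir in CANDIDATE_SUBDIRS:
        candidate = base / subdir / CONFIG_FILENAME
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_existing_file_context()
    if found is None:
        print("No great_expectations.yml found in the checked locations.")
    else:
        print(f"Found an existing configuration file at {found}")
```

If the helper returns a path, run the conversion from a different working directory.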
2. Convert the Ephemeral Data Context into a Filesystem Data Context
Converting an Ephemeral Data Context into a Filesystem Data Context can be done with one line of code:
context = context.convert_to_file_context()
The convert_to_file_context() method does not change the Ephemeral Data Context itself. Rather, it initializes a new Filesystem Data Context with the contents of the Ephemeral Data Context and then returns an instance of the new Filesystem Data Context. If you do not replace the Ephemeral Data Context instance with the Filesystem Data Context instance, it will be possible for you to continue using the Ephemeral Data Context.
If you do this, it is important to note that changes to the Ephemeral Data Context will not be reflected in the Filesystem Data Context. Moreover, convert_to_file_context() does not support merge operations. This means you will not be able to save any additional changes you have made to the content of the Ephemeral Data Context. Neither will you be able to use convert_to_file_context() to replace the Filesystem Data Context you had previously created: convert_to_file_context() will fail if a Filesystem Data Context already exists in the current working directory.
For these reasons, it is strongly advised that once you have converted your Ephemeral Data Context to a Filesystem Data Context, you stop working with the Ephemeral Data Context instance and work with the Filesystem Data Context instance instead.
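One way to follow this advice is to wrap the conversion in a small guard and rebind the variable to the returned instance. The sketch below is illustrative only: convert_once is a hypothetical helper, not part of GX. It relies only on the convert_to_file_context() method described above, and it assumes that checking for great_expectations.yml directly in the current working directory is a sufficient pre-check.

```python
from pathlib import Path


def convert_once(context):
    """Convert an Ephemeral Data Context unless a config file already exists.

    Hypothetical helper (not part of GX). `context` only needs to expose
    convert_to_file_context(), as described in this guide.
    """
    # Assumption: a great_expectations.yml directly in the current working
    # directory is the sign that a Filesystem Data Context already exists.
    config_path = Path.cwd() / "great_expectations.yml"
    if config_path.is_file():
        raise FileExistsError(
            f"{config_path} already exists; convert_to_file_context() would fail."
        )
    # The method returns a new Filesystem Data Context and leaves the
    # Ephemeral Data Context it was called on unchanged.
    return context.convert_to_file_context()


# Typical use: rebind the same name so that the Ephemeral instance is no
# longer referenced by later code.
#
#     context = convert_once(context)
```

Rebinding the name context, as in the commented usage line, makes it harder to keep modifying the Ephemeral Data Context by accident after the conversion.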
Next steps
Customizing configurations in a Data Context
Configuring credentials
While some source data systems provide their own means of configuring credentials through environment variables, you can also configure GX to populate credentials from either a YAML file or a secret manager. For more information, please see:
Configuring Expectation Stores
Configuring Validation Results Stores
Configuring Metric Stores
Configuring Data Docs
Connecting GX to source data systems
Connecting GX to filesystem source data
Local Filesystems
- How to quickly connect to a single file using Pandas
- How to connect to one or more files using Pandas
- How to connect to one or more files using Spark
Google Cloud Storage
Azure Blob Storage
- How to connect to data on Azure Blob Storage using Pandas
- How to connect to data on Azure Blob Storage using Spark
Amazon Web Services
Connecting GX to in-memory source data
Connecting GX to SQL source data
General SQL Datasources
Specific SQL dialects