How to configure credentials
This guide will explain how to configure your
great_expectations.yml
project config to
populate credentials from either a YAML file or a
secret manager.
If your Great Expectations deployment is in an environment without a file system, refer to How to instantiate a Data Context without a yml file for credential configuration examples.
- YAML
- Secret Manager
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- Have a working installation of Great Expectations
Steps
1. Save credentials and config
Decide where you would like to save the desired credentials or config values - in a YAML file, environment variables, or a combination - then save the values.
In most cases, we suggest using a config variables YAML file. YAML files make variables more visible, easily editable, and allow for modularization (e.g. one file for dev, another for prod).
-
In the
great_expectations.yml
config file, environment variables take precedence over variables defined in a config variables YAML -
Environment variable substitution is
supported in both the
great_expectations.yml
and config variablesconfig_variables.yml
config file.
If using a YAML file, save desired credentials
or config values to
great_expectations/uncommitted/config_variables.yml
or another YAML file of your choosing:
my_postgres_db_yaml_creds:
drivername: postgresql
host: localhost
port: 5432
username: postgres
password: ${MY_DB_PW}
database: postgres
-
If you wish to store values that include
the dollar sign character
$
, please escape them using a backslash\
so substitution is not attempted. For example in the above example for Postgres credentials you could setpassword: pa\$sword
if your password ispa$sword
. Say that 5 times fast, and also please choose a more secure password! -
When you save values via the
CLICommand Line Interface, they are automatically escaped if they
contain the
$
character. -
You can also have multiple substitutions
for the same item, e.g.
database_string: ${USER}:${PASSWORD}@${HOST}:${PORT}/${DATABASE}
If using environment variables, set values by
entering
export ENV_VAR_NAME=env_var_value
in the terminal or adding the commands to your
~/.bashrc
file:
export POSTGRES_DRIVERNAME=postgresql
export POSTGRES_HOST=localhost
export POSTGRES_PORT=5432
export POSTGRES_USERNAME=postgres
export POSTGRES_PW=
export POSTGRES_DB=postgres
export MY_DB_PW=password
2. Set config_variables_file_path
If using a YAML file, set the
config_variables_file_path
key in
your great_expectations.yml
or
leave the default.
config_variables_file_path: uncommitted/config_variables.yml
3. Replace credentials with placeholders
Replace credentials or other values in your
great_expectations.yml
with
${}
-wrapped variable names (i.e.
${ENVIRONMENT_VARIABLE}
or
${YAML_KEY}
).
datasources:
my_postgres_db:
class_name: Datasource
module_name: great_expectations.datasource
execution_engine:
module_name: great_expectations.execution_engine
class_name: SqlAlchemyExecutionEngine
credentials: ${my_postgres_db_yaml_creds}
data_connectors:
default_inferred_data_connector_name:
class_name: InferredAssetSqlDataConnector
my_other_postgres_db:
class_name: Datasource
module_name: great_expectations.datasource
execution_engine:
module_name: great_expectations.execution_engine
class_name: SqlAlchemyExecutionEngine
credentials:
drivername: ${POSTGRES_DRIVERNAME}
host: ${POSTGRES_HOST}
port: ${POSTGRES_PORT}
username: ${POSTGRES_USERNAME}
password: ${POSTGRES_PW}
database: ${POSTGRES_DB}
data_connectors:
default_inferred_data_connector_name:
class_name: InferredAssetSqlDataConnector
Additional Notes
-
The default
config_variables.yml
file located atgreat_expectations/uncommitted/config_variables.yml
applies to deployments created usinggreat_expectations init
. - To view the full script used in this page, see it on GitHub: how_to_configure_credentials.py
Choose which secret manager you are using:
- AWS Secrets Manager
- GCP Secret Manager
- Azure Key Vault
This guide will explain how to configure
your
great_expectations.yml
project config to substitute variables
from AWS Secrets Manager.
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- Have a working installation of Great Expectations
- Configured a secret manager and secrets in the cloud with AWS Secrets Manager
Secrets store substitution uses the
configurations from your
great_expectations.yml
project config
after all other types
of substitution are applied (from
environment variables or from the
config_variables.yml
config file)
The secrets store substitution works based on keywords. It tries to retrieve secrets from the secrets store for the following values :
-
AWS: values starting with
secret|arn:aws:secretsmanager
if the values you provide don't match with the keywords above, the values won't be substituted.
Setup
To use AWS Secrets Manager, you may need
to install the
great_expectations
package
with its aws_secrets
extra
requirement:
pip install great_expectations[aws_secrets]
In order to substitute your value by a
secret in AWS Secrets Manager, you need to
provide an arn of the secret like this
one:
secret|arn:aws:secretsmanager:123456789012:secret:my_secret-1zAyu6
The last 7 characters of the arn are
automatically generated by AWS and are
not mandatory to retrieve the secret,
thus
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret
will retrieve the same secret.
You will get the latest version of the secret by default.
You can get a specific version of the
secret you want to retrieve by specifying
its version UUID like this:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000
If your secret value is a JSON string, you
can retrieve a specific value like this:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret|key
Or like this:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000|key
Example great_expectations.yml:
datasources:
dev_postgres_db:
class_name: SqlAlchemyDatasource
data_asset_type:
class_name: SqlAlchemyDataset
module_name: great_expectations.dataset
module_name: great_expectations.datasource
credentials:
drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|drivername
host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|host
port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|port
username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|username
password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|password
database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|database
prod_postgres_db:
class_name: SqlAlchemyDatasource
data_asset_type:
class_name: SqlAlchemyDataset
module_name: great_expectations.dataset
module_name: great_expectations.datasource
credentials:
drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_DRIVERNAME
host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_HOST
port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_PORT
username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_USERNAME
password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_PASSWORD
database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_DATABASE
This guide will explain how to configure
your
great_expectations.yml
project config to substitute variables
from GCP Secrets Manager.
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- Have a working installation of Great Expectations
- Configured a secret manager and secrets in the cloud with GCP Secret Manager
Secrets store substitution uses the
configurations from your
great_expectations.yml
project config
after all other types
of substitution are applied (from
environment variables or from the
config_variables.yml
config file)
The secrets store substitution works based on keywords. It tries to retrieve secrets from the secrets store for the following values :
-
GCP: values matching the following
regex
^secret\|projects\/[a-z0-9\_\-]{6,30}\/secrets
if the values you provide don't match with the keywords above, the values won't be substituted.
Setup
To use GCP Secret Manager, you may need to
install the
great_expectations
package
with its gcp
extra
requirement:
pip install great_expectations[gcp]
In order to substitute your value by a
secret in GCP Secret Manager, you need to
provide a name of the secret like this
one:
secret|projects/project_id/secrets/my_secret
You will get the latest version of the secret by default.
You can get a specific version of the
secret you want to retrieve by specifying
its version id like this:
secret|projects/project_id/secrets/my_secret/versions/1
If your secret value is a JSON string, you
can retrieve a specific value like this:
secret|projects/project_id/secrets/my_secret|key
Or like this:
secret|projects/project_id/secrets/my_secret/versions/1|key
Example great_expectations.yml:
datasources:
dev_postgres_db:
class_name: SqlAlchemyDatasource
data_asset_type:
class_name: SqlAlchemyDataset
module_name: great_expectations.dataset
module_name: great_expectations.datasource
credentials:
drivername: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|drivername
host: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|host
port: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|port
username: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|username
password: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|password
database: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|database
prod_postgres_db:
class_name: SqlAlchemyDatasource
data_asset_type:
class_name: SqlAlchemyDataset
module_name: great_expectations.dataset
module_name: great_expectations.datasource
credentials:
drivername: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DRIVERNAME
host: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_HOST
port: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PORT
username: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_USERNAME
password: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PASSWORD
database: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DATABASE
This guide will explain how to configure
your
great_expectations.yml
project config to substitute variables
from Azure Key Vault.
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- Have a working installation of Great Expectations
- Set up a working deployment of Great Expectations
- Configured a secret manager and secrets in the cloud with Azure Key Vault
Secrets store substitution uses the
configurations from your
great_expectations.yml
project config
after all other types
of substitution are applied (from
environment variables or from the
config_variables.yml
config file)
The secrets store substitution works based on keywords. It tries to retrieve secrets from the secrets store for the following values :
-
Azure : values matching the
following regex
^secret\|https:\/\/[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net
if the values you provide don't match with the keywords above, the values won't be substituted.
Setup
To use Azure Key Vault, you may need to
install the
great_expectations
package
with its azure_secrets
extra
requirement:
pip install great_expectations[azure_secrets]
In order to substitute your value by a
secret in Azure Key Vault, you need to
provide a name of the secret like this
one:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret
You will get the latest version of the secret by default.
You can get a specific version of the
secret you want to retrieve by specifying
its version id (32 lowercase alphanumeric
characters) like this:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab
If your secret value is a JSON string, you
can retrieve a specific value like this:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret|key
Or like this:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab|key
Example great_expectations.yml:
datasources:
dev_postgres_db:
class_name: SqlAlchemyDatasource
data_asset_type:
class_name: SqlAlchemyDataset
module_name: great_expectations.dataset
module_name: great_expectations.datasource
credentials:
drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|drivername
host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|host
port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|port
username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|username
password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|password
database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|database
prod_postgres_db:
class_name: SqlAlchemyDatasource
data_asset_type:
class_name: SqlAlchemyDataset
module_name: great_expectations.dataset
module_name: great_expectations.datasource
credentials:
drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_DRIVERNAME
host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_HOST
port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_PORT
username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_USERNAME
password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_PASSWORD
database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_DATABASE