Overview and examples
dlt retrieves configuration and secrets from several locations, such as environment variables, dedicated files, or secure vaults. It understands both simple and verbose layouts of configuration sections. You can use one of the built-in credential types for popular external systems. Functions decorated with @dlt.source, @dlt.resource, or @dlt.destination can be configured without writing additional code - dlt will automatically inject missing arguments (like passwords or API keys) when you call them.
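For example, here is a minimal sketch of the injection mechanism (the resource and key names are hypothetical):
import dlt

# "api_url" is read from config providers, "api_key" from secret providers
@dlt.resource
def issues(api_url: str = dlt.config.value, api_key: str = dlt.secrets.value):
    # fetch and yield items from the API here
    yield from []

Calling issues() without arguments lets dlt resolve both values at call time; the sections below explain where those values can live.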
Choose where to store configuration
To define your configuration (including sources, destinations, pipeline and parameters) in a declarative way using YAML files, check out dlt+.
dlt looks for configuration and secrets in various locations (environment variables, TOML files, or secure vaults) through config providers that are queried when your pipeline runs. You can pick a single location or combine them - for example, define the secret api_key in environment variables and api_url in a TOML file. Providers are queried in the following order:
1. Environment Variables: If a value is found in an environment variable, dlt uses it and doesn't check lower-priority providers.
2. secrets.toml and config.toml files: These files store configuration values and secrets. secrets.toml contains sensitive information, while config.toml holds non-sensitive configuration.
3. Vaults: Credentials stored in secure vaults like Google Secrets Manager, Azure Key Vault, or AWS Secrets Manager.
4. Custom Providers added with register_provider: These are custom implementations you can create to use your own configuration formats or perform specialized preprocessing.
5. Default Argument Values: The values specified in the function signature.
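You can observe this precedence directly by resolving a key through dlt.secrets, which goes through the same provider chain as argument injection (the key below is hypothetical):
import os
import dlt

# The environment variables provider is queried first, so this value wins
# even if the same key is also present in .dlt/secrets.toml
os.environ["SOURCES__NOTION__API_KEY"] = "value-from-env"
print(dlt.secrets["sources.notion.api_key"])  # -> "value-from-env"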
Make sure your pipeline name contains only alphanumeric characters, hyphens (-), and underscores (_). Avoid whitespace and other punctuation to ensure compatibility with all configuration providers.
Select a configuration layout
You can define configuration in different ways depending on your project's complexity. For a simple pipeline with a single source and destination, your configuration can be straightforward:
Simplest source configuration:
- TOML config provider
- Environment variables
api_key="some_value"
export API_KEY="some_value"
For a destination, you typically need to configure credentials, which group multiple related keys together. dlt places these under a credentials section.
- TOML config provider
- Environment variables
[credentials]
user="dlthub"
password="some_value"
export CREDENTIALS__USER="dlthub"
export CREDENTIALS__PASSWORD="some_value"
Recommended section layout
When using multiple sources with potentially conflicting argument names, or multiple destinations where you want separate credentials, you can organize your config keys with sections. Here's the recommended section layout, which is used most often in this documentation and is also generated by the dlt init command.
- Use the sources and destination top-level sections to separate their configurations
- Use the Python module name where the source function is defined to separate the configuration of sources defined in different modules
- Use the destination type to separate destinations
Source:
- TOML config provider
- Environment variables
# source defined in notion.py
[sources.notion]
api_key="some_value"
export SOURCES__NOTION__API_KEY="some_value"
Destination:
- TOML config provider
- Environment variables
# use postgres destination
[destination.postgres.credentials]
user="dlthub"
password="some_value"
export DESTINATION__POSTGRES__CREDENTIALS__USER="dlthub"
export DESTINATION__POSTGRES__CREDENTIALS__PASSWORD="some_value"
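With this layout in place, the destination can be selected by name and dlt resolves the credentials from the destination.postgres.credentials section. A minimal sketch (pipeline and dataset names are arbitrary):
import dlt

# dlt injects [destination.postgres.credentials] when the pipeline runs
pipeline = dlt.pipeline(
    pipeline_name="my_pipeline",
    destination="postgres",
    dataset_name="my_data",
)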
Refer to the Add credentials guide for more examples and tips on how to configure particular sources and destinations.
How dlt looks for values
dlt starts looking for a particular value with all possible sections present; if the value is not found, it eliminates the rightmost section and tries again.
For example, if the source function is defined in the module notion.py:
# module: notion.py
@dlt.source
def notion_databases(api_key: str = dlt.secrets.value):
    pass
dlt
will search for the following keys in this order:
sources.notion.notion_databases.api_key
sources.notion.api_key
sources.api_key
api_key
The same applies to destination credentials. In that case, the credentials section is considered a required grouping and won't be eliminated:
destination.postgres.credentials.password
destination.credentials.password
credentials.password
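Outside of injection, the same keys can be read explicitly through the dlt.secrets accessor, which queries the same providers (a sketch using the keys from the examples above):
import dlt

# Explicit lookups use the full key path, including the credentials grouping
api_key = dlt.secrets["sources.notion.api_key"]
password = dlt.secrets["destination.postgres.credentials.password"]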
For more detailed information about configuration organization, see configuration and secrets structure.
You can use the pipeline name to create separate configurations for each pipeline in your project. Configuration values are searched first with the pipeline name prefix, then without it:
[pipeline_name_1.sources.google_sheets.credentials]
client_email = "<client_email_1>"
private_key = "<private_key_1>"
project_id = "<project_id_1>"
[pipeline_name_2.sources.google_sheets.credentials]
client_email = "<client_email_2>"
private_key = "<private_key_2>"
project_id = "<project_id_2>"
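The prefix corresponds to the pipeline_name you pass when creating the pipeline, so each of the configurations above is picked up by its own pipeline. A minimal sketch (the destination choice is arbitrary):
import dlt

# Each pipeline first resolves keys under its own name prefix,
# e.g. [pipeline_name_1.sources.google_sheets.credentials]
pipeline_1 = dlt.pipeline(pipeline_name="pipeline_name_1", destination="bigquery")
pipeline_2 = dlt.pipeline(pipeline_name="pipeline_name_2", destination="bigquery")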
Use built-in credential types
Credentials are groups of configs and secrets that are defined together in order to access external systems.
dlt implements several built-in credential types to access AWS, Azure, Google Cloud, and other common systems.
Some of the credential types give you options for how you specify them:
For example, to connect to a sql_database
source, you can either use a connection string:
[sources.sql_database]
credentials="snowflake://user:password@service-account/database?warehouse=warehouse_name&role=role"
Or set up the connection parameters separately:
[sources.sql_database.credentials]
drivername="snowflake"
username="user"
password="password"
database="database"
host="service-account"
warehouse="warehouse_name"
role="role"
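Both forms can also be provided in code through dlt.secrets; a sketch setting the connection string variant:
import dlt

# Equivalent to the [sources.sql_database] credentials entry above
dlt.secrets["sources.sql_database.credentials"] = (
    "snowflake://user:password@service-account/database"
    "?warehouse=warehouse_name&role=role"
)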
dlt can discover the default credentials of all major cloud providers: it is able to use what is already present in the runtime environment. For example, when running in Colab or on a Google Cloud VM, it has access to cloud credentials and, if nothing is specified in the configuration, will use them instead.
Environment variables
Environment variables provide a convenient way to specify configuration and secrets, especially in deployment environments. When using environment variables, names are capitalized and sections are separated with double underscores (__
).
For example, to set the Facebook Ads access token:
export SOURCES__FACEBOOK_ADS__ACCESS_TOKEN="<access_token>"
See the examples section for more details on setting up credentials with environment variables.
For local development, you can use python-dotenv to automatically load variables from an .env
file, making credential management easier and more secure.
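A minimal sketch, assuming python-dotenv is installed and a .env file with entries like SOURCES__NOTION__API_KEY sits next to your script:
import dlt
from dotenv import load_dotenv

# Load variables from .env into the process environment so the
# environment variables provider can pick them up
load_dotenv()

pipeline = dlt.pipeline(pipeline_name="notion", destination="duckdb")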
The environment variables provider can also retrieve secret values from /run/secrets/<secret-name> to work seamlessly with Kubernetes/Docker secrets.
For these secrets, dlt uses an alternative name format with lowercase letters and dashes (-) as separators, with underscores converted to dashes. For example, for the access token above, the file sources--facebook-ads--access-token would be checked.
Only values marked as secrets (with dlt.secrets.value
or using types like TSecretStrValue
) are checked this way. Remember to name your secrets appropriately in Kubernetes resources or Docker Compose files.
Vaults
dlt
may read configuration from secure vaults - specialized services for storing credentials.
- For Google Cloud Secrets Manager, see our example walkthrough.
- For other vault integrations like AWS Secrets Manager or Azure Key Vault, contact our sales team to learn about our secure building blocks for data platform teams.
secrets.toml and config.toml
The TOML configuration provider uses two separate files:
config.toml:
- Contains non-sensitive configuration data that defines pipeline behavior
- Includes settings like file paths, database hosts, timeouts, API URLs, and performance options
- Values are accessible in code through the dlt.config dictionary
- Can be safely committed to version control
secrets.toml:
- Contains sensitive information that must be kept confidential
- Includes credentials like passwords, API keys, and private keys
- Values are accessible in code through the dlt.secrets dictionary
- Should never be committed to version control
By default, the .gitignore
file in your project prevents secrets.toml
from being added to version control, while config.toml
can be freely included.
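In code, these two dictionaries are available as dlt.config and dlt.secrets (a sketch with hypothetical keys):
import dlt

# Non-sensitive values, typically kept in config.toml
bucket_url = dlt.config["destination.filesystem.bucket_url"]
# Sensitive values, typically kept in secrets.toml
api_key = dlt.secrets["sources.notion.api_key"]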
File locations
The TOML provider loads files from the .dlt
folder relative to your current working directory.
For example, if your working directory is my_dlt_project
with this structure:
my_dlt_project:
  |
  pipelines/
    |---- .dlt/secrets.toml
    |---- google_sheets.py
When you run:
python pipelines/google_sheets.py
dlt
will look for secrets in my_dlt_project/.dlt/secrets.toml
and ignore my_dlt_project/pipelines/.dlt/secrets.toml
.
If you change your working directory to pipelines
and run:
python google_sheets.py
dlt
will look for my_dlt_project/pipelines/.dlt/secrets.toml
instead.
Special locations
The TOML provider also reads configuration from special locations depending on your runtime environment:
- Home directory: If available, dlt checks ~/.dlt/ for config.toml and secrets.toml. These values are merged with project-specific configurations, with project values taking precedence. This is useful for sharing global settings (like telemetry preferences) across all pipelines on a machine.
- Google Colab: When running in Colab, you can use Colab Secrets named secrets.toml and config.toml. The provider reads these as if they were TOML files. This functionality is disabled if files exist in the .dlt folder.
- Streamlit: When running in Streamlit without a local .dlt/secrets.toml, the provider uses Streamlit secrets. You can add dlt secrets directly to your Streamlit secrets.
Custom providers
You can create and register your own configuration providers to customize how dlt
accesses configuration values. The simplest approach is to write a function that returns a nested dictionary where keys correspond to sections and argument names.
This example demonstrates how to create a custom provider that loads configuration from a JSON file:
import dlt
from dlt.common import json
from dlt.common.configuration.providers import CustomLoaderDocProvider
# Create a function that loads a dictionary
def load_config():
    with open("config.json", "rb") as f:
        return json.load(f)

# Create the custom provider
provider = CustomLoaderDocProvider(
    "my_json_provider",
    load_config,
    supports_secrets=False
)
# Register the provider with dlt
dlt.config.register_provider(provider)
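Once registered, keys served by the custom provider resolve like those from any other provider; for example, with a hypothetical {"runtime": {"log_level": "INFO"}} entry in config.json:
# The custom provider is queried alongside the built-in ones
log_level = dlt.config["runtime.log_level"]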
Check out our example YAML provider that supports switchable configuration profiles.
Examples
Configure both config and secrets
This example uses the Notion source and filesystem destination to demonstrate how to organize configuration in TOML files using the recommended section layout.
The Notion source is defined in a file named notion.py
, so we use that module name in the configuration. We configure the api_key
in our configuration while passing the list of database IDs explicitly in code. For the filesystem destination, we split configuration between config.toml
(for bucket_url
) and secrets.toml
(for AWS credentials).
import dlt

@dlt.source
def notion_databases(
    database_ids=None,
    api_key: str = dlt.secrets.value,  # mark argument to be injected as secret
):
    ...

# Pass database_ids in code, let `dlt` inject api_key
sales_database = notion_databases(  # type: ignore
    database_ids=[
        {
            "id": "a94223535c674d33a24e313e7921ce15",
            "use_name": "sales_alias",
        }
    ]
)
- TOML config provider
- Environment variables
- In the code
config.toml
[runtime]
log_level="INFO"
# Do not compress files sent to the filesystem bucket
[normalize.data_writer]
disable_compression=true
# Recommended sections for the destination (destination.module)
[destination.filesystem]
bucket_url = "s3://[your_bucket_name]"
secrets.toml
# Recommended sections for sources (sources.module)
[sources.notion]
api_key = "your-notion-api-key" # Will be injected to api_key argument
# Recommended sections for destination credentials
[destination.filesystem.credentials]
aws_access_key_id = "ABCDEFGHIJKLMNOPQRST"
aws_secret_access_key = "1234567890_access_key"
# Environment variables for both config and secrets follow the same format
export RUNTIME__LOG_LEVEL="INFO"
export DESTINATION__FILESYSTEM__BUCKET_URL="s3://[your_bucket_name]"
export NORMALIZE__DATA_WRITER__DISABLE_COMPRESSION="true"
export SOURCES__NOTION__API_KEY="your-notion-api-key"
export DESTINATION__FILESYSTEM__CREDENTIALS__AWS_ACCESS_KEY_ID="ABCDEFGHIJKLMNOPQRST"
export DESTINATION__FILESYSTEM__CREDENTIALS__AWS_SECRET_ACCESS_KEY="1234567890_access_key"
import os
import dlt
import botocore.session
from dlt.sources.credentials import AwsCredentials
# Set configuration values directly in code
# Via environment variables
os.environ["RUNTIME__LOG_LEVEL"] = "INFO"
os.environ["DESTINATION__FILESYSTEM__BUCKET_URL"] = "s3://[your_bucket_name]"
os.environ["NORMALIZE__DATA_WRITER__DISABLE_COMPRESSION"] = "true"
# Or directly through dlt.config
dlt.config["runtime.log_level"] = "INFO"
dlt.config["destination.filesystem.bucket_url"] = "s3://[your_bucket_name]"
dlt.config["normalize.data_writer.disable_compression"] = "true"
# For secrets, avoid hardcoding - use existing environment variables
os.environ["SOURCES__NOTION__API_KEY"] = os.environ.get("NOTION_KEY")
# Or use credentials from third-party providers
credentials = AwsCredentials()
session = botocore.session.get_session()
credentials.parse_native_representation(session)
dlt.secrets["destination.filesystem.credentials"] = credentials
While you can put all configuration and credentials in secrets.toml
for convenience, sensitive information should never be placed in config.toml
or other non-secure locations. dlt
will raise an exception if it detects secrets in inappropriate locations.
Use different Google credentials for source and destination
This example shows how to configure different credentials for Google-based sources and destinations:
Option 1: Share credentials between source and destination
If you want both the BigQuery destination and Google Sheets source to use the same credentials:
- TOML config provider
- Environment variables
- In the code
[credentials]
client_email = "<client_email_for_both>"
private_key = "<private_key_for_both>"
project_id = "<project_id_for_both>"
export CREDENTIALS__CLIENT_EMAIL="<client_email_for_both>"
export CREDENTIALS__PRIVATE_KEY="<private_key_for_both>"
export CREDENTIALS__PROJECT_ID="<project_id_for_both>"
import os
# Avoid setting secrets directly in code
# Instead, use existing environment variables
os.environ["CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("GOOGLE_CLIENT_EMAIL")
os.environ["CREDENTIALS__PRIVATE_KEY"] = os.environ.get("GOOGLE_PRIVATE_KEY")
os.environ["CREDENTIALS__PROJECT_ID"] = os.environ.get("GOOGLE_PROJECT_ID")
Option 2: Use separate credentials for sources and destinations
To keep source and destination credentials separate:
- TOML config provider
- Environment variables
- In the code
# Google Sheets credentials
[sources.credentials]
client_email = "<sheets_client_email>"
private_key = "<sheets_private_key>"
project_id = "<sheets_project_id>"
# BigQuery credentials
[destination.credentials]
client_email = "<bigquery_client_email>"
private_key = "<bigquery_private_key>"
project_id = "<bigquery_project_id>"
# Google Sheets credentials
export SOURCES__CREDENTIALS__CLIENT_EMAIL="<sheets_client_email>"
export SOURCES__CREDENTIALS__PRIVATE_KEY="<sheets_private_key>"
export SOURCES__CREDENTIALS__PROJECT_ID="<sheets_project_id>"
# BigQuery credentials
export DESTINATION__CREDENTIALS__CLIENT_EMAIL="<bigquery_client_email>"
export DESTINATION__CREDENTIALS__PRIVATE_KEY="<bigquery_private_key>"
export DESTINATION__CREDENTIALS__PROJECT_ID="<bigquery_project_id>"
import dlt
import os
# For destination credentials, use existing environment variables
os.environ["DESTINATION__CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("BIGQUERY_CLIENT_EMAIL")
os.environ["DESTINATION__CREDENTIALS__PRIVATE_KEY"] = os.environ.get("BIGQUERY_PRIVATE_KEY")
os.environ["DESTINATION__CREDENTIALS__PROJECT_ID"] = os.environ.get("BIGQUERY_PROJECT_ID")
# For source credentials, set values in dlt.secrets
dlt.secrets["sources.credentials.client_email"] = os.environ.get("SHEETS_CLIENT_EMAIL")
dlt.secrets["sources.credentials.private_key"] = os.environ.get("SHEETS_PRIVATE_KEY")
dlt.secrets["sources.credentials.project_id"] = os.environ.get("SHEETS_PROJECT_ID")
With this setup, dlt
looks for destination credentials in this order:
destination.bigquery.credentials --> Not found
destination.credentials --> Found
And for source credentials:
sources.google_sheets_module.google_sheets_function.credentials --> Not found
sources.google_sheets_function.credentials --> Not found
sources.credentials --> Found
Configure credentials for multiple sources and destinations
When working with multiple Google-based sources and destinations, you can use the recommended section layout:
- TOML config provider
- Environment variables
- In the code
# Google Sheets credentials
[sources.google_sheets.credentials]
client_email = "<sheets_client_email>"
private_key = "<sheets_private_key>"
project_id = "<sheets_project_id>"
# Google Analytics credentials
[sources.google_analytics.credentials]
client_email = "<analytics_client_email>"
private_key = "<analytics_private_key>"
project_id = "<analytics_project_id>"
# BigQuery credentials
[destination.bigquery.credentials]
client_email = "<bigquery_client_email>"
private_key = "<bigquery_private_key>"
project_id = "<bigquery_project_id>"
# Google Sheets credentials
export SOURCES__GOOGLE_SHEETS__CREDENTIALS__CLIENT_EMAIL="<sheets_client_email>"
export SOURCES__GOOGLE_SHEETS__CREDENTIALS__PRIVATE_KEY="<sheets_private_key>"
export SOURCES__GOOGLE_SHEETS__CREDENTIALS__PROJECT_ID="<sheets_project_id>"
# Google Analytics credentials
export SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__CLIENT_EMAIL="<analytics_client_email>"
export SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PRIVATE_KEY="<analytics_private_key>"
export SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PROJECT_ID="<analytics_project_id>"
# BigQuery credentials
export DESTINATION__BIGQUERY__CREDENTIALS__CLIENT_EMAIL="<bigquery_client_email>"
export DESTINATION__BIGQUERY__CREDENTIALS__PRIVATE_KEY="<bigquery_private_key>"
export DESTINATION__BIGQUERY__CREDENTIALS__PROJECT_ID="<bigquery_project_id>"
import os
import dlt
# For Analytics credentials
os.environ["SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("ANALYTICS_CLIENT_EMAIL")
os.environ["SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PRIVATE_KEY"] = os.environ.get("ANALYTICS_PRIVATE_KEY")
os.environ["SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PROJECT_ID"] = os.environ.get("ANALYTICS_PROJECT_ID")
# For BigQuery credentials
os.environ["DESTINATION__BIGQUERY__CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("BIGQUERY_CLIENT_EMAIL")
os.environ["DESTINATION__BIGQUERY__CREDENTIALS__PRIVATE_KEY"] = os.environ.get("BIGQUERY_PRIVATE_KEY")
os.environ["DESTINATION__BIGQUERY__CREDENTIALS__PROJECT_ID"] = os.environ.get("BIGQUERY_PROJECT_ID")
# For Google Sheets credentials
dlt.secrets["sources.google_sheets.credentials.client_email"] = os.environ.get("SHEETS_CLIENT_EMAIL")
dlt.secrets["sources.google_sheets.credentials.private_key"] = os.environ.get("SHEETS_PRIVATE_KEY")
dlt.secrets["sources.google_sheets.credentials.project_id"] = os.environ.get("SHEETS_PROJECT_ID")
Configure multiple instances of the same source
If you need to extract data from the same source type with different configurations, you can run them in pipelines with different names:
- TOML config provider
- Environment variables
- In the code
[pipeline_name_1.sources.sql_database]
credentials="snowflake://user1:password1@service-account/database1?warehouse=warehouse_name&role=role1"
[pipeline_name_2.sources.sql_database]
credentials="snowflake://user2:password2@service-account/database2?warehouse=warehouse_name&role=role2"
export PIPELINE_NAME_1__SOURCES__SQL_DATABASE__CREDENTIALS="snowflake://user1:password1@service-account/database1?warehouse=warehouse_name&role=role1"
export PIPELINE_NAME_2__SOURCES__SQL_DATABASE__CREDENTIALS="snowflake://user2:password2@service-account/database2?warehouse=warehouse_name&role=role2"
import os
import dlt
# Use existing environment variables to set credentials
os.environ["PIPELINE_NAME_1__SOURCES__SQL_DATABASE__CREDENTIALS"] = os.environ.get("SQL_CREDENTIAL_STRING_1")
# Or set values directly in dlt.secrets
dlt.secrets["pipeline_name_2.sources.sql_database.credentials"] = os.environ.get("SQL_CREDENTIAL_STRING_2")
You have additional options for using multiple instances of the same source:
- Use the clone() method as explained in the sql_database documentation.
- Create named destinations to use the same destination type with different configurations.
Troubleshoot configuration errors
If dlt
can't find a required configuration value or secret, it raises a ConfigFieldMissingException
that provides detailed information about what was searched for and where.
For example, running the chess.py
example without providing the password:
$ CREDENTIALS="postgres://loader@localhost:5432/dlt_data" python chess.py
...
dlt.common.configuration.exceptions.ConfigFieldMissingException: Following fields are missing: ['password'] in configuration with spec PostgresCredentials
for field "password" config providers and keys were tried in the following order:
In Environment Variables key CHESS_GAMES__DESTINATION__POSTGRES__CREDENTIALS__PASSWORD was not found.
In Environment Variables key CHESS_GAMES__DESTINATION__CREDENTIALS__PASSWORD was not found.
In Environment Variables key CHESS_GAMES__CREDENTIALS__PASSWORD was not found.
In secrets.toml key chess_games.destination.postgres.credentials.password was not found.
In secrets.toml key chess_games.destination.credentials.password was not found.
In secrets.toml key chess_games.credentials.password was not found.
In Environment Variables key DESTINATION__POSTGRES__CREDENTIALS__PASSWORD was not found.
In Environment Variables key DESTINATION__CREDENTIALS__PASSWORD was not found.
In Environment Variables key CREDENTIALS__PASSWORD was not found.
In secrets.toml key destination.postgres.credentials.password was not found.
In secrets.toml key destination.credentials.password was not found.
In secrets.toml key credentials.password was not found.
Please refer to https://dlthub.com/docs/general-usage/credentials/ for more information
This error message shows exactly:
- Which field is missing (password in this case)
- All the keys and locations dlt checked, in order of priority
- That it first looked with the pipeline name (chess_games) prefix, then without it
- That it searched environment variables first, then secrets.toml
Note that config.toml
wasn't checked since it's not appropriate for storing secrets.
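Any key from the trace satisfies the lookup; for a single destination, the least specific one is usually enough. A sketch that fixes the example above by setting the password in the process environment before running the pipeline:
import os

# Corresponds to the CREDENTIALS__PASSWORD entry in the trace above
os.environ["CREDENTIALS__PASSWORD"] = "<password for the loader user>"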