Deploy with Modal
Introduction to Modal
Modal is a serverless platform designed for developers. It allows you to run and deploy code in the cloud without managing infrastructure.
With Modal, you can perform tasks like running generative models, large-scale batch jobs, and job queues, all while easily scaling compute resources.
Modal features
- Serverless Compute: No infrastructure management; scales automatically from zero to thousands of CPUs/GPUs.
- Cloud Functions: Run Python code in the cloud instantly and scale horizontally.
- GPU/CPU Scaling: Easily attach GPUs for heavy tasks like AI model training with a single line of code.
- Web Endpoints: Expose any function as an HTTPS API endpoint quickly.
- Scheduled Jobs: Convert Python functions into scheduled tasks effortlessly.
To learn more, please refer to Modal's documentation.
How to run dlt on Modal
The following dlt project setup copies data from a public MySQL database into DuckDB:
Step 1: Initialize source
Run the dlt init CLI command to initialize the SQL database source and set up the sql_database_pipeline.py template.
dlt init sql_database duckdb
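After running the command, it can help to confirm that the scaffold is in place before editing anything. The helper below is a minimal sketch of such a check. The file list is an assumption: apart from sql_database_pipeline.py, which the step above names, it reflects what `dlt init` typically generates and may differ between dlt versions. The function name `missing_scaffold_files` is hypothetical.

```python
from pathlib import Path

# Assumed scaffold: only sql_database_pipeline.py is named in this guide;
# the other entries are what `dlt init` typically creates.
EXPECTED_FILES = (
    "sql_database_pipeline.py",
    ".dlt/config.toml",
    ".dlt/secrets.toml",
    "requirements.txt",
)


def missing_scaffold_files(project_dir: str = ".") -> list[str]:
    """Return the expected scaffold files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_scaffold_files()
    print("missing files:", ", ".join(missing) if missing else "none")
```

An empty result means every file in the assumed list was found in the project directory.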
Step 2: Define Modal Image
Open sql_database_pipeline.py and define the Modal Image in which dlt will run:
import modal

# Define the Modal Image
image = modal.Image.debian_slim().pip_install(
    "dlt>=1.1.0",
    "dlt[duckdb]",  # destination
    "dlt[sql_database]",  # source (MySQL)
    "dlt[parquet]",  # file format dependency
    "pymysql",  # database driver for MySQL source
)

app = modal.App("example-dlt", image=image)

# Modal Volume used to store the duckdb database file
vol = modal.Volume.from_name("duckdb-vol", create_if_missing=True)