Introduction
What is dlt?
dlt is an open-source library that you can add to your Python scripts to load data from various and often messy data sources into well-structured, live datasets. Install it with:
pip install dlt
There's no need to start any backends or containers. Import dlt in your Python script and write a simple pipeline like the one below:
import dlt
from dlt.sources.helpers import requests

# Create a dlt pipeline that will load
# chess player data to the DuckDB destination
pipeline = dlt.pipeline(
    pipeline_name='chess_pipeline',
    destination='duckdb',
    dataset_name='player_data'
)

# Grab some player data from Chess.com API
data = []
for player in ['magnuscarlsen', 'rpragchess']:
    response = requests.get(f'https://api.chess.com/pub/player/{player}')
    response.raise_for_status()
    data.append(response.json())

# Extract, normalize, and load the data
load_info = pipeline.run(data, table_name='player')
Now copy this snippet to a file or a notebook cell and run it. If you do not have it yet, install the duckdb dependency (the default dlt installation is minimal):
pip install "dlt[duckdb]"
How does the script work? It extracts data from a source (here, the chess.com REST API), inspects its structure to create a schema, structures, normalizes and verifies the data, and then loads it into a destination (here, duckdb, into a database schema named player_data and a table named player).
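To make the "inspect the structure, create a schema, normalize" step more concrete, here is a minimal standard-library sketch of the general idea. It is not dlt's implementation: the helpers `flatten` and `infer_schema` are hypothetical names introduced for illustration, the sample records are made up rather than real API output, and the `__` separator for nested keys is an assumption chosen for this example.

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with '__'."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}__{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def infer_schema(rows: list[dict[str, Any]]) -> dict[str, str]:
    """Map each column to the Python type name of its first non-null value."""
    schema: dict[str, str] = {}
    for row in rows:
        for column, value in row.items():
            if value is not None and column not in schema:
                schema[column] = type(value).__name__
    return schema


# Hypothetical sample records, loosely shaped like player profiles.
raw = [
    {"username": "player_one", "followers": 10, "stats": {"wins": 3, "losses": 1}},
    {"username": "player_two", "followers": 25, "stats": {"wins": 7, "losses": 2}},
]

rows = [flatten(r) for r in raw]   # normalize: one flat row per record
schema = infer_schema(rows)        # inspect: column name -> type name
print(schema)
```

A real loader also has to handle lists, conflicting types, and destination-specific type names; the sketch only shows how a table layout can be derived from the data itself instead of being declared up front.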
Why use dlt?
- Automated maintenance - schema inference, schema evolution and alerts, combined with short declarative code, keep maintenance simple.
- Run it where Python runs - on Airflow, serverless functions, or notebooks. It needs no external APIs, backends or containers, and scales on micro and large infrastructure alike.
- User-friendly, declarative interface that removes knowledge obstacles for beginners while empowering senior professionals.
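As a rough illustration of what "schema evolution" in the first point means, the sketch below extends a known schema when a later record carries a field that was not seen before. This is a simplified stand-in, not dlt's actual mechanism: `evolve_schema` is a hypothetical helper and the records are invented for the example.

```python
def evolve_schema(schema: dict[str, str], row: dict[str, object]) -> list[str]:
    """Add columns present in `row` but missing from `schema`.

    Returns the names of the newly added columns so a caller could
    log them or raise an alert.
    """
    added = []
    for column, value in row.items():
        if column not in schema and value is not None:
            schema[column] = type(value).__name__
            added.append(column)
    return added


# Schema known from earlier loads (hypothetical).
schema = {"username": "str", "followers": "int"}

# A later record introduces a new field, "title".
new_columns = evolve_schema(
    schema, {"username": "player_three", "followers": 4, "title": "GM"}
)

# A record with only known fields leaves the schema untouched.
unchanged = evolve_schema(schema, {"username": "player_four", "followers": 9})

print(new_columns, schema)
```

Returning the list of added columns, rather than silently mutating the schema, is what would let a pipeline notify its owner about a change in the source data.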
Getting started with dlt
- Play with the Google Colab demo. This is the simplest way to see dlt in action.
- Run Getting Started snippets and load data from Python objects, files, data frames, databases, APIs or PDFs into any destination.
- Read the Pipeline Tutorial to start building E(t)LT pipelines from ready components.
- Browse the walkthroughs to learn how to create, run, customize and deploy pipelines.
- Ask us on Slack if you have any questions about use cases or the library.