Version: 0.3.25

Introduction


What is dlt?

dlt is an open-source library that you can add to your Python scripts to load data from various and often messy data sources into well-structured, live datasets. Install it with:

pip install dlt

There's no need to start any backends or containers. Import dlt in your Python script and write a simple pipeline like the one below:

import dlt
from dlt.sources.helpers import requests

# Create a dlt pipeline that will load
# chess player data to the DuckDB destination
pipeline = dlt.pipeline(
    pipeline_name='chess_pipeline',
    destination='duckdb',
    dataset_name='player_data'
)

# Grab some player data from Chess.com API
data = []
for player in ['magnuscarlsen', 'rpragchess']:
    response = requests.get(f'https://api.chess.com/pub/player/{player}')
    response.raise_for_status()
    data.append(response.json())

# Extract, normalize, and load the data
load_info = pipeline.run(data, table_name='player')

Now copy this snippet to a file or a notebook cell and run it. If you haven't already, install the duckdb dependency (the default dlt installation is minimal):

pip install "dlt[duckdb]"

How the script works: it extracts data from a source (here: the Chess.com REST API), inspects its structure to create a schema, then structures, normalizes, and verifies the data, and finally loads it into a destination (here: DuckDB, into the database schema player_data and the table player).

Why use dlt?

  • Automated maintenance - with schema inference, schema evolution, alerts, and short declarative code, maintenance becomes simple.
  • Run it where Python runs - on Airflow, serverless functions, notebooks. No external APIs, backends, or containers; it scales on micro and large infra alike.
  • User-friendly, declarative interface that removes knowledge obstacles for beginners while empowering senior professionals.

Getting started with dlt

  1. Play with the Google Colab demo. This is the simplest way to see dlt in action.
  2. Run the Getting Started snippets and load data from Python objects, files, data frames, databases, APIs, or PDFs into any destination.
  3. Read the Pipeline Tutorial to start building E(t)LT pipelines from ready-made components.
  4. We have many interesting walkthroughs where you create, run, customize and deploy pipelines.
  5. Ask us on Slack if you have any questions about use cases or the library.

Become part of the dlt community

  1. Give the library a ⭐ and check out the code on GitHub.
  2. Ask questions and share how you use the library on Slack.
  3. Report problems and make feature requests here.

This demo works on Codespaces, a development environment available for free to anyone with a GitHub account. You'll be asked to fork the demo repository, and from there the README guides you through the next steps.
The demo uses the Continue VSCode extension.

Off to Codespaces!
