extract.resource
with_table_name
def with_table_name(item: TDataItems, table_name: str) -> DataItemWithMeta
Marks item to be dispatched to table table_name when yielded from a resource function.
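The dispatch idea can be sketched in plain Python without importing dlt. Everything named here (ItemWithTable, mark_table, dispatch, the "events" default table) is a hypothetical stand-in chosen for illustration, not dlt's implementation; the sketch assumes that unmarked items fall through to a default table.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass
class ItemWithTable:
    """Hypothetical wrapper pairing a data item with a target table name."""
    data: Any
    table_name: str


def mark_table(item: Any, table_name: str) -> ItemWithTable:
    # Stand-in for a marker function: wraps the item, does not change it.
    return ItemWithTable(item, table_name)


def events() -> Iterable[Any]:
    yield mark_table({"id": 1, "type": "click"}, "clicks")
    yield mark_table({"id": 2, "type": "view"}, "views")
    yield {"id": 3, "type": "other"}  # unmarked item


def dispatch(items: Iterable[Any], default_table: str) -> Dict[str, List[Any]]:
    # Route marked items to their table; unmarked items go to the default
    # table (an assumption of this model).
    tables: Dict[str, List[Any]] = defaultdict(list)
    for item in items:
        if isinstance(item, ItemWithTable):
            tables[item.table_name].append(item.data)
        else:
            tables[default_table].append(item)
    return dict(tables)


tables = dispatch(events(), default_table="events")
```

The marker travels with the item as metadata, so the generator can decide the destination table per item rather than per resource.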
with_hints
def with_hints(item: TDataItems, hints: TResourceHints) -> DataItemWithMeta
Marks item to update the resource with the specified hints.
Create TResourceHints with make_hints.
Setting table_name will dispatch the item to the specified table, as with_table_name does.
DltResource Objects
class DltResource(Iterable[TDataItem], DltResourceHints)
Implements a dlt resource. Contains a data pipe that wraps a generating item, and a table schema that can be adjusted.
source_name
Name of the source that contains this instance of the resource, set when added to DltResourcesDict
section
A config section name
name
@property
def name() -> str
Resource name inherited from the pipe
with_name
def with_name(new_name: str) -> "DltResource"
Clones the resource with a new name. Such a resource keeps separate state and loads data to the new_name table by default.
is_transformer
@property
def is_transformer() -> bool
Checks if the resource is a transformer that takes data from another resource
requires_args
@property
def requires_args() -> bool
Checks if resource has unbound arguments
incremental
@property
def incremental() -> IncrementalResourceWrapper
Gets incremental transform if it is in the pipe
validator
@property
def validator() -> Optional[ValidateItem]
Gets validator transform if it is in the pipe
validator
@validator.setter
def validator(validator: Optional[ValidateItem]) -> None
Adds, removes, or replaces the validator in the pipe
pipe_data_from
def pipe_data_from(data_from: Union["DltResource", Pipe]) -> None
Replaces the parent in the transformer resource pipe from which the data is piped.
add_pipe
def add_pipe(data: Any) -> None
Creates additional pipe for the resource from the specified data
select_tables
def select_tables(*table_names: Iterable[str]) -> "DltResource"
For resources that dynamically dispatch data to several tables, selects the tables that will receive data, effectively filtering out other data items.
Both the with_table_name marker and data-based (function) table name hints are supported.
add_map
def add_map(item_map: ItemTransformFunc[TDataItem],
insert_at: int = None) -> "DltResource"
Adds the mapping function defined in item_map to the resource pipe at position insert_at.
item_map receives single data items; dlt will enumerate any lists of data items automatically.
Arguments:
item_map: ItemTransformFunc[TDataItem] - A function taking a single data item and an optional meta argument. Returns the transformed data item.
insert_at: int, optional - At which step in the pipe to insert the mapping. Defaults to None, which inserts after the last step
Returns:
"DltResource" - returns self
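The "single items, lists enumerated automatically" behaviour can be modelled in a few lines of plain Python. This is a sketch of the semantics only: apply_map_step and add_full_name are hypothetical names, and nothing here calls dlt itself.

```python
from typing import Any, Callable


def apply_map_step(item: Any, item_map: Callable[[Any], Any]) -> Any:
    """Hypothetical model of a map step (not dlt code)."""
    if isinstance(item, list):
        # A list is treated as a batch: the function sees one item at a time.
        return [item_map(i) for i in item]
    return item_map(item)


def add_full_name(row: dict) -> dict:
    # The mapping function only ever handles a single data item.
    return {**row, "full_name": f"{row['first']} {row['last']}"}


single = apply_map_step({"first": "Ada", "last": "Lovelace"}, add_full_name)
batch = apply_map_step(
    [
        {"first": "Alan", "last": "Turing"},
        {"first": "Grace", "last": "Hopper"},
    ],
    add_full_name,
)
```

Because the step unpacks lists, the same item_map works whether the resource yields rows one by one or in pages.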
add_yield_map
def add_yield_map(item_map: ItemTransformFunc[Iterator[TDataItem]],
insert_at: int = None) -> "DltResource"
Adds the generating function defined in item_map to the resource pipe at position insert_at.
item_map receives single data items; dlt will enumerate any lists of data items automatically. It may yield 0 or more data items and can be used, e.g., to pivot an item into a sequence of rows.
Arguments:
item_map: ItemTransformFunc[Iterator[TDataItem]] - A function taking a single data item and an optional meta argument. Yields 0 or more data items.
insert_at: int, optional - At which step in the pipe to insert the generator. Defaults to None, which inserts after the last step
Returns:
"DltResource" - returns self
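As an illustration of the pivot use case, the following plain-Python sketch models a yield-map step. The names apply_yield_map_step and pivot_scores, and the shape of the sample row, are assumptions made for this example; the code does not use dlt.

```python
from typing import Any, Callable, Iterator


def apply_yield_map_step(
    item: Any, item_map: Callable[[Any], Iterator[Any]]
) -> Iterator[Any]:
    """Hypothetical model of a yield-map step (not dlt code)."""
    if isinstance(item, list):
        # Lists are enumerated so the generator sees single items.
        for i in item:
            yield from item_map(i)
    else:
        yield from item_map(item)


def pivot_scores(row: dict) -> Iterator[dict]:
    # One input row becomes one output row per subject (possibly zero).
    for subject, score in row["scores"].items():
        yield {"student": row["student"], "subject": subject, "score": score}


rows = list(
    apply_yield_map_step(
        {"student": "ann", "scores": {"math": 90, "art": 75}}, pivot_scores
    )
)
empty = list(apply_yield_map_step({"student": "bob", "scores": {}}, pivot_scores))
```

Unlike a plain map, the generator decides how many rows come out, so an input with nothing to pivot simply produces no output.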
add_filter
def add_filter(item_filter: ItemTransformFunc[bool],
insert_at: int = None) -> "DltResource"
Adds the filter defined in item_filter to the resource pipe at position insert_at.
item_filter receives single data items; dlt will enumerate any lists of data items automatically.
Arguments:
item_filter: ItemTransformFunc[bool] - A function taking a single data item and an optional meta argument. Returns bool. If True, the item is kept.
insert_at: int, optional - At which step in the pipe to insert the filter. Defaults to None, which inserts after the last step
Returns:
"DltResource" - returns self
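A filter step can be modelled the same way. In this plain-Python sketch, apply_filter_step and is_active are hypothetical names, and returning None for "nothing kept" is an assumption of the model rather than a documented dlt guarantee.

```python
from typing import Any, Callable, Optional


def apply_filter_step(item: Any, item_filter: Callable[[Any], bool]) -> Optional[Any]:
    """Hypothetical model of a filter step (not dlt code)."""
    if isinstance(item, list):
        # Filter inside the batch; an empty result is dropped entirely.
        kept = [i for i in item if item_filter(i)]
        return kept or None
    # True keeps the item, False drops it.
    return item if item_filter(item) else None


def is_active(row: dict) -> bool:
    return row["status"] == "active"


kept_single = apply_filter_step({"id": 1, "status": "active"}, is_active)
dropped_single = apply_filter_step({"id": 2, "status": "closed"}, is_active)
kept_batch = apply_filter_step(
    [{"id": 3, "status": "closed"}, {"id": 4, "status": "active"}], is_active
)
```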
add_limit
def add_limit(max_items: int) -> "DltResource"
Adds a limit of max_items to the resource pipe.
This mutates the encapsulated generator to stop after max_items items are yielded. This is useful for testing and debugging. It is a no-op for transformers, which should be limited by their input data.
Arguments:
max_items: int - The maximum number of items to yield
Returns:
"DltResource" - returns self
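The effect of stopping a generator after a fixed number of yields can be sketched with itertools.islice. This models the behaviour only; endless_ids and limited are hypothetical names, and the code does not touch dlt.

```python
from itertools import islice
from typing import Any, Iterator


def endless_ids() -> Iterator[int]:
    # A generator that would never finish on its own.
    n = 0
    while True:
        yield n
        n += 1


def limited(gen: Iterator[Any], max_items: int) -> Iterator[Any]:
    """Hypothetical model of a limit: stop after max_items yields."""
    return islice(gen, max_items)


first_three = list(limited(endless_ids(), 3))
```

In this model each yield counts as one item regardless of what it contains, which is why a limit is handy for sampling a slow or unbounded source while testing.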
parallelize
def parallelize() -> "DltResource"
Wraps the resource so that each item is evaluated in a thread pool, allowing multiple resources to extract in parallel.
The resource must be a generator, a generator function, or a transformer function.
bind
def bind(*args: Any, **kwargs: Any) -> "DltResource"
Binds the parametrized resource to passed arguments. Modifies resource pipe in place. Does not evaluate generators or iterators.
explicit_args
@property
def explicit_args() -> StrAny
Returns a dictionary of arguments used to parametrize the resource. Does not include defaults and injected args.
state
@property
def state() -> StrAny
Gets resource-scoped state from the active pipeline. Raises PipelineStateNotAvailable if the pipeline context is not available.
__call__
def __call__(*args: Any, **kwargs: Any) -> "DltResource"
Binds the parametrized resource to the passed arguments. Creates and returns a bound resource. Generators and iterators are not evaluated.
__or__
def __or__(transform: Union["DltResource", AnyFun]) -> "DltResource"
Allows piping data across resources and transform functions with the | operator. This is the LEFT-side OR, so self may be a resource or a transformer.
__ror__
def __ror__(data: Union[Iterable[Any], Iterator[Any]]) -> "DltResource"
Allows piping data across resources and transform functions with the | operator. This is the RIGHT-side OR, so self may not be a resource and the LEFT operand must be an object that does not implement |, e.g. a list.
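How Python chooses between the left-side and right-side operator can be shown with a toy class. Step is a hypothetical stand-in written for this example and is not DltResource; it only demonstrates that a list on the left has no | for this operand, so Python falls back to the right operand's __ror__.

```python
from typing import Any, Callable, Iterable, Iterator, Optional, Union


class Step:
    """Hypothetical pipe step used to illustrate __or__ vs __ror__."""

    def __init__(self, func: Callable[[Any], Any], source: Optional[Iterable[Any]] = None):
        self.func = func
        self.source = source

    def __or__(self, other: Union["Step", Callable[[Any], Any]]) -> "Step":
        # Called for `self | other`: self is on the LEFT and feeds `other`.
        func = other.func if isinstance(other, Step) else other
        return Step(func, source=self)

    def __ror__(self, data: Iterable[Any]) -> "Step":
        # Called for `data | self` when `data` does not implement |
        # for this operand, e.g. a plain list.
        return Step(self.func, source=data)

    def __iter__(self) -> Iterator[Any]:
        for item in self.source or ():
            yield self.func(item)


doubled = [1, 2, 3] | Step(lambda x: x * 2)  # dispatches to Step.__ror__
chained = doubled | (lambda x: x + 1)        # dispatches to Step.__or__
result = list(chained)
```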
__iter__
def __iter__() -> Iterator[TDataItem]
Opens iterator that yields the data items from the resources in the same order as in Pipeline class.
A read-only state is provided, initialized from active pipeline state. The state is discarded after the iterator is closed.