datumaro.plugins.framework_converter
Classes
- class datumaro.plugins.framework_converter.FrameworkConverter(dataset, subset, task)
Bases: object
- class datumaro.plugins.framework_converter.DmTorchDataset(dataset: Dataset, subset: str, task: str, transform: Callable | None = None, target_transform: Callable | None = None, target: str | None = None, tokenizer: tuple[Callable, Callable] | None = None, vocab: tuple[Callable, Callable] | None = None)
Bases: _MultiFrameworkDataset, Dataset
Create a PyTorch dataset for a specific task given a dataset and subset.
- Parameters:
tokenizer – Callable converting a string into a sequence of tokens. The output can be either token IDs (integers) or token strings (text). If the latter, the vocab parameter must also be provided.
vocab – Callable converting a list of token IDs into a list of token strings.
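As a rough illustration of the two tokenizer output styles described above, the following sketch defines plain Python callables with the stated shapes. Everything in it (`ITOS`, `STOI`, `str_tokenizer`, `id_tokenizer`, and the toy `vocab` function) is hypothetical and not part of datumaro; it only shows what such callables might look like, not how DmTorchDataset invokes them internally.

```python
# Hypothetical toy vocabulary, for illustration only (not part of datumaro).
ITOS = ["<unk>", "a", "cat", "sat"]                    # ID -> token string
STOI = {token: idx for idx, token in enumerate(ITOS)}  # token string -> ID


def str_tokenizer(text: str) -> list[str]:
    """Tokenizer variant that returns token strings."""
    return text.lower().split()


def id_tokenizer(text: str) -> list[int]:
    """Tokenizer variant that returns token IDs; unknown words map to <unk>."""
    return [STOI.get(tok, STOI["<unk>"]) for tok in str_tokenizer(text)]


def vocab(ids: list[int]) -> list[str]:
    """Convert a list of token IDs into a list of token strings."""
    return [ITOS[i] for i in ids]


tokens = str_tokenizer("A cat sat")  # token strings
ids = id_tokenizer("A cat sat")      # token IDs
roundtrip = vocab(ids)               # IDs mapped back to token strings
```

With a tokenizer like `id_tokenizer`, the output is already numeric. With one like `str_tokenizer`, the parameter description above says a vocab must be supplied alongside it.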