datumaro.plugins.framework_converter

Classes

DmTfDataset(dataset, subset, task[, ...])

DmTorchDataset(dataset, subset, task[, ...])

Create a PyTorch dataset for a specific task given a dataset and subset.

FrameworkConverter(dataset, subset, task)

FrameworkConverterFactory()

class datumaro.plugins.framework_converter.FrameworkConverterFactory

Bases: object

static create_converter(framework)

class datumaro.plugins.framework_converter.FrameworkConverter(dataset, subset, task)

Bases: object

to_framework(framework, **kwargs)
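
As a usage sketch, the converter is constructed with a dataset, a subset, and a task, and to_framework then returns the framework-specific dataset. The helper name to_torch_dataset, the default subset "train", the task string "classification", and the framework string "torch" are assumptions for illustration and should be checked against the installed Datumaro version. The import is deferred so the helper can be defined without Datumaro installed.

```python
from typing import Any


def to_torch_dataset(dataset: Any, subset: str = "train",
                     task: str = "classification", **kwargs: Any) -> Any:
    """Hypothetical helper: wrap a Datumaro dataset for use with PyTorch.

    Assumes "torch" is an accepted framework name and that extra keyword
    arguments (for example, transform) are forwarded by to_framework.
    """
    # Deferred import: Datumaro is only needed when the helper is called.
    from datumaro.plugins.framework_converter import FrameworkConverter

    converter = FrameworkConverter(dataset, subset=subset, task=task)
    return converter.to_framework(framework="torch", **kwargs)
```

Calling to_torch_dataset(dataset, transform=my_transform) would pass my_transform through **kwargs; my_transform is a placeholder for any callable you supply.
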
class datumaro.plugins.framework_converter.DmTorchDataset(dataset: Dataset, subset: str, task: str, transform: Callable | None = None, target_transform: Callable | None = None, target: str | None = None, tokenizer: tuple[Callable, Callable] | None = None, vocab: tuple[Callable, Callable] | None = None)

Bases: _MultiFrameworkDataset, Dataset

Create a PyTorch dataset for a specific task given a dataset and subset.

Parameters:
  • tokenizer – Callable converting a string into a series of tokens. The output can be either token IDs (integers) or token strings (text). If the latter, the vocab parameter must also be provided.

  • vocab – Callable converting a list of token IDs into a list of token strings.
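
To make the callable contract above concrete, here is a minimal pure-Python sketch of a tokenizer that returns token strings, together with a lookup from token IDs back to token strings. All names (whitespace_tokenizer, build_vocab, ids_to_tokens) are hypothetical and not part of Datumaro. How DmTorchDataset combines the two arguments, each typed as a tuple of two callables in the signature, is not covered here.

```python
def whitespace_tokenizer(text: str) -> list[str]:
    """Hypothetical tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


def build_vocab(corpus: list[str]) -> tuple[dict[str, int], list[str]]:
    """Give each distinct token an integer ID in order of first appearance."""
    token_to_id: dict[str, int] = {}
    for sentence in corpus:
        for token in whitespace_tokenizer(sentence):
            # setdefault leaves the ID of an already-seen token unchanged.
            token_to_id.setdefault(token, len(token_to_id))
    id_to_token = list(token_to_id)  # dicts preserve insertion order
    return token_to_id, id_to_token


token_to_id, id_to_token = build_vocab(["a cat sat", "a dog sat"])


def ids_to_tokens(ids: list[int]) -> list[str]:
    """Vocab-style callable: map a list of token IDs to token strings."""
    return [id_to_token[i] for i in ids]
```
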

class datumaro.plugins.framework_converter.DmTfDataset(dataset: Dataset, subset: str, task: str, output_signature: tuple | None = None)

Bases: _MultiFrameworkDataset

create() → DatasetV2

repeat(count=None) → DatasetV2

batch(batch_size, drop_remainder=False) → DatasetV2