datumaro.experimental.type_registry

Type conversion registry for extensible tensor/array type support.

This module provides a runtime-extensible registry system for converting between different tensor libraries (PyTorch, NumPy, JAX, TensorFlow, etc.) and Polars DataFrames. New types can be registered at runtime without modifying core code.
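The registration pattern described above can be illustrated with a minimal, dependency-free sketch. It is not datumaro's implementation: the names `_TO_LIST_CONVERTERS`, `register_list_converter`, `to_list`, and `Point` are hypothetical, a plain list of floats stands in for a NumPy array, and the subclass lookup via the MRO is a design choice of this sketch rather than documented behavior.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in registry: maps a source type to a function that
# converts instances of that type into a plain list of floats.
_TO_LIST_CONVERTERS: Dict[type, Callable[[Any], List[float]]] = {}


def register_list_converter(
    source_type: type, converter_func: Callable[[Any], List[float]]
) -> None:
    """Register (or replace) the converter used for source_type."""
    _TO_LIST_CONVERTERS[source_type] = converter_func


def to_list(value: Any) -> List[float]:
    """Convert value with the first registered converter matching its type."""
    # Walk the MRO so subclasses of a registered type are also accepted
    # (an assumption of this sketch).
    for cls in type(value).__mro__:
        converter = _TO_LIST_CONVERTERS.get(cls)
        if converter is not None:
            return converter(value)
    raise TypeError(f"No converter registered for {type(value).__name__}")


class Point:
    """Example user-defined type that is added to the registry at runtime."""

    def __init__(self, x: float, y: float) -> None:
        self.x, self.y = x, y


# New types are supported by registering converters; to_list itself is unchanged.
register_list_converter(tuple, lambda t: [float(v) for v in t])
register_list_converter(Point, lambda p: [float(p.x), float(p.y)])

print(to_list((1, 2, 3)))    # [1.0, 2.0, 3.0]
print(to_list(Point(4, 5)))  # [4.0, 5.0]
```

Calling `to_list` on an unregistered type such as `str` raises `TypeError` in this sketch, mirroring the error documented for the functions below.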

Functions

from_polars_data(polars_data, target_type)

Convert polars data to target type.

register_from_polars_converter(target_type, ...)

Register a converter function to convert from polars data to target_type.

register_numpy_converter(source_type, ...)

Register a converter function to convert from source_type to numpy array.

to_numpy(value[, dtype])

Convert any registered type to numpy array with optional dtype conversion.

datumaro.experimental.type_registry.register_numpy_converter(source_type: type, converter_func: Callable[[Any], ndarray[Any, Any]]) -> None

Register a converter function to convert from source_type to numpy array.

Parameters:
  • source_type – The source type to convert from

  • converter_func – Function that takes a value of source_type and returns np.ndarray

Example

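>>> import numpy as np
>>> import jax.numpy as jnp
>>> from datumaro.experimental.type_registry import register_numpy_converter
>>> register_numpy_converter(jnp.ndarray, lambda x: np.array(x))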

datumaro.experimental.type_registry.register_from_polars_converter(target_type: type, converter_func: Callable[[Any], Any]) -> None

Register a converter function to convert from polars data to target_type.

Parameters:
  • target_type – The target type to convert to

  • converter_func – Function that takes polars data and returns target_type

Example

>>> import jax.numpy as jnp
>>> from datumaro.experimental.type_registry import register_from_polars_converter
>>> register_from_polars_converter(jnp.ndarray, lambda x: jnp.array(x))

datumaro.experimental.type_registry.to_numpy(value: Any, dtype: Any = None) -> ndarray[Any, Any]

Convert any registered type to numpy array with optional dtype conversion.

Parameters:
  • value – Value to convert to numpy array

  • dtype – Optional Polars dtype; when given, it is used to ensure the returned NumPy array has the corresponding dtype

Returns:

NumPy array representation of the value, with the dtype corresponding to dtype when one is given

Raises:

TypeError – If the value type is not registered for conversion

Example

>>> import numpy as np
>>> import torch
>>> from datumaro.experimental.type_registry import to_numpy
>>> tensor = torch.tensor([1, 2, 3])
>>> numpy_array = to_numpy(tensor)
>>> isinstance(numpy_array, np.ndarray)
True

datumaro.experimental.type_registry.from_polars_data(polars_data: Any, target_type: type) -> Any

Convert polars data to target type.

Parameters:
  • polars_data – Data from polars DataFrame

  • target_type – Target type to convert to

Returns:

Value converted to target_type

Raises:

TypeError – If target_type is not registered for conversion

Example

>>> import torch
>>> from datumaro.experimental.type_registry import from_polars_data
>>> polars_data = [1, 2, 3]
>>> tensor = from_polars_data(polars_data, torch.Tensor)
>>> isinstance(tensor, torch.Tensor)
True
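
The reverse direction, where the registry is keyed by the target type rather than the source type, can be sketched in the same dependency-free way. This is an illustration under stated assumptions, not the library's code: `_FROM_ROWS_CONVERTERS`, `register_from_rows_converter`, and `from_rows` are hypothetical names, and a plain Python list stands in for data read from a Polars DataFrame.

```python
from array import array
from typing import Any, Callable, Dict

# Hypothetical registry keyed by the *target* type: each entry knows how to
# build that type from raw row data (a plain list in this sketch).
_FROM_ROWS_CONVERTERS: Dict[type, Callable[[Any], Any]] = {}


def register_from_rows_converter(
    target_type: type, converter_func: Callable[[Any], Any]
) -> None:
    """Register (or replace) the converter that produces target_type."""
    _FROM_ROWS_CONVERTERS[target_type] = converter_func


def from_rows(data: Any, target_type: type) -> Any:
    """Convert data to target_type, or raise TypeError if it is unregistered."""
    try:
        converter = _FROM_ROWS_CONVERTERS[target_type]
    except KeyError:
        raise TypeError(
            f"No converter registered for target {target_type.__name__}"
        ) from None
    return converter(data)


# Register two targets at runtime: a tuple and a typed array of doubles.
register_from_rows_converter(tuple, tuple)
register_from_rows_converter(array, lambda xs: array("d", xs))

values = [1, 2, 3]
as_tuple = from_rows(values, tuple)
as_array = from_rows(values, array)

print(as_tuple)           # (1, 2, 3)
print(as_array.tolist())  # [1.0, 2.0, 3.0]
```

Because the caller names the target type explicitly, this sketch uses an exact dictionary lookup and raises `TypeError` for any target without a registered converter.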