datumaro.experimental.schema

Schema definitions for the dataset system.

Classes

AttributeInfo(type, annotation)

Container for attribute type and field annotation information.

Field()

Base class for fields with semantic tags and Polars type mapping.

Schema(attributes, ...)

Represents the schema of a dataset with attribute definitions.

Semantic(value[, names, module, qualname, ...])

Used for disambiguation when multiple fields of the same type exist.

class datumaro.experimental.schema.Semantic(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Flag

Used for disambiguation when multiple fields of the same type exist. Default is for fields that need no disambiguation. Left and Right are for stereo vision scenarios.

Default = 1
Left = 2
Right = 4
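
Because Semantic derives from Flag and its members take distinct power-of-two values, members can be combined and tested with bitwise operators. The sketch below re-declares a stand-in enum with the same member values instead of importing the library, so it shows standard enum.Flag behaviour only. Whether datumaro itself ever combines these tags is not stated in this reference.

```python
from enum import Flag


class Semantic(Flag):
    """Stand-in mirroring the documented member values (not the library class)."""

    Default = 1
    Left = 2
    Right = 4


# Look up a member by its documented value.
left = Semantic(2)

# Flag members combine with bitwise OR; membership is tested with `in`.
stereo = Semantic.Left | Semantic.Right
has_left = Semantic.Left in stereo
has_default = Semantic.Default in stereo
```

Here left is Semantic.Left, has_left is True, and has_default is False, since Default occupies its own bit.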

class datumaro.experimental.schema.Field

Bases: object

Base class for fields with semantic tags and Polars type mapping.

This abstract base class defines the interface for all field types, providing methods for converting between Python objects and Polars DataFrame representations.

semantic: Semantic

Semantic tags for disambiguation (Default, Left, Right)

Type: datumaro.experimental.schema.Semantic

to_polars_schema(name: str) → dict[str, DataType]

Generate Polars schema definition for this field.

Parameters:

name – The column name for this field

Returns:

Dictionary mapping column names to Polars data types

Raises:

NotImplementedError – Must be implemented by subclasses

to_polars(name: str, value: Any) → dict[str, Series]

Convert the field value to Polars-compatible format.

Parameters:
  • name – The column name for this field

  • value – The value to convert

Returns:

Dictionary mapping column names to Polars Series

from_polars(name: str, row_index: int, df: DataFrame, target_type: type) → Any

Convert from Polars-compatible format back to the field’s value.

Parameters:
  • name – The column name for this field

  • row_index – The row index to extract

  • df – The source DataFrame

  • target_type – The target type to convert to

Returns:

The converted value in the target type
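
The three methods form a round trip: to_polars_schema declares the columns, to_polars writes a value into them, and from_polars reads one row back. The sketch below mimics that contract with plain Python stand-ins so it runs without Polars or datumaro installed. In it, a dtype is a string, a Series is a list, and a DataFrame is a dict of lists. ScalarField and all stand-in types are hypothetical, and raising NotImplementedError from to_polars and from_polars is an assumption (the reference documents it only for to_polars_schema).

```python
from typing import Any

# Stand-ins (assumptions): a "Series" is a list, a "DataFrame" is a dict of
# lists, and a Polars dtype is represented by its name as a string.
Series = list
DataFrame = dict


class Field:
    """Stand-in base class mirroring the documented method names."""

    def to_polars_schema(self, name: str) -> dict:
        raise NotImplementedError("Must be implemented by subclasses")

    def to_polars(self, name: str, value: Any) -> dict:
        raise NotImplementedError("Must be implemented by subclasses")

    def from_polars(self, name: str, row_index: int, df: DataFrame, target_type: type) -> Any:
        raise NotImplementedError("Must be implemented by subclasses")


class ScalarField(Field):
    """Hypothetical field that stores one integer per row in a single column."""

    def to_polars_schema(self, name: str) -> dict:
        return {name: "Int64"}

    def to_polars(self, name: str, value: Any) -> dict:
        return {name: [int(value)]}  # a one-element column

    def from_polars(self, name: str, row_index: int, df: DataFrame, target_type: type) -> Any:
        return target_type(df[name][row_index])


field = ScalarField()
schema = field.to_polars_schema("label")
columns = field.to_polars("label", 7)
restored = field.from_polars("label", 0, columns, int)
```

The same column name is passed to all three calls, which is what lets a single field own one or more columns without the caller knowing their layout.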

class datumaro.experimental.schema.AttributeInfo(type: type, annotation: Field)

Bases: object

Container for attribute type and field annotation information.

type: type
annotation: Field

class datumaro.experimental.schema.Schema(attributes: dict[str, AttributeInfo] = <factory>)

Bases: object

Represents the schema of a dataset with attribute definitions. Enforces that only one field of each type exists per semantic context.

attributes: dict[str, AttributeInfo]
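
The uniqueness rule can be pictured as a check over (field type, semantic) pairs. The following is a minimal sketch built from stand-in dataclasses, not the library's implementation. ImageField, find_conflicts, and the attribute names are hypothetical, and how datumaro actually detects or reports a violation is not specified in this reference.

```python
from dataclasses import dataclass, field
from enum import Flag


class Semantic(Flag):
    Default = 1
    Left = 2
    Right = 4


@dataclass
class Field:
    """Stand-in for the documented base class; only the semantic tag is modelled."""

    semantic: Semantic = Semantic.Default


class ImageField(Field):
    """Hypothetical field type used only for this illustration."""


@dataclass
class AttributeInfo:
    type: type
    annotation: Field


@dataclass
class Schema:
    attributes: dict = field(default_factory=dict)


def find_conflicts(schema: Schema) -> list:
    """Return pairs of attribute names that share a field type and semantic."""
    seen = {}
    conflicts = []
    for name, info in schema.attributes.items():
        key = (info.annotation.__class__, info.annotation.semantic)
        if key in seen:
            conflicts.append((seen[key], name))
        else:
            seen[key] = name
    return conflicts


# Two image fields are fine when their semantic tags differ.
stereo = Schema({
    "left_image": AttributeInfo(bytes, ImageField(Semantic.Left)),
    "right_image": AttributeInfo(bytes, ImageField(Semantic.Right)),
})

# Two image fields with the same (default) semantic collide.
clash = Schema({
    "image_a": AttributeInfo(bytes, ImageField()),
    "image_b": AttributeInfo(bytes, ImageField()),
})

stereo_conflicts = find_conflicts(stereo)
clash_conflicts = find_conflicts(clash)
```

In this sketch stereo_conflicts is empty, while clash_conflicts contains the pair ("image_a", "image_b"), which is the situation the Left and Right tags exist to avoid.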