datumaro.experimental.legacy
Legacy dataset conversion functionality.
This module provides functionality to convert legacy Datumaro datasets to the new experimental dataset format with automatic schema inference and type conversion.
Functions
analyze_experimental_dataset | Analyze experimental dataset schema to determine legacy format.
analyze_legacy_dataset | Analyze legacy dataset and generate schema using registered converters.
convert_from_legacy | Convert legacy dataset to experimental format with automatic schema inference.
convert_to_legacy | Convert experimental dataset to legacy format.
get_forward_annotation_converter | Get forward converter for an annotation type that can handle the given categories.
get_forward_media_converter | Get forward converter for a dataset by trying registered converters.
register_backward_annotation_converter | Register a backward converter class for an annotation type.
register_backward_media_converter | Register a backward converter class for a media type.
register_builtin_backward_converters | Register built-in backward converters.
register_builtin_forward_converters | Register built-in forward converters for common types.
register_forward_annotation_converter | Register a forward converter class for an annotation type.
register_forward_media_converter | Register a forward converter class for media types it supports.
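Classes
AnalysisResult | Result of legacy dataset analysis.
BackwardAnalysisResult | Result of experimental dataset analysis for backward conversion.
BackwardAnnotationConverter | Base class for backward annotation type converters.
BackwardBboxAnnotationConverter | Backward converter for Bbox annotations.
BackwardImageMediaConverter | Backward converter for Image media type.
BackwardMediaConverter | Base class for backward media type converters.
BackwardPolygonAnnotationConverter | Backward converter for Polygon annotations.
BackwardRotatedBboxAnnotationConverter | Backward converter for RotatedBbox annotations.
ForwardAnnotationConverter | Base class for forward annotation type converters.
ForwardBboxAnnotationConverter | Forward converter for Bbox annotations.
ForwardImageMediaConverter | Forward converter for Image media type supporting both file paths and byte data.
ForwardMediaConverter | Base class for forward media type converters.
ForwardPolygonAnnotationConverter | Forward converter for Polygon annotations.
ForwardRotatedBboxAnnotationConverter | Forward converter for RotatedBbox annotations.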
- class datumaro.experimental.legacy.ForwardMediaConverter
Bases: ABC
Base class for forward media type converters.
- abstract classmethod get_supported_media_types() -> list[Type[MediaElement[Any]]]
Return the list of media types this converter can handle.
- abstract classmethod create(dataset: Dataset) -> ForwardMediaConverter | None
Create a converter instance if the dataset is supported, None otherwise.
- abstract get_schema_attributes() -> dict[str, AttributeInfo]
Return schema attributes for this media type.
- class datumaro.experimental.legacy.ForwardAnnotationConverter
Bases: ABC
Base class for forward annotation type converters.
- abstract classmethod create_from_categories(categories: Dict[AnnotationType, Categories]) -> ForwardAnnotationConverter | None
Create a converter instance if the categories support this annotation type.
- abstract classmethod get_annotation_type() -> AnnotationType
Get the annotation type this converter handles.
- abstract get_schema_attributes() -> dict[str, AttributeInfo]
Return schema attributes for this annotation type.
- abstract convert_annotations(annotations: list[Annotation], item: DatasetItem) -> dict[str, Any]
Convert annotations of this type to experimental format.
- datumaro.experimental.legacy.register_forward_media_converter(converter_class: Type[ForwardMediaConverter]) -> None
Register a forward converter class for media types it supports.
- datumaro.experimental.legacy.register_forward_annotation_converter(converter_class: Type[ForwardAnnotationConverter]) -> None
Register a forward converter class for an annotation type.
- datumaro.experimental.legacy.get_forward_media_converter(dataset: Dataset) -> ForwardMediaConverter | None
Get forward converter for a dataset by trying registered converters.
- datumaro.experimental.legacy.get_forward_annotation_converter(annotation_type: AnnotationType, categories: Dict[AnnotationType, Categories]) -> ForwardAnnotationConverter | None
Get forward converter for an annotation type that can handle the given categories.
- class datumaro.experimental.legacy.ForwardImageMediaConverter(media_mixin: type, has_image_info: bool, has_callable_data: bool = False)
Bases: ForwardMediaConverter
Forward converter for Image media type supporting both file paths and byte data.
Initialize converter with format preference and image info availability.
- classmethod get_supported_media_types() -> list[Type[MediaElement[Any]]]
Return the list of media types this converter can handle.
- classmethod create(dataset: Dataset) -> ForwardImageMediaConverter | None
Create a converter instance, detecting whether to use paths or bytes.
- get_schema_attributes() -> dict[str, AttributeInfo]
Return schema attributes for this media type.
- class datumaro.experimental.legacy.ForwardBboxAnnotationConverter(bbox_attribute: AttributeInfo, bbox_labels_attribute: AttributeInfo | None)
Bases: ForwardAnnotationConverter
Forward converter for Bbox annotations.
Initialize with the bbox attribute and the optional bbox labels attribute.
- classmethod create_from_categories(categories: Dict[AnnotationType, Categories]) -> ForwardBboxAnnotationConverter | None
Create converter instance for bbox annotations.
- classmethod get_annotation_type() -> AnnotationType
Get the annotation type this converter handles.
- get_schema_attributes() -> dict[str, AttributeInfo]
Return schema attributes for this annotation type.
- convert_annotations(annotations: list[Annotation], item: DatasetItem) -> dict[str, Any]
Convert annotations of this type to experimental format.
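To illustrate what a `convert_annotations` result of type `dict[str, Any]` might look like for boxes, here is a small sketch that uses no Datumaro imports. `FakeBbox` and `convert_bboxes` are hypothetical, the field names `bboxes` and `bbox_labels` are borrowed from the backward converter descriptions in this reference, and the (x1, y1, x2, y2) row layout is an assumption rather than a documented fact.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class FakeBbox:
    """Hypothetical stand-in for a legacy bbox annotation stored as x, y, w, h."""

    x: float
    y: float
    w: float
    h: float
    label: int


def convert_bboxes(annotations: list[FakeBbox]) -> dict[str, Any]:
    # Assumption: boxes become (x1, y1, x2, y2) rows, and labels are kept in a
    # parallel list so that row i and label i describe the same box.
    boxes = [[a.x, a.y, a.x + a.w, a.y + a.h] for a in annotations]
    labels = [a.label for a in annotations]
    return {"bboxes": boxes, "bbox_labels": labels}


sample_fields = convert_bboxes(
    [FakeBbox(10, 20, 30, 40, label=2), FakeBbox(0, 0, 5, 5, label=0)]
)
```

Keeping geometry and labels in parallel columns, instead of one object per annotation, is what allows a per-sample field such as `bboxes` to be treated as a single array.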
- class datumaro.experimental.legacy.ForwardRotatedBboxAnnotationConverter(rotated_bbox_attribute: AttributeInfo, rotated_bbox_labels_attribute: AttributeInfo | None = None)
Bases: ForwardAnnotationConverter
Forward converter for RotatedBbox annotations.
Initialize converter with rotated bbox attributes.
- classmethod create_from_categories(categories: Dict[AnnotationType, Categories]) -> ForwardRotatedBboxAnnotationConverter
Create converter instance from dataset categories.
- classmethod get_annotation_type() -> AnnotationType
Get the annotation type this converter handles.
- get_schema_attributes() -> dict[str, AttributeInfo]
Return schema attributes for this annotation type.
- convert_annotations(annotations: list[Annotation], item: DatasetItem) -> dict[str, Any]
Convert annotations of this type to experimental format.
- class datumaro.experimental.legacy.ForwardPolygonAnnotationConverter(polygon_attribute: AttributeInfo, polygon_labels_attribute: AttributeInfo | None)
Bases: ForwardAnnotationConverter
Forward converter for Polygon annotations.
Initialize with the polygon attribute and the optional polygon labels attribute.
- classmethod create_from_categories(categories: Dict[AnnotationType, Categories]) -> ForwardPolygonAnnotationConverter | None
Create converter instance for polygon annotations.
- classmethod get_annotation_type() -> AnnotationType
Get the annotation type this converter handles.
- get_schema_attributes() -> dict[str, AttributeInfo]
Return schema attributes for this annotation type.
- convert_annotations(annotations: list[Annotation], item: DatasetItem) -> dict[str, Any]
Convert annotations of this type to experimental format.
- datumaro.experimental.legacy.register_builtin_forward_converters()
Register built-in forward converters for common types.
- class datumaro.experimental.legacy.AnalysisResult(schema: Schema, media_converter: ForwardMediaConverter | None, ann_converters: dict[AnnotationType, ForwardAnnotationConverter])
Bases: object
Result of legacy dataset analysis.
- media_converter: ForwardMediaConverter | None
- ann_converters: dict[AnnotationType, ForwardAnnotationConverter]
- datumaro.experimental.legacy.analyze_legacy_dataset(legacy_dataset: Dataset) -> AnalysisResult
Analyze legacy dataset and generate schema using registered converters.
- Parameters:
legacy_dataset – The legacy Datumaro dataset to analyze
- Returns:
AnalysisResult containing the inferred schema and converters
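Since every converter exposes `get_schema_attributes()`, one plausible way to assemble an inferred schema is to merge the attribute dictionaries of the selected converters. The sketch below only illustrates that idea and is not taken from Datumaro: `ImageConverter`, `BboxConverter`, and `merge_schema_attributes` are hypothetical, plain strings stand in for `AttributeInfo`, and rejecting duplicate names is an assumed policy.

```python
from __future__ import annotations


class ImageConverter:
    """Hypothetical media converter contributing one schema attribute."""

    def get_schema_attributes(self) -> dict[str, str]:
        return {"image_path": "str"}


class BboxConverter:
    """Hypothetical annotation converter contributing two schema attributes."""

    def get_schema_attributes(self) -> dict[str, str]:
        return {"bboxes": "float32[N, 4]", "bbox_labels": "int64[N]"}


def merge_schema_attributes(converters) -> dict[str, str]:
    merged: dict[str, str] = {}
    for converter in converters:
        for name, info in converter.get_schema_attributes().items():
            # Assumed policy: two converters must not claim the same field name.
            if name in merged:
                raise ValueError(f"duplicate schema attribute: {name}")
            merged[name] = info
    return merged


schema_attributes = merge_schema_attributes([ImageConverter(), BboxConverter()])
```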
- datumaro.experimental.legacy.convert_from_legacy(legacy_dataset: Dataset) -> Dataset[Sample]
Convert legacy dataset to experimental format with automatic schema inference.
- Parameters:
legacy_dataset – The legacy Datumaro dataset to convert
- Returns:
A new experimental Dataset with inferred schema and converted data
Example
>>> legacy_ds = Dataset.import_from("path/to/coco", "coco")
>>> experimental_ds = convert_from_legacy(legacy_ds)
>>> sample = experimental_ds[0]
>>> print(sample.image_path)
>>> print(sample.bboxes.shape)
- class datumaro.experimental.legacy.BackwardMediaConverter
Bases: ABC
Base class for backward media type converters.
- abstract classmethod create_from_schema(schema: Schema) -> BackwardMediaConverter | None
Create a converter instance if the schema is supported, None otherwise.
- abstract get_media_type() -> Type[MediaElement[Any]]
Get the legacy media type this converter produces.
- abstract convert_to_legacy_media(sample: Sample) -> MediaElement[Any]
Convert experimental sample media to legacy MediaElement.
- class datumaro.experimental.legacy.BackwardAnnotationConverter
Bases: ABC
Base class for backward annotation type converters.
- abstract classmethod create_from_schema(schema: Schema) -> BackwardAnnotationConverter | None
Create a converter instance if the schema is supported, None otherwise.
- abstract get_annotation_type() -> AnnotationType
Get the legacy annotation type this converter produces.
- abstract infer_categories(experimental_dataset: Dataset[Sample]) -> Dict[AnnotationType, Categories]
Infer legacy categories from experimental dataset.
- abstract convert_to_legacy_annotations(sample: Sample, categories: Dict[AnnotationType, Categories]) -> list[Annotation]
Convert experimental sample annotations to legacy format.
- datumaro.experimental.legacy.register_backward_media_converter(converter_class: Type[BackwardMediaConverter]) -> None
Register a backward converter class for a media type.
- datumaro.experimental.legacy.register_backward_annotation_converter(converter_class: Type[BackwardAnnotationConverter]) -> None
Register a backward converter class for an annotation type.
- class datumaro.experimental.legacy.BackwardImageMediaConverter(image_path_attr: str)
Bases: BackwardMediaConverter
Backward converter for Image media type.
Initialize with the name of the image path attribute.
- classmethod create_from_schema(schema: Schema) -> BackwardImageMediaConverter | None
Create a converter instance if the schema contains an image_path field.
- get_media_type() -> Type[MediaElement[Any]]
Get the legacy media type this converter produces.
- convert_to_legacy_media(sample: Sample) -> MediaElement[Any]
Convert image_path back to Image MediaElement.
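The backward image converter is created only when the schema contains an image path field. A minimal sketch of that schema-driven selection follows, under loud assumptions: `ImagePathBackConverter` is hypothetical, the schema is modeled as a plain name-to-type mapping instead of a `Schema` object, and a dict stands in for the legacy `Image` element.

```python
from __future__ import annotations


class ImagePathBackConverter:
    """Hypothetical stand-in for a backward image converter."""

    def __init__(self, image_path_attr: str) -> None:
        self.image_path_attr = image_path_attr

    @classmethod
    def create_from_schema(
        cls, schema: dict[str, type]
    ) -> ImagePathBackConverter | None:
        # Assumption: the field is literally named "image_path" and holds a str.
        if schema.get("image_path") is str:
            return cls("image_path")
        return None

    def convert_to_legacy_media(self, sample: dict) -> dict:
        # A real converter would build a legacy Image; a dict stands in here.
        return {"type": "image", "path": sample[self.image_path_attr]}


with_image = ImagePathBackConverter.create_from_schema(
    {"image_path": str, "bboxes": list}
)
without_image = ImagePathBackConverter.create_from_schema({"points": list})
media = with_image.convert_to_legacy_media({"image_path": "images/0001.jpg"})
```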
- class datumaro.experimental.legacy.BackwardBboxAnnotationConverter(bboxes_attr: str, bbox_labels_attr: str)
Bases: BackwardAnnotationConverter
Backward converter for Bbox annotations.
Initialize with the names of the bbox-related attributes.
- classmethod create_from_schema(schema: Schema) -> BackwardBboxAnnotationConverter | None
Create a converter instance if the schema contains bbox-related fields.
- get_annotation_type() -> AnnotationType
Get the legacy annotation type this converter produces.
- convert_to_legacy_annotations(sample: Sample, categories: Dict[AnnotationType, Categories]) -> list[Annotation]
Convert bboxes and bbox_labels back to legacy Bbox annotations.
- infer_categories(experimental_dataset: Dataset[Sample]) -> Dict[AnnotationType, Categories]
Infer label categories from bbox_labels.
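The two backward steps described above, inferring categories from `bbox_labels` and turning `bboxes` back into per-box annotations, can be sketched without Datumaro as follows. Everything here is an assumption made for illustration: samples are plain dicts, boxes are stored as (x1, y1, x2, y2), inferred category names are placeholders of the form `label_<index>`, and the helper names are hypothetical.

```python
from __future__ import annotations


def infer_label_names(samples: list[dict]) -> list[str]:
    # Assumption: only integer label indices are available, so placeholder
    # names are generated for indices 0 through the largest index seen.
    max_label = max(
        (label for sample in samples for label in sample["bbox_labels"]),
        default=-1,
    )
    return [f"label_{index}" for index in range(max_label + 1)]


def to_legacy_bboxes(sample: dict) -> list[dict]:
    # Inverse of the assumed forward layout: (x1, y1, x2, y2) -> x, y, w, h.
    return [
        {"x": x1, "y": y1, "w": x2 - x1, "h": y2 - y1, "label": label}
        for (x1, y1, x2, y2), label in zip(sample["bboxes"], sample["bbox_labels"])
    ]


samples = [{"bboxes": [[10, 20, 40, 60]], "bbox_labels": [2]}]
label_names = infer_label_names(samples)
legacy_boxes = to_legacy_bboxes(samples[0])
```

Generating placeholder names for every index up to the maximum keeps label indices stable after conversion, even when some indices never occur in the data.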
- class datumaro.experimental.legacy.BackwardRotatedBboxAnnotationConverter(rotated_bboxes_attr: str, rotated_bbox_labels_attr: str | None)
Bases: BackwardAnnotationConverter
Backward converter for RotatedBbox annotations.
Initialize with the names of the rotated bbox-related attributes.
- classmethod create_from_schema(schema: Schema) -> BackwardRotatedBboxAnnotationConverter | None
Create a converter instance if the schema contains rotated bbox fields.
- get_annotation_type() -> AnnotationType
Get the legacy annotation type this converter produces.
- convert_to_legacy_annotations(sample: Sample, categories: Dict[AnnotationType, Categories]) -> list[Annotation]
Convert experimental rotated bbox data to legacy RotatedBbox annotations.
- infer_categories(experimental_dataset: Dataset[Sample]) -> Dict[AnnotationType, Categories]
Infer label categories from rotated_bbox_labels.
- class datumaro.experimental.legacy.BackwardPolygonAnnotationConverter(polygons_attr: str, polygon_labels_attr: str | None)
Bases: BackwardAnnotationConverter
Backward converter for Polygon annotations.
Initialize with the names of the polygon-related attributes.
- classmethod create_from_schema(schema: Schema) -> BackwardPolygonAnnotationConverter | None
Create a converter instance if the schema contains polygon-related fields.
- get_annotation_type() -> AnnotationType
Get the legacy annotation type this converter produces.
- convert_to_legacy_annotations(sample: Sample, categories: Dict[AnnotationType, Categories]) -> list[Annotation]
Convert polygons and polygon_labels back to legacy Polygon annotations.
- infer_categories(experimental_dataset: Dataset[Sample]) -> Dict[AnnotationType, Categories]
Infer label categories from polygon_labels.
- class datumaro.experimental.legacy.BackwardAnalysisResult(media_type: Type[MediaElement[Any]] | None, ann_types: set[AnnotationType], categories: Dict[AnnotationType, Categories], media_converter: BackwardMediaConverter | None, ann_converters: dict[AnnotationType, BackwardAnnotationConverter])
Bases: object
Result of experimental dataset analysis for backward conversion.
- media_type: Type[MediaElement[Any]] | None
- ann_types: set[AnnotationType]
- categories: Dict[AnnotationType, Categories]
- media_converter: BackwardMediaConverter | None
- ann_converters: dict[AnnotationType, BackwardAnnotationConverter]
- datumaro.experimental.legacy.analyze_experimental_dataset(experimental_dataset: Dataset[Sample]) -> BackwardAnalysisResult
Analyze experimental dataset schema to determine legacy format.
- Parameters:
experimental_dataset – The experimental dataset to analyze
- Returns:
BackwardAnalysisResult containing legacy format information
- datumaro.experimental.legacy.convert_to_legacy(experimental_dataset: Dataset[Sample]) -> Dataset
Convert experimental dataset to legacy format.
- Parameters:
experimental_dataset – The experimental Dataset to convert
- Returns:
A new legacy Datumaro Dataset with converted data
Example
>>> experimental_ds = Dataset(MySchema)
>>> # ... add samples to experimental_ds
>>> legacy_ds = convert_to_legacy(experimental_ds)
>>> legacy_ds.export("output", "coco")