Developer Guide¶
Comprehensive guide for developers integrating their own pytest tests into the Intel® ESQ framework.
Quick Reference: Looking for a quick cheat sheet? See the Developer Quick Reference for templates and common commands.
Table of Contents¶
- Overview
- Framework Architecture
- Creating Your First Test
- Configuration System
- Available Fixtures
- System Requirements Flags
- Test Execution Pattern
- Working with Results and Metrics
- KPI Validation
- Asset Management
- Best Practices
- Advanced Topics
Overview¶
The Intel® ESQ framework provides a comprehensive pytest-based testing infrastructure with:
- Automatic test parameterization from YAML configuration files
- Built-in fixtures for caching, validation, and reporting
- System requirement validation with reusable flags
- KPI-based test validation with flexible configuration
- Asset management for models, videos, and files
- Allure reporting with rich visualizations
- Docker integration for containerized tests
This guide walks you through integrating your own tests into the framework and making use of these features.
Framework Architecture¶
Dual-Package Structure¶
src/
├── sysagent/ # Core framework (reusable infrastructure)
│ ├── cli.py # Main CLI entry point
│ ├── utils/
│ │ ├── plugins/ # Pytest fixtures and hooks
│ │ ├── core/ # Result, Metrics, Cache classes
│ │ ├── testing/ # System validation utilities
│ │ └── config/ # Configuration loaders
│ └── suites/ # Core test suites (examples)
│
└── esq/ # ESQ-specific extensions
├── suites/ # Domain-specific test suites
│ ├── ai/ # AI tests (vision, audio, gen)
│ ├── media/ # Media processing tests
│ ├── system/ # System-level tests
│ └── vertical/ # Vertical-specific tests
└── configs/
└── profiles/ # Test profiles (qualifications, suites, verticals)
Test Discovery Flow¶
Profile YAML  →  Consolidator   →  Pytest Plugin     →  Test Function
     ↓                ↓                   ↓                    ↓
   params        merge with          filter by           configs fixture
                 defaults            function name       (all merged params)
Creating Your First Test¶
Step 1: Create Test File¶
Create a test file in an appropriate suite directory:
src/esq/suites/
└── my_domain/
└── my_feature/
├── config.yml # KPI and test configurations
└── test_my_feature.py # Your test implementation
Step 2: Basic Test Structure¶
# src/esq/suites/my_domain/my_feature/test_my_feature.py
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import logging
from sysagent.utils.core import Result, Metrics
logger = logging.getLogger(__name__)
def test_my_feature(
request,
configs,
cached_result,
cache_result,
get_kpi_config,
validate_test_results,
summarize_test_results,
validate_system_requirements_from_configs,
execute_test_with_cache,
prepare_test,
):
"""
My custom feature test.
This test demonstrates the standard 7-step pattern for ESQ tests.
"""
# Step 1: Extract parameters from configs
test_name = request.node.name.split("[")[0]
test_id = configs.get("test_id", test_name)
test_display_name = configs.get("display_name", test_name)
timeout = configs.get("timeout", 300)
devices = configs.get("devices", ["cpu"])
logger.info(f"Starting test: {test_display_name}")
# Step 2: Validate system requirements
validate_system_requirements_from_configs(configs)
# Step 3: Prepare assets/dependencies
def prepare_assets():
"""Prepare test assets and dependencies."""
# Download models, prepare data, etc.
return Result(
name=f"{test_id} - Asset Preparation",
metadata={"status": "completed"}
)
prepare_test(
test_name=test_name,
prepare_func=prepare_assets,
configs=configs,
name="Assets"
)
# Step 4: Execute test logic (with caching)
def execute_logic():
"""Execute the main test logic."""
results = Result(name=f"{test_id} - {test_display_name}")
# Your test implementation here
# Example: Run inference, process data, etc.
inference_time = 0.123 # seconds
accuracy = 0.95 # 95%
# Add metrics to results
results.metrics["inference_time"] = Metrics(
value=inference_time,
unit="seconds"
)
results.metrics["accuracy"] = Metrics(
value=accuracy,
unit="percentage"
)
# Add parameters
results.parameters["Test ID"] = test_id
results.parameters["Display Name"] = test_display_name
results.parameters["Devices"] = ", ".join(devices)
# Update timestamps
results.update_timestamps()
return results
results = execute_test_with_cache(
cached_result=cached_result,
cache_result=cache_result,
run_test_func=execute_logic,
test_name=test_name,
configs=configs
)
# Step 5: Validate results (if qualification profile)
validation_results = validate_test_results(
results=results,
configs=configs,
get_kpi_config=get_kpi_config,
test_name=test_name
)
# Step 6: Generate summary
summarize_test_results(
results=results,
test_name=test_name,
configs=configs,
get_kpi_config=get_kpi_config
)
Step 3: Create Configuration File¶
# src/esq/suites/my_domain/my_feature/config.yml
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# KPI definitions for this test suite
kpi:
inference_time:
name: "Inference Time"
type: "numeric"
validation:
operator: "lte" # less than or equal
reference: 0.5 # 500ms max
enabled: true
unit: "seconds"
severity: "major"
description: "Time taken for model inference"
default_value: 999.0
accuracy:
name: "Model Accuracy"
type: "numeric"
validation:
operator: "gte" # greater than or equal
reference: 0.90 # 90% minimum
enabled: true
unit: "percentage"
severity: "critical"
description: "Model prediction accuracy"
default_value: 0.0
# Test configurations
tests:
test_my_feature:
params:
- test_id: "MF-001"
display_name: "My Feature Test - CPU"
devices: [cpu]
timeout: 300
kpi_refs:
- inference_time
- accuracy
requirements:
cpu_min_cores: 4
memory_min_gb: 8.0
docker_required: false
- test_id: "MF-002"
display_name: "My Feature Test - GPU"
devices: [igpu]
timeout: 300
kpi_refs:
- inference_time
- accuracy
requirements:
igpu_required: true
memory_min_gb: 8.0
Step 4: Create a Profile¶
# src/esq/configs/profiles/suites/my_feature.yml
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: "profile.suite.my_feature"
description: "My Feature Test Suite"
version: "1.0.0"
params:
labels:
profile_display_name: "My Feature"
group: "custom.my_feature"
type: "suite"
requirements:
cpu_min_cores: 4
memory_min_gb: 8.0
storage_min_gb: 5.0
os_type:
- "linux"
docker_required: false
suites:
- name: "my_domain"
sub_suites:
- name: "my_feature"
tests:
test_my_feature:
params:
- test_id: "MF-001"
display_name: "My Feature Test - CPU"
devices: [cpu]
- test_id: "MF-002"
display_name: "My Feature Test - GPU"
devices: [igpu]
requirements:
igpu_required: true
Step 5: Run Your Test¶
# List available profiles
esq list
# Run your profile
esq -v run --profile profile.suite.my_feature
# Run specific test with filter
esq -d run --profile profile.suite.my_feature --filter test_id=MF-001
# Run without cache
esq -d run -nc --profile profile.suite.my_feature
Configuration System¶
Profile Structure¶
Profiles define test execution plans with parameters and requirements:
name: "profile.suite.example"
description: "Example test suite"
version: "1.0.0"
params:
labels:
profile_display_name: "Example"
group: "example.group"
type: "suite" # or "qualification" or "vertical"
requirements:
# Hardware requirements
cpu_min_cores: 4
memory_min_gb: 8.0
storage_min_gb: 10.0
# Software requirements
os_type: ["linux"]
docker_required: true
# Device requirements
igpu_required: false
dgpu_required: false
npu_required: false
suites:
- name: "suite_name"
sub_suites:
- name: "sub_suite_name"
tests:
test_function_name:
params:
- test_id: "EX-001"
display_name: "Example Test"
devices: [cpu, igpu]
timeout: 300
kpi_refs:
- example_kpi
Profile to Test Mapping¶
The framework automatically maps profile configuration to test files:
Profile YAML Test File
────────────────────────────────────────────────────────────────
suites:
- name: "ai" → src/esq/suites/ai/
sub_suites:
- name: "vision" → src/esq/suites/ai/vision/
tests:
test_dlstreamer: → test_dlstreamer.py
params:
- test_id: "VSN-001"
devices: [cpu]
Test Configuration (config.yml)¶
Each test directory can have a config.yml defining KPIs and default parameters:
# KPI definitions
kpi:
metric_name:
name: "Human Readable Name"
type: "numeric" # or "string", "boolean", "list"
validation:
operator: "gte" # gte, lte, eq, ne, gt, lt
reference: 100.0
enabled: true
unit: "unit_name"
severity: "critical" # critical, major, normal, minor
description: "KPI description"
default_value: 0.0
# Test-specific parameters
tests:
test_function_name:
params:
- test_id: "T001"
display_name: "Test Name"
kpi_refs:
- metric_name
Available Fixtures¶
The framework provides comprehensive pytest fixtures automatically available to all tests.
Core Fixtures¶
request¶
- Type: pytest.FixtureRequest
- Scope: function
- Description: Standard pytest request object for accessing test metadata
- Usage:
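For example, deriving the base test name as Step 1 of the standard test pattern does:

```python
# Strip the parameterization suffix, e.g. "test_my_feature[MF-001]" -> "test_my_feature"
test_name = request.node.name.split("[")[0]
```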
configs¶
- Type: Dict[str, Any]
- Scope: function
- Description: Test configuration parameters merged from profile and config.yml
- Usage:
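For example, reading merged parameters as in Step 1 of the standard test pattern:

```python
test_id = configs.get("test_id", test_name)
devices = configs.get("devices", ["cpu"])
timeout = configs.get("timeout", 300)
```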
Cache Fixtures¶
cached_result¶
- Type: Callable[[Optional[Dict]], Optional[Union[Result, Dict]]]
- Scope: function
- Description: Retrieves cached test results if available
- Usage:
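In the standard pattern this fixture is simply passed to `execute_test_with_cache` (Step 4). A rough sketch of a direct call, inferred from the signature above (the optional dict is assumed to be cache-specific configuration):

```python
previous = cached_result(None)  # None: use default cache settings (assumed)
if previous is not None:
    logger.info("Cache hit - reusing previous result")
```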
cache_result¶
- Type: Callable[[Union[Result, Dict], Optional[Dict]], None]
- Scope: function
- Description: Stores test results in cache
- Usage:
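Normally invoked for you by `execute_test_with_cache` after a cache miss; a direct call sketched from the signature above:

```python
cache_result(results, None)  # second argument: optional cache-specific configs (assumed)
```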
Validation Fixtures¶
validate_system_requirements_from_configs¶
- Type: Callable[[Dict[str, Any]], None]
- Scope: function
- Description: Validates system meets test requirements from configs
- Usage:
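As called in Step 2 of the standard test pattern:

```python
# Skips the test via pytest.skip if the system does not meet the declared requirements
validate_system_requirements_from_configs(configs)
```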
validate_test_results¶
- Type: Callable[..., Dict[str, Any]]
- Scope: function
- Description: Validates test results against KPI thresholds
- Parameters:
  - results: Result object or dict
  - configs: Test configuration
  - get_kpi_config: KPI config retrieval function
  - test_name: Test identifier
  - mode: Validation mode ("all" or "any")
- Usage:
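As called in Step 5 of the standard test pattern:

```python
validation_results = validate_test_results(
    results=results,
    configs=configs,
    get_kpi_config=get_kpi_config,
    test_name=test_name,
)
```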
Execution Fixtures¶
execute_test_with_cache¶
- Type: Callable[..., Union[Result, Dict]]
- Scope: function
- Description: Executes test with automatic caching support
- Parameters:
  - cached_result: Cached result fixture
  - cache_result: Cache result fixture
  - run_test_func: Function to execute test logic
  - test_name: Test identifier
  - configs: Test configuration
  - cache_configs: Optional cache-specific configs
  - name: Step name (default: "Analysis")
- Usage:
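As called in Step 4 of the standard test pattern:

```python
results = execute_test_with_cache(
    cached_result=cached_result,
    cache_result=cache_result,
    run_test_func=execute_logic,
    test_name=test_name,
    configs=configs,
)
```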
prepare_test¶
- Type: Callable[..., Any]
- Scope: function
- Description: Prepares test assets with progress tracking
- Parameters:
  - test_name: Test identifier
  - prepare_func: Function to execute preparation
  - configs: Test configuration
  - name: Step name (default: "Preparation")
- Usage:
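As called in Step 3 of the standard test pattern:

```python
prepare_test(
    test_name=test_name,
    prepare_func=prepare_assets,
    configs=configs,
    name="Assets",
)
```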
Reporting Fixtures¶
summarize_test_results¶
- Type: Callable[..., None]
- Scope: function
- Description: Generates test result summary with Allure attachments
- Parameters:
  - results: Result object
  - test_name: Test identifier
  - configs: Test configuration (optional)
  - get_kpi_config: KPI config function (optional)
  - iteration_data: Iteration-level data (optional)
  - enable_visualizations: Enable charts (default: False)
- Usage:
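As called in Step 6 of the standard test pattern (see Iteration Data and Visualizations for the iteration_data and enable_visualizations options):

```python
summarize_test_results(
    results=results,
    test_name=test_name,
    configs=configs,
    get_kpi_config=get_kpi_config,
)
```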
Suite Fixtures¶
suite_configs¶
- Type: Dict[str, Any]
- Scope: function
- Description: Loads suite-level configuration from config.yml
- Usage:
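A rough sketch, assuming config.yml follows the layout shown earlier (top-level kpi and tests keys):

```python
# suite_configs is the parsed config.yml for the current suite
kpi_definitions = suite_configs.get("kpi", {})
test_params = suite_configs.get("tests", {}).get("test_my_feature", {})
```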
get_kpi_config¶
- Type: Callable[[str], Optional[Dict[str, Any]]]
- Scope: function
- Description: Retrieves KPI configuration by name
- Usage:
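For example, looking up a KPI defined in config.yml (the returned dict is assumed to mirror the KPI definition shown earlier):

```python
kpi = get_kpi_config("inference_time")
if kpi is not None:
    # e.g. {"name": "Inference Time", "validation": {"operator": "lte", "reference": 0.5}, ...}
    logger.info(f"Reference threshold: {kpi['validation']['reference']}")
```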
System Requirements Flags¶
The framework provides reusable requirement validation flags. Add these to your profile or test parameters under the requirements key.
Hardware Requirements¶
CPU Requirements¶
requirements:
# Minimum CPU cores
cpu_min_cores: 4
# Minimum CPU threads
cpu_min_threads: 8
# CPU brand requirements
cpu_xeon_required: true # Requires Intel® Xeon® processor
cpu_core_required: true # Requires Intel® Core™ processor (includes Ultra Desktop)
cpu_ultra_required: true # Requires Intel® Ultra processor
# CPU socket count
cpu_min_sockets: 1
cpu_max_sockets: 2
Memory Requirements¶
requirements:
# Minimum total memory
memory_min_gb: 8.0
# Maximum memory (for constrained tests)
memory_max_gb: 32.0
Storage Requirements¶
requirements:
# Total storage capacity
storage_total_min_gb: 64.0
# Free storage for test execution
storage_min_gb: 10.0
Device Requirements¶
requirements:
# Integrated GPU (iGPU)
igpu_required: true
igpu_min_count: 1
igpu_max_count: 1
# Discrete GPU (dGPU)
dgpu_required: true
dgpu_min_count: 1
dgpu_max_count: 4
# Neural Processing Unit (NPU)
npu_required: true
npu_min_count: 1
npu_max_count: 1
Device Detection Logic:
- The framework automatically detects Intel devices using OpenVINO
- Only Intel devices detected by OpenVINO are counted
- Non-Intel devices are ignored for requirements validation
- If Intel devices exist but aren't detected by OpenVINO, helpful error messages suggest driver installation
Software Requirements¶
Operating System¶
requirements:
# Supported OS types
os_type:
- "linux"
- "windows"
# Specific OS versions
os_version_min: "20.04" # For Ubuntu
os_version_max: "24.04"
Docker Requirements¶
requirements:
# Docker daemon required
docker_required: true
# Minimum Docker version
docker_version_min: "20.10.0"
Software Dependencies¶
requirements:
# Python version
python_version_min: "3.10"
# Other software (validated via command presence)
software_required:
- name: "git"
command: "git --version"
- name: "ffmpeg"
command: "ffmpeg -version"
Example: Complete Requirements Block¶
requirements:
# Hardware
cpu_min_cores: 8
cpu_xeon_required: true
memory_min_gb: 16.0
storage_min_gb: 20.0
dgpu_required: true
dgpu_min_count: 2
# Software
os_type: ["linux"]
docker_required: true
python_version_min: "3.10"
# Dependencies
software_required:
- name: "ffmpeg"
command: "ffmpeg -version"
Using Requirements in Tests¶
Requirements are automatically validated when you call:
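```python
# From Step 2 of the standard test pattern
validate_system_requirements_from_configs(configs)
```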
This will:
1. Check all specified requirements against system capabilities
2. Skip the test if requirements aren't met (using pytest.skip)
3. Provide detailed error messages for failed requirements
4. Log fix suggestions for common issues
Example Validation Output:
Validation failed for profile: my-profile
CPU Requirements:
✗ CPU cores >= 8 (Actual: 4 cores)
Device Requirements:
✗ dGPU required (Actual: 1 Intel dGPU found but not detected by OpenVINO)
Suggestions:
- Upgrade to system with more CPU cores
- Install Intel GPU drivers: sudo apt install intel-gpu-tools
Test Execution Pattern¶
The framework follows a standardized 7-step pattern for all tests:
Standard Test Pattern¶
def test_example(
request,
configs,
cached_result,
cache_result,
get_kpi_config,
validate_test_results,
summarize_test_results,
validate_system_requirements_from_configs,
execute_test_with_cache,
prepare_test,
):
"""Standard test pattern."""
# ====================================================================
# STEP 1: Extract Parameters
# ====================================================================
test_name = request.node.name.split("[")[0]
test_id = configs.get("test_id", test_name)
test_display_name = configs.get("display_name", test_name)
timeout = configs.get("timeout", 300)
devices = configs.get("devices", ["cpu"])
logger.info(f"Starting test: {test_display_name}")
# ====================================================================
# STEP 2: Validate System Requirements
# ====================================================================
# Automatically skips test if requirements not met
validate_system_requirements_from_configs(configs)
# ====================================================================
# STEP 3: Prepare Assets/Dependencies
# ====================================================================
def prepare_assets():
"""Prepare test assets."""
# Download models, videos, files
# Build Docker images
# Set up test environment
return Result(
name=f"{test_id} - Asset Preparation",
metadata={"status": "completed"}
)
prepare_test(
test_name=test_name,
prepare_func=prepare_assets,
configs=configs,
name="Assets"
)
# ====================================================================
# STEP 4: Execute Test Logic (with caching)
# ====================================================================
def execute_logic():
"""Main test logic."""
results = Result(name=f"{test_id} - {test_display_name}")
# Your test implementation
# - Run benchmarks
# - Execute inference
# - Collect metrics
# Add metrics
results.metrics["throughput"] = Metrics(
value=1234.5,
unit="ops/sec"
)
# Add parameters
results.parameters["Test ID"] = test_id
results.parameters["Devices"] = ", ".join(devices)
# Update timestamps
results.update_timestamps()
return results
results = execute_test_with_cache(
cached_result=cached_result,
cache_result=cache_result,
run_test_func=execute_logic,
test_name=test_name,
configs=configs
)
# ====================================================================
# STEP 5: Validate Results Against KPIs
# ====================================================================
validation_results = validate_test_results(
results=results,
configs=configs,
get_kpi_config=get_kpi_config,
test_name=test_name
)
# ====================================================================
# STEP 6: Generate Summary
# ====================================================================
summarize_test_results(
results=results,
test_name=test_name,
configs=configs,
get_kpi_config=get_kpi_config
)
# ====================================================================
# STEP 7: Test Complete (implicit)
# ====================================================================
# Framework automatically:
# - Saves results to JSON
# - Generates Allure report
# - Caches results (if enabled)
Execution Flow¶
1. Parameter Extraction
├─> Read test_id, display_name, timeout, etc.
└─> Configure logging
2. System Validation
├─> Check CPU/Memory/Storage
├─> Check Device availability
├─> Check Software dependencies
└─> Skip test if requirements not met
3. Asset Preparation
├─> Download models/videos
├─> Build Docker images
└─> Set up test environment
4. Test Execution (Cached)
├─> Check cache for existing results
├─> If cache hit: return cached results
└─> If cache miss: execute test logic
5. KPI Validation
├─> Load KPI configurations
├─> Compare results vs thresholds
└─> Mark passed/failed/skipped
6. Result Summarization
├─> Generate JSON summary
├─> Create Allure attachments
└─> Generate visualizations
7. Cleanup & Reporting
├─> Save results to file
├─> Update Allure report
└─> Clear temporary resources
Working with Results and Metrics¶
Result Class¶
The Result dataclass provides structured test result handling:
from sysagent.utils.core import Result, Metrics
# Create result object
results = Result(
name="T001 - My Test",
parameters={
"Test ID": "T001",
"Device": "CPU",
"Batch Size": 8
},
metrics={
"throughput": Metrics(value=1234.5, unit="ops/sec"),
"latency": Metrics(value=0.123, unit="seconds")
},
metadata={
"status": "completed",
"custom_field": "custom_value"
}
)
# Update timestamps (auto-calculates duration)
results.update_timestamps()
# Convert to dictionary for serialization
results_dict = results.to_dict()
Metrics Class¶
The Metrics dataclass structures metric data:
from sysagent.utils.core import Metrics
# Create metric
metric = Metrics(
value=123.45,
unit="ms",
is_key_metric=False
)
# Access metric properties
print(f"Value: {metric.value} {metric.unit}")
Adding Metrics to Results¶
# Add single metric
results.metrics["accuracy"] = Metrics(
value=0.95,
unit="percentage"
)
# Add multiple metrics
results.metrics.update({
"precision": Metrics(value=0.93, unit="percentage"),
"recall": Metrics(value=0.91, unit="percentage"),
"f1_score": Metrics(value=0.92, unit="percentage")
})
# Device-specific metrics
for device in ["cpu", "igpu", "dgpu"]:
results.metrics[f"throughput_{device}"] = Metrics(
value=get_throughput(device),
unit="fps"
)
Key Metrics¶
Designate a primary metric for reporting:
# Set key metric
results.set_key_metric("throughput")
# Get current key metric
key_metric_name = results.get_key_metric()
if key_metric_name:
print(f"Key metric: {key_metric_name}")
Automatic Metadata¶
The Result class automatically includes:
{
"created_at": "2025-12-29T10:00:00+00:00",
"updated_at": "2025-12-29T10:05:30+00:00",
"total_duration_seconds": 330.0,
"kpi_validation_status": "passed" # or "failed", "skipped"
}
Complete Example¶
def execute_benchmark():
"""Execute benchmark and return results."""
# Create result object
results = Result(name=f"{test_id} - {test_display_name}")
# Run benchmark
start_time = time.time()
throughput, latency = run_inference(model, data)
elapsed = time.time() - start_time
# Add metrics
results.metrics["throughput"] = Metrics(
value=throughput,
unit="fps",
is_key_metric=True
)
results.metrics["latency"] = Metrics(
value=latency,
unit="ms"
)
results.metrics["elapsed_time"] = Metrics(
value=elapsed,
unit="seconds"
)
# Add parameters
results.parameters["Model"] = "yolo11n"
results.parameters["Precision"] = "INT8"
results.parameters["Device"] = "GPU"
# Add custom metadata
results.metadata["model_version"] = "1.0.0"
results.metadata["batch_size"] = 8
# Update timestamps
results.update_timestamps()
return results
KPI Validation¶
Defining KPIs¶
KPIs are defined in config.yml files:
kpi:
inference_time:
name: "Inference Time"
type: "numeric"
validation:
operator: "lte" # less than or equal
reference: 0.5 # 500ms threshold
enabled: true
unit: "seconds"
severity: "major"
description: "Time taken for model inference"
default_value: 999.0
accuracy:
name: "Model Accuracy"
type: "numeric"
validation:
operator: "gte" # greater than or equal
reference: 0.90 # 90% minimum
enabled: true
unit: "percentage"
severity: "critical"
description: "Model prediction accuracy"
default_value: 0.0
status:
name: "Test Status"
type: "string"
validation:
operator: "eq" # equals
reference: "success"
enabled: true
unit: ""
severity: "critical"
description: "Overall test status"
default_value: "unknown"
KPI Configuration Fields¶
| Field | Description | Required | Values |
|---|---|---|---|
| name | Human-readable name | Yes | String |
| type | Data type | Yes | numeric, string, boolean, list |
| validation.operator | Comparison operator | Yes | gte, lte, eq, ne, gt, lt |
| validation.reference | Expected value/threshold | Yes | Varies by type |
| validation.enabled | Enable validation | No | true (default), false |
| unit | Measurement unit | No | String (e.g., "ms", "fps") |
| severity | Impact level | No | critical, major, normal, minor |
| description | KPI description | No | String |
| default_value | Fallback value | No | Varies by type |
KPI Operators¶
| Operator | Description | Example |
|---|---|---|
| gte | Greater than or equal | actual >= reference |
| lte | Less than or equal | actual <= reference |
| gt | Greater than | actual > reference |
| lt | Less than | actual < reference |
| eq | Equal to | actual == reference |
| ne | Not equal to | actual != reference |
Referencing KPIs in Tests¶
tests:
test_inference:
params:
- test_id: "INF-001"
display_name: "Inference Test"
kpi_refs:
- inference_time
- accuracy
- status
KPI Validation in Tests¶
# Validate results against KPIs
validation_results = validate_test_results(
results=results,
configs=configs,
get_kpi_config=get_kpi_config,
test_name=test_name
)
# Check validation status
if validation_results["passed"]:
logger.info("All KPIs passed")
elif validation_results["skipped"]:
logger.info(f"KPI validation skipped: {validation_results['skip_reason']}")
else:
logger.error("Some KPIs failed")
for kpi_name, kpi_result in validation_results["validations"].items():
if not kpi_result["passed"]:
logger.error(f" {kpi_name}: {kpi_result}")
Validation Result Structure¶
{
"passed": False,
"skipped": False,
"skip_reason": None,
"validations": {
"inference_time": {
"passed": True,
"actual_value": 0.45,
"expected_value": "<= 0.5",
"unit": "seconds",
"operator": "lte",
"severity": "major"
},
"accuracy": {
"passed": False,
"actual_value": 0.85,
"expected_value": ">= 0.90",
"unit": "percentage",
"operator": "gte",
"severity": "critical"
}
}
}
Disabling KPI Validation¶
kpi:
optional_metric:
name: "Optional Metric"
type: "numeric"
validation:
operator: "gte"
reference: 100.0
enabled: false # Metric collected but not validated
Asset Management¶
The framework provides asset management for models, videos, and files.
Asset Types¶
Video Assets¶
assets:
- id: "video_sample"
type: "video"
name: "sample_1920_1080_30fps.h264"
url: "https://example.com/video.mp4"
sha256: "abc123..."
width: 1920 # Resize to this width
height: 1080 # Resize to this height
fps: 30 # Convert to this framerate
codec: "h264" # Convert to this codec (h264, h265)
duration: 30 # Trim to this duration (seconds)
loop: 120 # Loop video to achieve this duration
Model Assets¶
assets:
# Ultralytics model
- id: "yolo11n"
type: "model"
source: "ultralytics"
precision: "int8"
format: "pt"
export_args:
dynamic: true
half: true
# KaggleHub model
- id: "resnet-50"
type: "model"
source: "kagglehub"
precision: "int8"
format: "openvino"
kaggle_handle: "google/resnet-v1/tensorFlow2/50-classification"
convert_args:
input_shape: [1, 224, 224, 3]
quantize_args:
calibration_samples: 512
File Assets¶
assets:
- id: "config_file"
type: "file"
url: "https://example.com/config.json"
sha256: "def456..."
path: "./configs/model.json"
Using Assets in Tests¶
Assets are automatically prepared when specified in profile configurations. Access them in your test:
import os  # data_dir below is assumed to be the framework's data directory, provided by your test setup

# Assets prepared in standard locations
models_dir = os.path.join(data_dir, "models")
videos_dir = os.path.join(data_dir, "videos")
# Model path
model_path = os.path.join(models_dir, "yolo11n", "int8", "yolo11n.xml")
# Video path
video_path = os.path.join(videos_dir, "sample_1920_1080_30fps.h264")
Best Practices¶
Test Design¶
- Follow the 7-step pattern - Maintain consistency across all tests
- Use descriptive test IDs - Format: {SUITE}-{NUM} (e.g., VSN-001)
- Leverage caching - Tests should support both cached and non-cached execution
- Validate requirements early - Call validate_system_requirements_from_configs first
- Use proper logging - Log important steps at appropriate levels
- Handle cleanup - Always clean up resources (containers, files, processes)
Configuration¶
- Define clear KPIs - Set realistic thresholds based on baseline measurements
- Use requirement flags - Leverage existing flags instead of custom validation
- Document parameters - Add descriptions to test parameters
- Version profiles - Include version numbers in profile configurations
- Organize profiles - Use appropriate directories (qualifications/, suites/, verticals/)
Error Handling¶
- Use pytest mechanisms - pytest.skip(), pytest.fail(), pytest.xfail()
- Log errors clearly - Include context and suggested fixes
- Attach diagnostics - Use Allure attachments for logs and screenshots
- Clean up on failure - Use try/finally blocks for cleanup (see the sketch below)
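A minimal cleanup sketch (illustrative only; the temporary-directory setup is an assumption, not a framework API):

```python
import logging
import shutil
import tempfile

logger = logging.getLogger(__name__)

def run_with_cleanup():
    """Illustrative: release resources on success and failure alike."""
    workdir = tempfile.mkdtemp(prefix="esq_test_")
    try:
        # ... test body: run inference, launch containers, collect metrics ...
        pass
    finally:
        # Runs whether the test body succeeds or raises
        shutil.rmtree(workdir, ignore_errors=True)
        logger.info("Cleaned up temporary resources")
```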
Performance¶
- Cache expensive operations - Model downloads, conversions, compilations
- Use timeouts - Set appropriate timeouts for long-running operations
- Parallelize when possible - Use thread pools for multi-device tests
- Monitor resources - Track memory and storage usage
Security¶
- Validate inputs - Use allow-lists for user-provided values
- Avoid shell=True - Use list-based subprocess calls
- Set file permissions - Use restrictive permissions (0o750, 0o770)
- Handle secrets safely - Never log sensitive information
Advanced Topics¶
Custom Fixtures¶
Create test-specific fixtures in a conftest.py file:
# src/esq/suites/my_domain/conftest.py
import pytest
@pytest.fixture(scope="session")
def shared_resource():
"""Fixture shared across all tests in this suite."""
resource = setup_resource()
yield resource
cleanup_resource(resource)
@pytest.fixture
def test_specific_resource(request):
"""Fixture for individual tests."""
return create_resource()
Multi-Device Testing¶
Test across multiple devices efficiently:
from concurrent.futures import ThreadPoolExecutor, as_completed
from sysagent.utils.system.ov_helper import get_available_devices_by_category
# Get available devices
device_dict = get_available_devices_by_category(
device_categories=["cpu", "igpu", "dgpu"]
)
# Execute tests in parallel
with ThreadPoolExecutor(max_workers=len(device_dict)) as executor:
futures = {
executor.submit(run_test_on_device, device_id): device_id
for device_id in device_dict.keys()
}
for future in as_completed(futures):
device_id = futures[future]
try:
result = future.result()
results.metrics[f"throughput_{device_id}"] = Metrics(
value=result["throughput"],
unit="fps"
)
except Exception as e:
logger.error(f"Test failed on {device_id}: {e}")
Docker Integration¶
Use Docker for isolated test environments:
from sysagent.utils.infrastructure import DockerClient
docker_client = DockerClient()
# Build image
build_result = docker_client.build_image(
path=dockerfile_dir,
tag="my-test-image:latest",
nocache=False
)
# Run container
container_result = docker_client.run_container(
image="my-test-image:latest",
command=["python", "test_script.py"],
volumes={
"/host/path": {"bind": "/container/path", "mode": "rw"}
},
environment={"VAR": "value"},
timeout=300
)
# Parse results
output = container_result.get("output", "")
exit_code = container_result.get("exit_code", -1)
Profile Inheritance¶
Profiles can extend other profiles:
# base_profile.yml
name: "profile.base"
params:
requirements:
cpu_min_cores: 4
memory_min_gb: 8.0
# extended_profile.yml
extends: "profile.base"
name: "profile.extended"
params:
requirements:
cpu_min_cores: 8 # Override
dgpu_required: true # Add new requirement
Custom Metrics and Aggregation¶
Implement complex metric calculations:
def aggregate_device_metrics(device_results):
"""Aggregate metrics across multiple devices."""
total_throughput = sum(r["throughput"] for r in device_results.values())
avg_latency = sum(r["latency"] for r in device_results.values()) / len(device_results)
return {
"total_throughput": total_throughput,
"avg_latency": avg_latency,
"device_count": len(device_results)
}
# Use in test
for device_id, device_result in device_results.items():
results.metrics[f"throughput_{device_id}"] = Metrics(
value=device_result["throughput"],
unit="fps"
)
aggregated = aggregate_device_metrics(device_results)
results.metrics["total_throughput"] = Metrics(
value=aggregated["total_throughput"],
unit="fps",
is_key_metric=True
)
Iteration Data and Visualizations¶
Track per-iteration metrics for detailed analysis:
iteration_data = {
"iterations": [],
"throughput": [],
"latency": []
}
for i in range(num_iterations):
result = run_iteration()
iteration_data["iterations"].append(i)
iteration_data["throughput"].append(result["throughput"])
iteration_data["latency"].append(result["latency"])
# Summarize with visualizations
summarize_test_results(
results=results,
test_name=test_name,
iteration_data=iteration_data,
enable_visualizations=True,
configs=configs,
get_kpi_config=get_kpi_config
)
Next Steps¶
- Review existing tests - Study tests in src/esq/suites/ai/ for real-world examples
- Create your first test - Follow the quick start guide above
- Run tests - Use esq -v run --profile your-profile to execute
- Review results - Check JSON summary and Allure report
- Iterate - Refine your tests based on results and requirements
For more information:
- Developer Quick Reference - Cheat sheet for common tasks
- Quick Start Guide
- Troubleshooting Guide
- API Reference

Questions or Issues?
- Open an issue on GitHub
- Review existing tests for examples
- Check the troubleshooting guide