simvx.core.testing.benchmark¶
Performance-benchmark harness for SimVX’s opt-in regression suite.
This module is the single home for the benchmark plumbing shared by every
@pytest.mark.perf suite (across packages) and by tools/run_benchmarks.py:
- class:
BenchmarkResult– one measured data point (frame timing + metadata).
- func:
bench_scene_runner/ :func:bench_headless_render– run a scene and measure it (CPU-only via :class:~simvx.core.testing.SceneRunner, or a full headless Vulkan render; the latter lazy-importssimvx.graphics).
- class:
MachineInfo– host/CPU/GPU/interpreter snapshot stamped onto records.
- class:
HistoryStore– append-only per-machine history + pinned baselines.
- class:
PerfRecorder– records results, compares against the machine’s own baseline, and (outside report-only mode) fails when a metric regresses.
These utilities measure speed only: correctness lives in the normal suites.
Results are stored locally under the perf directory (:func:perf_dir) and are
never committed – absolute numbers are machine-specific, so regressions are
judged against this machine’s own recorded baseline, not a hardcoded threshold.
Module Contents¶
Classes¶
One measured benchmark point. |
|
Host + interpreter + GPU snapshot stamped onto every record. |
|
A flattened, JSON-serialisable benchmark record (one JSONL line). |
|
Append-only per-machine benchmark history with pinned baselines. |
|
Records benchmark results, compares to the machine baseline, gates on drift. |
Functions¶
Benchmark a scene with :class: |
|
Benchmark a scene with full Vulkan rendering (headless). |
|
Resolve the local perf directory ( |
|
Stable identity for a measured point across runs. |
|
Classify |
|
Construct a :class: |
|
Register the shared perf CLI options on a pytest parser. |
Data¶
API¶
- simvx.core.testing.benchmark.__all__¶
[‘BenchmarkResult’, ‘HistoryStore’, ‘MachineInfo’, ‘PerfRecord’, ‘PerfRecorder’, ‘bench_headless_ren…
- simvx.core.testing.benchmark.DEFAULT_TOLERANCE¶
0.2
- class simvx.core.testing.benchmark.BenchmarkResult[source]¶
One measured benchmark point.
The headline metric is :meth:
metric_value. By default it is the average frame time in ms (metric="frame_ms", lower is better). A limit-style bench (e.g. “max sprites at 60 FPS”) setsmetric,valueandlower_is_better=Falseso the comparison treats higher as better.- name: str¶
None
- count: int¶
0
- total_ms: float¶
0.0
- frames: int¶
0
- samples_ms: list[float]¶
‘field(…)’
- gpu_ms: float | None¶
None
- backend: str¶
‘default’
- metric: str¶
‘frame_ms’
- value: float | None¶
None
- lower_is_better: bool¶
True
- extra: dict[str, float]¶
‘field(…)’
- simvx.core.testing.benchmark.bench_scene_runner(root, frames: int = 120, draw: bool = False, *, backend: str = 'default') simvx.core.testing.benchmark.BenchmarkResult[source]¶
Benchmark a scene with :class:
SceneRunner(CPU only, no GPU).
- simvx.core.testing.benchmark.bench_headless_render(scene_or_cls, frames: int = 60, width: int = 1280, height: int = 720, *, warmup: int = 0, telemetry_keys: tuple[str, ...] = (), backend: str = 'default', **kwargs) simvx.core.testing.benchmark.BenchmarkResult[source]¶
Benchmark a scene with full Vulkan rendering (headless).
Lazy-imports
simvx.graphicsso the core package stays render-agnostic.telemetry_keysnames entries fromapp.last_telemetryto copy intoresult.extra(e.g."occlusion_drawn","transform_high_water").
- class simvx.core.testing.benchmark.MachineInfo[source]¶
Host + interpreter + GPU snapshot stamped onto every record.
- host: str¶
None
- os: str¶
None
- cpu: str¶
None
- cpu_count: int¶
None
- ram_gb: float¶
None
- python: str¶
None
- free_threaded: bool¶
None
- gpu: str¶
‘unknown’
- git_commit: str¶
‘unknown’
- classmethod capture() simvx.core.testing.benchmark.MachineInfo[source]¶
- simvx.core.testing.benchmark.perf_dir() pathlib.Path[source]¶
Resolve the local perf directory (
$SIMVX_PERF_DIRor<repo>/.perf).
- simvx.core.testing.benchmark.record_key(suite: str, name: str, backend: str, count: int, metric: str) str[source]¶
Stable identity for a measured point across runs.
- class simvx.core.testing.benchmark.PerfRecord[source]¶
A flattened, JSON-serialisable benchmark record (one JSONL line).
- ts: str¶
None
- suite: str¶
None
- name: str¶
None
- backend: str¶
None
- count: int¶
None
- metric: str¶
None
- value: float¶
None
- lower_is_better: bool¶
None
- avg_frame_ms: float¶
None
- fps: float¶
None
- p95_ms: float¶
None
- p99_ms: float¶
None
- gpu_ms: float | None¶
None
- extra: dict[str, float]¶
None
- machine: dict[str, Any]¶
None
- classmethod from_result(suite: str, result: simvx.core.testing.benchmark.BenchmarkResult, machine: simvx.core.testing.benchmark.MachineInfo) simvx.core.testing.benchmark.PerfRecord[source]¶
- class simvx.core.testing.benchmark.HistoryStore(root: pathlib.Path | None = None, machine: simvx.core.testing.benchmark.MachineInfo | None = None)[source]¶
Append-only per-machine benchmark history with pinned baselines.
Layout under :func:
perf_dir::history/<host>/<suite>.jsonl one appended line per measured point baseline/<host>.json {record_key: record_dict} pinned baselineInitialization
- append(record: simvx.core.testing.benchmark.PerfRecord) None[source]¶
- simvx.core.testing.benchmark.compare(current: float, baseline: float, lower_is_better: bool, tol: float = DEFAULT_TOLERANCE) str[source]¶
Classify
currentagainstbaseline-> ok / regressed / improved / new.
- class simvx.core.testing.benchmark.PerfRecorder(suite: str, *, store: simvx.core.testing.benchmark.HistoryStore | None = None, tol: float = DEFAULT_TOLERANCE, update_baseline: bool = False, report_only: bool = False, emit: collections.abc.Callable[[str], None] = print)[source]¶
Records benchmark results, compares to the machine baseline, gates on drift.
A perf test takes the
perf_recorderfixture and calls- Meth:
recordonce per measured point. On fixture teardown- Meth:
finishruns: it pins the baseline when--update-perf-baselinewas given, and otherwise raises if any metric regressed beyond tolerance (unless--perf-report-only). The first run on a machine has no baseline, so points are seeded and pass.
Initialization
- record(result: simvx.core.testing.benchmark.BenchmarkResult, *, tol: float | None = None) str[source]¶
- simvx.core.testing.benchmark.make_perf_recorder(suite: str, *, update_baseline: bool = False, report_only: bool = False, tol: float = DEFAULT_TOLERANCE) simvx.core.testing.benchmark.PerfRecorder[source]¶
Construct a :class:
PerfRecorder(used by theperf_recorderfixture).