"""CPU-side per-frame render snapshot for the pipelined render thread.
A :class:`RenderPacket` is an owned, immutable-by-convention snapshot of the
renderer's per-frame submission state, taken on the MAIN thread right after
``SceneAdapter.submit_scene`` populates that state. In pipelined render mode the
packet is handed to the render thread, which reconstructs the renderer's
per-frame lists from it and then records + submits the GPU frame, while the main
thread is free to simulate the next frame and rebuild the renderer's (now empty)
lists without mutating the in-flight packet.
This is the FOUNDATION wave: the packet, its :func:`extract_render_packet`
builder, and the :class:`RenderPacketRing` double-buffer are defined and
unit-tested, but no render thread is spawned and the synchronous frame loop is
unchanged. The default (synchronous) path never constructs a packet, so it stays
byte-identical.
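
As a rough illustration of the bounded handoff described above (not this module's ring), the sketch below counts in-flight frames with a semaphore so the producer can run at most one frame ahead of the consumer. ``run_pipelined``, ``render_worker``, and the integer "packets" are hypothetical stand-ins chosen for the example.

```python
import queue
import threading

CAPACITY = 2  # two slots: one frame being rendered, one queued behind it


def run_pipelined(frame_count: int) -> tuple[list[int], int]:
    # Hypothetical stand-in for the ring: a semaphore counts in-flight frames
    # (queued + being rendered) and a FIFO queue carries the packets.
    slots = threading.BoundedSemaphore(CAPACITY)
    handoff: queue.Queue = queue.Queue()
    rendered: list[int] = []
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def render_worker() -> None:
        nonlocal in_flight
        while True:
            packet = handoff.get()
            if packet is None:  # sentinel: the producer has finished
                return
            rendered.append(packet)  # stands in for record + submit
            with lock:
                in_flight -= 1
            slots.release()  # frame finished, free the slot

    worker = threading.Thread(target=render_worker)
    worker.start()
    for frame_index in range(frame_count):
        slots.acquire()  # blocks while CAPACITY frames are in flight
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        handoff.put(frame_index)  # an int stands in for a packet
    handoff.put(None)
    worker.join()
    return rendered, peak
```

Calling ``run_pipelined(8)`` returns the frames in submission order together with the peak in-flight count, which never exceeds ``CAPACITY``.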
Field provenance
----------------
Every field mirrors a per-frame attribute the GPU frame reads, each cleared in
``Renderer.begin_frame`` (forward.py ~605-616) and rebuilt by the scene adapter:
``instances``
``Renderer._instances`` (forward.py ~64): list of
``(MeshHandle, transform, material_id, viewport_id)``. Consumed by
``_upload_transforms``, the shadow pass, occlusion cull, and the forward
draw. The 4x4 ``transform`` arrays are copied so the main thread may rebuild
next frame without touching the in-flight packet.
``skinned_instances``
``Renderer._skinned_instances`` (forward.py ~176):
``(MeshHandle, transform, material_id, joint_matrices)``. Both numpy arrays
per entry are copied.
``shader_material_submissions``
``Renderer._shader_material_submissions`` (forward.py ~120):
``(MeshHandle, transform, material_id, shader_material)``. The transform is
copied; ``shader_material`` is a shared, frame-stable object held by reference.
``particle_submissions``
``Renderer._particle_submissions`` (forward.py ~109): ``(data, count)``.
``data`` is copied.
``gpu_particle_submissions``
``Renderer._gpu_particle_submissions`` (forward.py ~113):
``(emitter_id, emitter_config)``. ``emitter_config`` is a per-frame dict,
shallow-copied so that a downstream mutation of the dict itself does not reach the packet.
``materials``
``Renderer._materials`` (forward.py ~105, set via ``set_materials``). Copied.
``lights``
``Renderer._lights`` (forward.py ~106, set via ``set_lights``). Copied.
``viewports``
Snapshot of ``Renderer.viewport_manager.viewports`` (scene_adapter ~785).
Each entry is a :class:`ViewportSnapshot` carrying the viewport id plus an
owned copy of the camera view/proj matrices, rect, and the (shared, not
copied) render-target handle.
``structure_version``
``tree._structure_version`` at extract time. The velocity and occlusion
passes guard prev<->cur pairing on this; the render thread must see the
value that matched the snapshotted instance ordering, not a later one.
``draw2d_ops``
Snapshot of ``Draw2D._ops`` (draw2d.py ~170): the ordered immediate-mode 2D
op list a HUD/overlay scene rebuilds every frame on the MAIN thread
(``Draw2D._reset(); tree.render(Draw2D)``). ``Op`` is an immutable
``NamedTuple`` whose ``verts``/``indices`` lists Draw2D builds fresh per emit
and never mutates after appending, so a shallow ``list(...)`` copy is a
faithful owned snapshot. ``install_packet`` binds it as the op source the
render thread's ``Draw2DPass`` reads, so the render thread never touches the
live global the main thread is concurrently clearing + rebuilding.
``frame_index``
Monotonic producer frame counter, for ordering / telemetry / debugging.
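
The copy rule for ``instances`` can be sketched as follows. This is a minimal illustration, not the real extract code: ``snapshot_instances`` is a hypothetical helper and a plain string stands in for a ``MeshHandle``.

```python
import numpy as np


def snapshot_instances(live):
    # Copy each 4x4 transform; the mesh handle and the ids are shared as-is.
    return [
        (mesh, transform.copy(), mat_id, vp_id)
        for mesh, transform, mat_id, vp_id in live
    ]


# A string stands in for a MeshHandle here (assumption for the example).
live = [("mesh-a", np.eye(4, dtype=np.float32), 3, 0)]
packet_instances = snapshot_instances(live)

# The main thread starts the next frame: it mutates the array in place and
# then clears the live list.
live[0][1][0, 3] = 10.0
live.clear()

# The packet still holds the extract-time value.
translation_x = float(packet_instances[0][1][0, 3])
```

Because the transform was copied, neither the in-place write nor the ``clear()`` is visible through ``packet_instances``.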
Subsystem submission buffers (packetised this wave)
---------------------------------------------------
These mirror per-frame submission lists that live inside lazily-created
subsystem passes. Each is rebuilt every frame on the MAIN thread (during
``adapter.submit_scene``) and read during recording, so the render thread must
read an OWNED copy rather than the live list the main thread is clearing.
``tilemap_layers``
Owned snapshot of ``Renderer._tilemap_pass._submissions``
(tilemap_pass.py ~42, populated by ``submit_layer`` from
``SceneAdapter._submit_tilemaps``). Each entry is
``(tile_data: ndarray, tileset_texture_id: int, tile_size: tuple)``; the
``tile_data`` structured array is copied (the main thread reuses TileMap
layer buffers across frames).
``light2d_lights`` / ``light2d_occluders``
Owned snapshots of ``Renderer._light2d_pass._lights`` (list of per-light
dicts, light2d_pass.py ~65) and ``._occluders`` (list of polygon
vertex-lists, ~66). Each dict / polygon is shallow-copied so a downstream
mutation of the per-frame entry is isolated from the packet.
``text_vertices`` / ``text_indices``
Owned copies of ``Renderer._text_renderer.vertices`` / ``.indices``
(text_renderer.py ~316-326): the CPU MSDF geometry the shared TextRenderer
builds each frame from ``draw_text`` calls and that
``OverlayRenderer.render_text`` uploads + draws via ``TextPass``. The arrays
are copied because the TextRenderer reuses its vertex/index buffers next
frame (``begin_frame`` resets ``_char_count`` and overwrites in place).
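
A short sketch of what a shallow ``dict(...)`` copy does and does not isolate, which is worth keeping in mind for the per-light entries. The ``position`` and ``colour`` keys are hypothetical; the real per-light dict layout is not shown here.

```python
# Hypothetical per-light entry; the key names are assumptions.
live_lights = [{"position": (1.0, 2.0), "colour": [1.0, 0.5, 0.0]}]
snapshot = [dict(light) for light in live_lights]

# Rebinding a key on the live dict does not reach the snapshot.
live_lights[0]["position"] = (9.0, 9.0)

# Mutating a nested mutable value in place does, because both dicts
# still reference the same list object.
live_lights[0]["colour"][0] = 0.0
```

After this runs, ``snapshot[0]["position"]`` is still ``(1.0, 2.0)`` while ``snapshot[0]["colour"][0]`` is ``0.0``, so a shallow copy is only a faithful snapshot when the nested values are immutable or never mutated in place.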
Scene-render-unit (SRU) plans
-----------------------------
``subviewport_srus``
Ordered list of :class:`SubViewportSRU` plans, one per live SubViewport,
captured on the MAIN thread by :meth:`SubViewportManager.build_srus` (which
reuses the P1 topological ``order_subviewports`` so producers precede
consumers). Each plan owns the SubViewport's submitted opaque/skinned
instances (transform copies), its camera view/proj matrices, its isolated
Draw2D op list, target identity, and the update-mode decision, so the render
thread can record the offscreen SRU WITHOUT walking the live tree. Empty when
no SubViewport is present or in the synchronous path (which still walks the
tree live, byte-identically).
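
The producers-before-consumers ordering can be illustrated with the standard library's ``graphlib``. This is a sketch of the idea only, not the ``order_subviewports`` implementation; the SubViewport names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each SubViewport -> the SubViewports it samples.
samples = {
    "minimap": set(),
    "mirror": {"minimap"},
    "hud": {"mirror", "minimap"},
}

# static_order() yields every node after all of its predecessors, so a
# producer always appears before any consumer that samples it.
order = list(TopologicalSorter(samples).static_order())
```

Recording the SRUs in ``order`` means each offscreen target is written before any later SRU reads it.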
State intentionally NOT snapshotted (and why)
---------------------------------------------
- ``_main_base`` / arena slice info: reservation is a GPU-SSBO concern owned
entirely by the render thread (``reserve_main_slice`` + ``_upload_transforms``
run there from the packet's ``instances``). The main thread never reserves.
- Per-frame HDR / post flags (``_hdr_rendered``): recomputed by the render
thread inside ``pre_render``/``render`` from the snapshotted lists; they are
outputs of recording, not inputs to it.
- Reflection-probe capture (``ReflectionProbePass.update_probes``): DEFERRED.
The probe path is deeply GPU-stateful (per-probe source cubemaps, six face
renders, an IBL compute convolution with record-time descriptor mutation, and
cube-array copies with many intra-cmd barriers). Packetising it faithfully
would require snapshotting six face instance-lists per probe plus replaying the
convolution/copy state machine on the render thread, which is out of scope for
this wave. Probe-bearing scenes keep the safe skip + one-time warning in
pipelined mode (app.py ``_warn_pipelined_unpacketised`` / ``pre_render_fn``).
Note: ``Draw2D._ops`` (immediate-mode 2D HUD/overlay) was previously in the
"not snapshotted" set and raced; it is now ``draw2d_ops`` (installed via
``install_packet``). The subsystem buffers (tilemap / 2D-light / 3D-overlay
text) and SubViewport SRUs documented above are now packet-owned AND consumed by
the render thread: ``Renderer.install_packet`` binds them onto the per-frame
override attributes the pass read-sites consult, and the pipelined ``pre_render``
replays the SubViewport SRUs from the plan via
``SceneAdapter.render_sru_from_plan`` (no live-tree walk). Only reflection-probe
capture remains deferred in pipelined mode (see above). The synchronous path
installs no packet, so every read-site falls back to the live state and stays
byte-identical.
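
The override-with-fallback read-site pattern described in the note above might look like the sketch below. ``Draw2DPassSketch``, ``_ops_override``, and ``ops_to_record`` are illustrative names, not the engine's actual attributes.

```python
class Draw2DPassSketch:
    # Hypothetical read-site: all names here are assumptions for illustration.
    def __init__(self, live_ops):
        self._live_ops = live_ops
        self._ops_override = None  # set only in pipelined mode

    def install_packet(self, ops):
        self._ops_override = ops

    def ops_to_record(self):
        # Compare against None so an empty packet list still overrides.
        override = self._ops_override
        return self._live_ops if override is None else override


live_ops = ["rect", "text"]
draw_pass = Draw2DPassSketch(live_ops)
sync_view = draw_pass.ops_to_record()  # synchronous path: the live list

draw_pass.install_packet(list(live_ops))  # pipelined path: owned snapshot
live_ops.clear()  # main thread resets for the next frame
pipelined_view = draw_pass.ops_to_record()
```

With no packet installed the read-site returns the live list itself, so the synchronous path is untouched; once a packet is installed, clearing the live list no longer affects what is recorded.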
"""
from __future__ import annotations

import threading
from dataclasses import dataclass, field
from typing import TYPE_CHECKING, Any

import numpy as np

if TYPE_CHECKING:
    from simvx.core import SceneTree

    from ..types import MeshHandle
    from .forward import Renderer

__all__ = [
    "RenderPacket",
    "RenderPacketRing",
    "SubViewportSRU",
    "ViewportSnapshot",
    "extract_render_packet",
]
@dataclass(frozen=True, slots=True)
class ViewportSnapshot:
    """Owned snapshot of one viewport's camera + rect for a single frame.

    Matrices are copied so the render thread reads a stable view/proj even after
    the main thread rebuilds the live ``Viewport`` next frame (TAA jitter and the
    motion-blur matrix update both mutate ``camera_proj`` in place during
    recording, which must happen on the render thread's owned copy).
    """

    vp_id: int
    x: int
    y: int
    width: int
    height: int
    camera_view: np.ndarray
    camera_proj: np.ndarray
    render_target: Any | None
@dataclass(frozen=True, slots=True)
class SubViewportSRU:
    """Owned plan to record one SubViewport offscreen without walking the tree.

    Captured on the MAIN thread by :meth:`SubViewportManager.build_srus` from a
    SubViewport's already-submitted offscreen scene. The render thread (next
    wave) replays it: reserve a transform-SSBO slice for ``instances`` +
    ``skinned_instances``, write those transforms, record the draws into the
    SubViewport's ``renderer`` target with ``camera_view`` / ``camera_proj``, then
    overlay ``draw2d_ops``. The producer-before-consumer ordering is encoded by
    this list's position in :attr:`RenderPacket.subviewport_srus` (P1 topo sort),
    so a consumer SRU appears after the producer it samples.

    Fields
    ------
    sru_id
        Stable identity (``id(node)``) keying the frustum-visibility cache so
        SRUs never collide. Mirrors ``render_to_target(sru_id=...)``.
    renderer
        The SubViewport's :class:`SubViewportRenderer` (shared GPU target, not
        copied: it owns the offscreen image whose bindless slot the main scene
        samples). The render thread records into it; the main thread does not
        mutate it concurrently (create/resize happen during extract, before the
        plan is built).
    width / height
        The SRU's offscreen extent at capture time (drives viewport + the SSBO
        transform write).
    clear_colour
        Per-frame clear colour (``transparent_bg`` decides RGBA), copied.
    camera_view / camera_proj
        Owned copies of the SRU camera's view + projection matrices (``None`` for
        a 2D-only SubViewport, which uses the screen-size path).
    screen_size
        ``(width, height)`` float screen size override the 2D path needs.
    instances / skinned_instances
        Owned snapshots of the offscreen scene's submitted opaque / skinned
        instances (same shape as :attr:`RenderPacket.instances` /
        ``skinned_instances``; transform + joint arrays copied).
    draw2d_ops
        Owned copy of the SubViewport subtree's isolated Draw2D op list (the
        2D overlay drawn on top of its 3D content via ``render_draw2d``).
    """

    sru_id: int
    renderer: Any
    width: int
    height: int
    clear_colour: tuple[float, float, float, float]
    camera_view: np.ndarray | None
    camera_proj: np.ndarray | None
    screen_size: tuple[float, float]
    instances: list[tuple[MeshHandle, np.ndarray, int, int]]
    skinned_instances: list[tuple[MeshHandle, np.ndarray, int, np.ndarray]]
    draw2d_ops: list[Any] = field(default_factory=list)
@dataclass(frozen=True, slots=True)
class RenderPacket:
    """Owned snapshot of the renderer's per-frame submission state.

    Construct via :func:`extract_render_packet`. All mutable numpy data is copied
    at extract time, so the main thread may rebuild the renderer's per-frame
    lists for the next frame while this packet is in flight on the render thread.
    """

    frame_index: int
    structure_version: int
    instances: list[tuple[MeshHandle, np.ndarray, int, int]]
    skinned_instances: list[tuple[MeshHandle, np.ndarray, int, np.ndarray]]
    shader_material_submissions: list[tuple[MeshHandle, np.ndarray, int, Any]]
    particle_submissions: list[tuple[np.ndarray, int]]
    gpu_particle_submissions: list[tuple[int, dict]]
    materials: np.ndarray
    lights: np.ndarray
    viewports: list[ViewportSnapshot] = field(default_factory=list)
    # Owned snapshot of ``Draw2D._ops`` (immutable NamedTuples, shallow-copied).
    # The render thread's Draw2DPass reads this instead of the live global.
    draw2d_ops: list[Any] = field(default_factory=list)
    # Owned snapshots of the subsystem-pass per-frame submission buffers. See the
    # module docstring "Subsystem submission buffers" section for each source.
    tilemap_layers: list[tuple[np.ndarray, int, tuple[float, float]]] = field(default_factory=list)
    light2d_lights: list[dict] = field(default_factory=list)
    light2d_occluders: list[list[tuple[float, float]]] = field(default_factory=list)
    text_vertices: np.ndarray | None = None
    text_indices: np.ndarray | None = None
    # Ordered SubViewport SRU plans (producers first). Empty in the synchronous
    # path and for scenes with no SubViewport. See ``SubViewportSRU``.
    subviewport_srus: list[SubViewportSRU] = field(default_factory=list)
def _extract_subsystems(
    renderer: Renderer,
) -> tuple[
    list[tuple[np.ndarray, int, tuple[float, float]]],
    list[dict],
    list[list[tuple[float, float]]],
    np.ndarray | None,
    np.ndarray | None,
]:
    """Owned snapshots of the tilemap / 2D-light / 3D-overlay-text submission buffers.

    Each subsystem pass is lazily created, so a getter may return ``None``; absent
    passes yield empty / ``None`` fields. Arrays are copied because the main thread
    reuses the underlying buffers next frame (TileMap layer arrays, the
    TextRenderer's vertex/index buffers); per-light/occluder Python entries are
    shallow-copied so a downstream mutation is isolated from the packet.
    """
    tilemap_layers: list[tuple[np.ndarray, int, tuple[float, float]]] = []
    tm = renderer._tilemap_pass
    if tm is not None:
        tilemap_layers = [(data.copy(), tex_id, tuple(size)) for (data, tex_id, size) in tm._submissions]

    light2d_lights: list[dict] = []
    light2d_occluders: list[list[tuple[float, float]]] = []
    l2d = renderer._light2d_pass
    if l2d is not None:
        light2d_lights = [dict(light) for light in l2d._lights]
        light2d_occluders = [list(poly) for poly in l2d._occluders]

    text_vertices: np.ndarray | None = None
    text_indices: np.ndarray | None = None
    tr = renderer._text_renderer
    if tr is not None and tr.has_text:
        text_vertices = np.array(tr.vertices, copy=True)
        text_indices = np.array(tr.indices, copy=True)

    return tilemap_layers, light2d_lights, light2d_occluders, text_vertices, text_indices
class RenderPacketRing:
    """Bounded double-buffered handoff between the main and render threads.

    The ring IS the double-buffer: a fixed ``capacity`` (default 2) of packet
    slots with backpressure. ``submit`` blocks the producer (main thread) once it
    is one frame ahead, bounding latency to +1 frame (D6). ``acquire`` blocks the
    consumer (render thread) until a packet is available. The render thread calls
    ``release`` after it has finished the GPU frame for a packet, freeing the slot
    so the producer may run ahead again.

    Shutdown is cooperative: ``close`` wakes any blocked thread. After close,
    ``submit`` raises and ``acquire`` drains remaining packets then returns
    ``None`` so the consumer loop exits cleanly.
    """

    def __init__(self, capacity: int = 2):
        if capacity < 1:
            raise ValueError(f"RenderPacketRing capacity must be >= 1, got {capacity}")
        self._capacity = capacity
        self._lock = threading.Lock()
        self._not_empty = threading.Condition(self._lock)
        self._not_full = threading.Condition(self._lock)
        self._queue: list[RenderPacket] = []
        # In-flight = queued (not yet acquired) + acquired-but-not-released. The
        # producer blocks when in-flight would exceed capacity, so a 2-slot ring
        # lets the producer be exactly one frame ahead of the consumer.
        self._in_flight = 0
        self._closed = False
    @property
    def capacity(self) -> int:
        return self._capacity
    @property
    def closed(self) -> bool:
        with self._lock:
            return self._closed
    def submit(self, packet: RenderPacket) -> None:
        """Producer: enqueue a packet, blocking while the ring is full.

        Raises ``RuntimeError`` if the ring has been closed.
        """
        with self._not_full:
            while self._in_flight >= self._capacity and not self._closed:
                self._not_full.wait()
            if self._closed:
                raise RuntimeError("submit on closed RenderPacketRing")
            self._queue.append(packet)
            self._in_flight += 1
            self._not_empty.notify()
    def acquire(self, timeout: float | None = None) -> RenderPacket | None:
        """Consumer: dequeue the next packet, blocking until one is available.

        Returns ``None`` if the ring is closed and drained (consumer should
        exit), or if ``timeout`` elapses with no packet.
        """
        with self._not_empty:
            while not self._queue and not self._closed:
                if not self._not_empty.wait(timeout):
                    return None
            if self._queue:
                return self._queue.pop(0)
            # Closed and drained.
            return None
    def release(self) -> None:
        """Consumer: signal that the most recently acquired packet's GPU frame is done.

        Frees one in-flight slot so the producer may run ahead again.
        """
        with self._not_full:
            if self._in_flight > 0:
                self._in_flight -= 1
            self._not_full.notify()
    def close(self) -> None:
        """Mark the ring closed and wake all blocked threads for clean shutdown."""
        with self._lock:
            self._closed = True
            self._not_empty.notify_all()
            self._not_full.notify_all()
    def pending(self) -> int:
        """Number of packets queued but not yet acquired (diagnostics/tests)."""
        with self._lock:
            return len(self._queue)