LLM NPC Flavour

a night guard whose barks come from a low-frequency LLM brain.

▶ Run in browser

Tags: ai llm

The classical, authoritative game logic runs every frame and is the source of truth: hp, ammo, and an alert flag drift / toggle in on_update and are written onto the agent’s blackboard. A child AgentNode runs an LLMBrain that, a few seconds apart and entirely off the frame thread, turns that authoritative slice into a single in-character line. A bottom-anchored HUD renders the latest bark every frame.

The LLM is a stage, not the NPC: while a call is in flight, or if it is late / dropped / fails, the HUD keeps the last good line and the game never stalls.

How to run

OFFLINE (default, no LLM, no network, no setup): a scripted fake client picks a canned bark for the situation, after a trivial awaitable that proves the off-thread path. This is what the screenshot walker and web export use.

uv run python examples/features/ai/llm_npc_flavour.py

LIVE: point it at any OpenAI-compatible chat endpoint (vLLM, llama.cpp, Ollama, hosted, etc.) by setting these environment variables, then pass –live. The client is built by OpenAICompatibleClient.from_env():

SIMVX_LLM_BASE_URL   required, e.g. http://host:8000/v1
SIMVX_LLM_MODEL      required, the model name the endpoint serves
SIMVX_LLM_API_KEY    optional, only if your endpoint needs a key

SIMVX_LLM_BASE_URL=http://host:8000/v1 SIMVX_LLM_MODEL=your-model         uv run python examples/features/ai/llm_npc_flavour.py --live

Live runs record responses to a local cache (CACHE_DIR) so a re-run is deterministic and free.

Controls: A toggles the alert state (watch the next bark change), Esc quits.

Source

  1"""LLM NPC Flavour: a night guard whose barks come from a low-frequency LLM brain.
  2
  3The classical, authoritative game logic runs every frame and is the source of
  4truth: hp, ammo, and an alert flag drift / toggle in ``on_update`` and are written
  5onto the agent's blackboard. A child ``AgentNode`` runs an ``LLMBrain`` that, a few
  6seconds apart and entirely off the frame thread, turns that authoritative slice
  7into a single in-character line. A bottom-anchored HUD renders the latest bark
  8every frame.
  9
 10The LLM is a stage, not the NPC: while a call is in flight, or if it is late /
 11dropped / fails, the HUD keeps the last good line and the game never stalls.
 12
 13## How to run
 14
 15OFFLINE (default, no LLM, no network, no setup): a scripted fake client picks a
 16canned bark for the situation, after a trivial awaitable that proves the
 17off-thread path. This is what the screenshot walker and web export use.
 18
 19    uv run python examples/features/ai/llm_npc_flavour.py
 20
 21LIVE: point it at any OpenAI-compatible chat endpoint (vLLM, llama.cpp, Ollama,
 22hosted, etc.) by setting these environment variables, then pass --live. The
 23client is built by `OpenAICompatibleClient.from_env()`:
 24
 25    SIMVX_LLM_BASE_URL   required, e.g. http://host:8000/v1
 26    SIMVX_LLM_MODEL      required, the model name the endpoint serves
 27    SIMVX_LLM_API_KEY    optional, only if your endpoint needs a key
 28
 29    SIMVX_LLM_BASE_URL=http://host:8000/v1 SIMVX_LLM_MODEL=your-model \
 30        uv run python examples/features/ai/llm_npc_flavour.py --live
 31
 32Live runs record responses to a local cache (CACHE_DIR) so a re-run is
 33deterministic and free.
 34
 35Controls: A toggles the alert state (watch the next bark change), Esc quits.
 36
 37# /// simvx
 38# tags = ["ai", "llm"]
 39# ///
 40"""
 41
 42from __future__ import annotations
 43
 44import asyncio
 45import math
 46import random
 47import re
 48import sys
 49
 50from simvx.ai import BARK_KEY, CachingClient, LLMBrain, OpenAICompatibleClient
 51from simvx.ai.client import LLMClient, LLMResponse
 52from simvx.core import AnchorPreset, Input, InputMap, Key, Label, Node2D
 53from simvx.core.ai import AgentNode
 54from simvx.graphics import App
 55
 56CACHE_DIR = "/tmp/simvx_llm_npc_cache"
 57
 58
 59class ScriptedGuardClient(LLMClient):
 60    """An offline fake: a trivial awaitable, then a canned bark for the situation.
 61
 62    This stands in for a real model so the demo runs with no network. It still
 63    exercises the full async path (the ``await`` runs on the AsyncSlot loop, never
 64    the frame), so the non-blocking / coalescing / degrade behaviour is identical.
 65    """
 66
 67    CALM = ["All quiet on the wall.", "Another slow shift.", "Nothing moving out there.", "Cold night. Stay sharp."]
 68    ALERT = ["Movement, north side!", "I heard something. Eyes up.", "Stay down, company.", "That's not the wind."]
 69    LOW_AMMO = ["Running low on rounds.", "Down to my last clip.", "Need a resupply soon."]
 70
 71    async def complete(self, messages, **kwargs) -> LLMResponse:
 72        await asyncio.sleep(0.08)  # simulate model latency, off the frame thread
 73        user = messages[-1]["content"].lower()
 74        # Read the exact ammo value (a substring check would match "ammo": 1 inside 12).
 75        ammo_match = re.search(r'"ammo":\s*(\d+)', user)
 76        ammo = int(ammo_match.group(1)) if ammo_match else 99
 77        if '"alert": true' in user:
 78            pool = self.ALERT
 79        elif ammo <= 2:
 80            pool = self.LOW_AMMO
 81        else:
 82            pool = self.CALM
 83        return LLMResponse(text=random.choice(pool))
 84
 85
 86class NightGuard(Node2D):
 87    """Authoritative classical state every frame; an LLMBrain only colours it."""
 88
 89    def __init__(self, client: LLMClient | None = None, **kwargs) -> None:
 90        super().__init__(**kwargs)
 91        # Default (no-arg) construction runs offline, so the screenshot walker and
 92        # web export (both of which instantiate the root with no args) get the
 93        # canned barks; main() passes a real client for --live.
 94        self._client = client if client is not None else ScriptedGuardClient()
 95        self.hp = 100.0
 96        self.ammo = 12
 97        self.alert = False
 98        self._t = 0.0
 99        self.agent: AgentNode | None = None
100        self.hud: Label | None = None
101        self.status: Label | None = None
102
103    def on_ready(self) -> None:
104        InputMap.add_action("toggle_alert", [Key.A])
105        InputMap.add_action("quit", [Key.ESCAPE])
106
107        # The child agent runs the LLM brain low-frequency (every 4s).
108        self.agent = AgentNode(
109            brain=LLMBrain(
110                self._client,
111                persona="a terse, tired night-watch guard",
112                facts=["hp", "ammo", "alert"],
113                period=4.0,
114            ),
115            name="GuardBrain",
116        )
117        self.add_child(self.agent)
118
119        # Bottom-anchored HUD (anchors + margins, never absolute position).
120        hud = Label("...", name="Bark")
121        hud.set_anchor_preset(AnchorPreset.BOTTOM_WIDE)
122        hud.margin_left = 20
123        hud.margin_right = 20
124        hud.margin_top = -64
125        hud.margin_bottom = -16
126        hud.font_size = 28.0
127        hud.alignment = "center"
128        self.add_child(hud)
129        self.hud = hud
130
131        title = Label("Night Guard  -  A: toggle alert   Esc: quit", name="Title")
132        title.set_anchor_preset(AnchorPreset.CENTER_TOP)
133        title.margin_left = -260
134        title.margin_right = 260
135        title.margin_top = 16
136        title.margin_bottom = 40
137        title.font_size = 18.0
138        title.alignment = "center"
139        self.add_child(title)
140
141        # Authoritative classical state, centred, updated every frame (the LLM never
142        # writes this: it only reads it to colour the bark below).
143        status = Label("", name="Status")
144        status.set_anchor_preset(AnchorPreset.CENTER)
145        status.margin_left = -260
146        status.margin_right = 260
147        status.margin_top = -20
148        status.margin_bottom = 20
149        status.font_size = 24.0
150        status.alignment = "center"
151        self.add_child(status)
152        self.status = status
153
154    def on_update(self, dt: float) -> None:
155        # Classical authoritative simulation, every single frame.
156        self._t += dt
157        self.hp = 60.0 + 40.0 * (0.5 + 0.5 * math.sin(self._t * 0.7))
158        if self._t % 2.0 < dt:
159            # Drain a round every couple of seconds, then resupply once empty, so the
160            # demo cycles through calm / low-ammo states (and barks) instead of draining flat.
161            self.ammo = self.ammo - 1 if self.ammo > 0 else 12
162        if Input.is_action_just_pressed("toggle_alert"):
163            self.alert = not self.alert
164        if Input.is_action_just_pressed("quit"):
165            self.app.quit()
166
167        # Publish the authoritative slice onto the agent's blackboard each frame.
168        board = self.agent.blackboard if self.agent else None
169        if board is not None:
170            board.set("hp", round(self.hp))
171            board.set("ammo", self.ammo)
172            board.set("alert", self.alert)
173
174        # Render the authoritative classical state (updates every frame) and the latest
175        # LLM bark (last good line if a call is in flight / failed).
176        if self.status is not None:
177            flag = "ALERT" if self.alert else "calm"
178            self.status.text = f"hp {round(self.hp)}    ammo {self.ammo}    {flag}"
179        if self.hud is not None and board is not None:
180            self.hud.text = str(board.get(BARK_KEY, "..."))
181
182
183def _build_client(live: bool) -> LLMClient:
184    if not live:
185        return ScriptedGuardClient()
186    # Record/replay so a re-run is deterministic and free.
187    return CachingClient(OpenAICompatibleClient.from_env(), CACHE_DIR, mode="auto")
188
189
190def main() -> None:
191    live = "--live" in sys.argv
192    app = App(title="SimVX LLM NPC Flavour", width=900, height=600)
193    app.run(NightGuard(_build_client(live), name="NightGuard"))
194
195
196if __name__ == "__main__":
197    main()