LLM NPC Flavour¶
a night guard whose barks come from a low-frequency LLM brain.
▶ Run in browserTags: ai llm
The classical, authoritative game logic runs every frame and is the source of
truth: hp, ammo, and an alert flag drift / toggle in on_update and are written
onto the agent’s blackboard. A child AgentNode runs an LLMBrain that, a few
seconds apart and entirely off the frame thread, turns that authoritative slice
into a single in-character line. A bottom-anchored HUD renders the latest bark
every frame.
The LLM is a stage, not the NPC: while a call is in flight, or if it is late / dropped / fails, the HUD keeps the last good line and the game never stalls.
How to run¶
OFFLINE (default, no LLM, no network, no setup): a scripted fake client picks a canned bark for the situation, after a trivial awaitable that proves the off-thread path. This is what the screenshot walker and web export use.
uv run python examples/features/ai/llm_npc_flavour.py
LIVE: point it at any OpenAI-compatible chat endpoint (vLLM, llama.cpp, Ollama,
hosted, etc.) by setting these environment variables, then pass –live. The
client is built by OpenAICompatibleClient.from_env():
SIMVX_LLM_BASE_URL required, e.g. http://host:8000/v1
SIMVX_LLM_MODEL required, the model name the endpoint serves
SIMVX_LLM_API_KEY optional, only if your endpoint needs a key
SIMVX_LLM_BASE_URL=http://host:8000/v1 SIMVX_LLM_MODEL=your-model uv run python examples/features/ai/llm_npc_flavour.py --live
Live runs record responses to a local cache (CACHE_DIR) so a re-run is deterministic and free.
Controls: A toggles the alert state (watch the next bark change), Esc quits.
Source¶
1"""LLM NPC Flavour: a night guard whose barks come from a low-frequency LLM brain.
2
3The classical, authoritative game logic runs every frame and is the source of
4truth: hp, ammo, and an alert flag drift / toggle in ``on_update`` and are written
5onto the agent's blackboard. A child ``AgentNode`` runs an ``LLMBrain`` that, a few
6seconds apart and entirely off the frame thread, turns that authoritative slice
7into a single in-character line. A bottom-anchored HUD renders the latest bark
8every frame.
9
10The LLM is a stage, not the NPC: while a call is in flight, or if it is late /
11dropped / fails, the HUD keeps the last good line and the game never stalls.
12
13## How to run
14
15OFFLINE (default, no LLM, no network, no setup): a scripted fake client picks a
16canned bark for the situation, after a trivial awaitable that proves the
17off-thread path. This is what the screenshot walker and web export use.
18
19 uv run python examples/features/ai/llm_npc_flavour.py
20
21LIVE: point it at any OpenAI-compatible chat endpoint (vLLM, llama.cpp, Ollama,
22hosted, etc.) by setting these environment variables, then pass --live. The
23client is built by `OpenAICompatibleClient.from_env()`:
24
25 SIMVX_LLM_BASE_URL required, e.g. http://host:8000/v1
26 SIMVX_LLM_MODEL required, the model name the endpoint serves
27 SIMVX_LLM_API_KEY optional, only if your endpoint needs a key
28
29 SIMVX_LLM_BASE_URL=http://host:8000/v1 SIMVX_LLM_MODEL=your-model \
30 uv run python examples/features/ai/llm_npc_flavour.py --live
31
32Live runs record responses to a local cache (CACHE_DIR) so a re-run is
33deterministic and free.
34
35Controls: A toggles the alert state (watch the next bark change), Esc quits.
36
37# /// simvx
38# tags = ["ai", "llm"]
39# ///
40"""
41
42from __future__ import annotations
43
44import asyncio
45import math
46import random
47import re
48import sys
49
50from simvx.ai import BARK_KEY, CachingClient, LLMBrain, OpenAICompatibleClient
51from simvx.ai.client import LLMClient, LLMResponse
52from simvx.core import AnchorPreset, Input, InputMap, Key, Label, Node2D
53from simvx.core.ai import AgentNode
54from simvx.graphics import App
55
56CACHE_DIR = "/tmp/simvx_llm_npc_cache"
57
58
59class ScriptedGuardClient(LLMClient):
60 """An offline fake: a trivial awaitable, then a canned bark for the situation.
61
62 This stands in for a real model so the demo runs with no network. It still
63 exercises the full async path (the ``await`` runs on the AsyncSlot loop, never
64 the frame), so the non-blocking / coalescing / degrade behaviour is identical.
65 """
66
67 CALM = ["All quiet on the wall.", "Another slow shift.", "Nothing moving out there.", "Cold night. Stay sharp."]
68 ALERT = ["Movement, north side!", "I heard something. Eyes up.", "Stay down, company.", "That's not the wind."]
69 LOW_AMMO = ["Running low on rounds.", "Down to my last clip.", "Need a resupply soon."]
70
71 async def complete(self, messages, **kwargs) -> LLMResponse:
72 await asyncio.sleep(0.08) # simulate model latency, off the frame thread
73 user = messages[-1]["content"].lower()
74 # Read the exact ammo value (a substring check would match "ammo": 1 inside 12).
75 ammo_match = re.search(r'"ammo":\s*(\d+)', user)
76 ammo = int(ammo_match.group(1)) if ammo_match else 99
77 if '"alert": true' in user:
78 pool = self.ALERT
79 elif ammo <= 2:
80 pool = self.LOW_AMMO
81 else:
82 pool = self.CALM
83 return LLMResponse(text=random.choice(pool))
84
85
86class NightGuard(Node2D):
87 """Authoritative classical state every frame; an LLMBrain only colours it."""
88
89 def __init__(self, client: LLMClient | None = None, **kwargs) -> None:
90 super().__init__(**kwargs)
91 # Default (no-arg) construction runs offline, so the screenshot walker and
92 # web export (both of which instantiate the root with no args) get the
93 # canned barks; main() passes a real client for --live.
94 self._client = client if client is not None else ScriptedGuardClient()
95 self.hp = 100.0
96 self.ammo = 12
97 self.alert = False
98 self._t = 0.0
99 self.agent: AgentNode | None = None
100 self.hud: Label | None = None
101 self.status: Label | None = None
102
103 def on_ready(self) -> None:
104 InputMap.add_action("toggle_alert", [Key.A])
105 InputMap.add_action("quit", [Key.ESCAPE])
106
107 # The child agent runs the LLM brain low-frequency (every 4s).
108 self.agent = AgentNode(
109 brain=LLMBrain(
110 self._client,
111 persona="a terse, tired night-watch guard",
112 facts=["hp", "ammo", "alert"],
113 period=4.0,
114 ),
115 name="GuardBrain",
116 )
117 self.add_child(self.agent)
118
119 # Bottom-anchored HUD (anchors + margins, never absolute position).
120 hud = Label("...", name="Bark")
121 hud.set_anchor_preset(AnchorPreset.BOTTOM_WIDE)
122 hud.margin_left = 20
123 hud.margin_right = 20
124 hud.margin_top = -64
125 hud.margin_bottom = -16
126 hud.font_size = 28.0
127 hud.alignment = "center"
128 self.add_child(hud)
129 self.hud = hud
130
131 title = Label("Night Guard - A: toggle alert Esc: quit", name="Title")
132 title.set_anchor_preset(AnchorPreset.CENTER_TOP)
133 title.margin_left = -260
134 title.margin_right = 260
135 title.margin_top = 16
136 title.margin_bottom = 40
137 title.font_size = 18.0
138 title.alignment = "center"
139 self.add_child(title)
140
141 # Authoritative classical state, centred, updated every frame (the LLM never
142 # writes this: it only reads it to colour the bark below).
143 status = Label("", name="Status")
144 status.set_anchor_preset(AnchorPreset.CENTER)
145 status.margin_left = -260
146 status.margin_right = 260
147 status.margin_top = -20
148 status.margin_bottom = 20
149 status.font_size = 24.0
150 status.alignment = "center"
151 self.add_child(status)
152 self.status = status
153
154 def on_update(self, dt: float) -> None:
155 # Classical authoritative simulation, every single frame.
156 self._t += dt
157 self.hp = 60.0 + 40.0 * (0.5 + 0.5 * math.sin(self._t * 0.7))
158 if self._t % 2.0 < dt:
159 # Drain a round every couple of seconds, then resupply once empty, so the
160 # demo cycles through calm / low-ammo states (and barks) instead of draining flat.
161 self.ammo = self.ammo - 1 if self.ammo > 0 else 12
162 if Input.is_action_just_pressed("toggle_alert"):
163 self.alert = not self.alert
164 if Input.is_action_just_pressed("quit"):
165 self.app.quit()
166
167 # Publish the authoritative slice onto the agent's blackboard each frame.
168 board = self.agent.blackboard if self.agent else None
169 if board is not None:
170 board.set("hp", round(self.hp))
171 board.set("ammo", self.ammo)
172 board.set("alert", self.alert)
173
174 # Render the authoritative classical state (updates every frame) and the latest
175 # LLM bark (last good line if a call is in flight / failed).
176 if self.status is not None:
177 flag = "ALERT" if self.alert else "calm"
178 self.status.text = f"hp {round(self.hp)} ammo {self.ammo} {flag}"
179 if self.hud is not None and board is not None:
180 self.hud.text = str(board.get(BARK_KEY, "..."))
181
182
183def _build_client(live: bool) -> LLMClient:
184 if not live:
185 return ScriptedGuardClient()
186 # Record/replay so a re-run is deterministic and free.
187 return CachingClient(OpenAICompatibleClient.from_env(), CACHE_DIR, mode="auto")
188
189
190def main() -> None:
191 live = "--live" in sys.argv
192 app = App(title="SimVX LLM NPC Flavour", width=900, height=600)
193 app.run(NightGuard(_build_client(live), name="NightGuard"))
194
195
196if __name__ == "__main__":
197 main()