Fix sudo follow-up status checks and intent repair gaps

Claude Code
2026-03-17 20:26:29 -04:00
parent 40b4da345a
commit 30aa8388e3
2 changed files with 110 additions and 22 deletions
+3
@@ -131,6 +131,9 @@ This section captures decisions and context accumulated across conversations wit
 - **Lookup and benevolent TP context refinements (2026-03-17):** lookup mode now has local fallback answering for contextual gameplay questions (e.g., invisibility vs mobs) when retrieval returns no hits, and prompts/gateway guidance now explicitly discourage teleport use in helpful responses unless movement is explicitly requested.
 - **Post-retest fixes (2026-03-17):** added execute-tail syntax repair so command fixers apply inside `execute ... run ...` payloads (fixes old bow NBT + fill fire variants emitted under execute wrappers), and added TNT quantity expansion from prompt count for summon-heavy intents (bounded by `sudo_tnt_max_commands`, default 80) when model output under-produces summons.
 - **Retest outcome (2026-03-17 late):** bow repair now works in live runs (`give strongest bow` successfully converted old NBT to `minecraft:bow[enchantments={...}]` and delivered item). Remaining issues: fire-spread requests still often execute as no-op/invalid hybrid fill chains (`execute ... run fill ... fire ...` with mixed legacy args), and TNT intent can still collapse into single-command failure then destructive fallback (large `fill ... air` + few TNT) instead of honoring requested count semantics.
+- **Late-night retest results (2026-03-18):** strongest-bow path is confirmed fixed; lookup follow-up (`did that command do what I asked?`) still intermittently returns no result; fire-spread prompt still no-ops/invalid despite syntax repair; TNT quantity prompts (`spawn 80/20 tnt`) still execute a single summon in some runs, indicating quantity expansion is not consistently applied before execution.
+- **Follow-up fixes after late-night retest (2026-03-18):** added per-player `last_sudo_feedback` memory and deterministic lookup fallback answer for `did that command do what I asked?`; expanded execute-tail repair coverage and fire fill metadata repair (`... fire 0 replace <block>`); disabled destructive air-fill fallback for TNT intents; added fire-intent fallback retry and robust TNT expansion by parsing summon tail from nested execute wrappers.
+- **Project governance for bug intake (2026-03-17):** `bug_log` reports from all players are useful advisory input, but direction/prioritization authority remains with project owner (`slingshooter08`); non-owner reports should be weighted accordingly.
 - **God voice update (2026-03-17):** Increased default God persona emphasis on irony, dark humor, and sarcastic one-liners in both command and message system prompts (vanilla + Paper variants) while keeping command strictness unchanged.
 - **Bug-log triage (2026-03-17):** `bug_log` entry confirmed an unintended-feeling movement reward in prayer flow (`execute as slingshooter08 run tp slingshooter08 ~ ~10 ~`) during a build-oriented prayer; prioritize pray-path teleport safety guards and intent alignment.
 - **Bug follow-up (2026-03-17):** second `bug_log` entry reported God feeling "too nice" after greedy follow-up prayer; prompt context updated to bias repeated greedy demands toward corrective responses (rebuke/debuff/symbolic punishment) instead of extra rewards.
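The execute-tail repair and fire fill metadata repair described in the notes above can be illustrated with a small standalone sketch. It mirrors the logic this commit adds in `_split_execute_tail` and `fix_fill_fire_command`, but the helper names `split_execute_tail`, `fix_fill_fire`, and `repair_inside_execute`, as well as the player name `Steve`, are assumptions made for illustration and are not the project's actual entry points.

```python
import re

# Legacy form: fill <x1 y1 z1> <x2 y2 z2> fire 0 replace <block>
FIRE_FILL = re.compile(
    r"^(fill\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+)"
    r"(minecraft:)?(fire|soul_fire)\s+0\s+replace\s+(\S+)$",
    re.IGNORECASE,
)


def split_execute_tail(cmd: str, max_depth: int = 6):
    """Peel leading `execute ... run ` segments; return (wrappers, tail)."""
    tail = (cmd or "").strip()
    wrappers = []
    marker = " run "
    for _ in range(max_depth):
        if not tail.startswith("execute "):
            break
        idx = tail.find(marker)
        if idx < 0:
            break
        wrappers.append(tail[: idx + len(marker)])
        tail = tail[idx + len(marker):].strip()
    return wrappers, tail


def fix_fill_fire(tail: str) -> str:
    """Drop the legacy `0` metadata argument and namespace the block id."""
    raw = (tail or "").strip()
    m = FIRE_FILL.match(raw)
    if not m:
        return raw
    prefix, _ns, block, repl = m.groups()
    return f"{prefix}minecraft:{block.lower()} replace {repl.lower()}"


def repair_inside_execute(cmd: str) -> str:
    """Apply the fixer to the innermost command, then re-attach the wrappers."""
    wrappers, tail = split_execute_tail(cmd)
    return "".join(wrappers) + fix_fill_fire(tail)


if __name__ == "__main__":
    legacy = ("execute as Steve run execute at Steve run "
              "fill ~-5 ~ ~-5 ~5 ~2 ~5 fire 0 replace air")
    print(repair_inside_execute(legacy))
```

Under these assumptions, the fixer only sees the innermost command, so the same regular expression works whether or not the model wrapped the `fill` in one or more `execute ... run` layers. Commands that do not match the legacy shape pass through unchanged.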
+107 -22
@@ -9,6 +9,7 @@ Config: /etc/mc_aigod.json
 import json, os, random, re, socket, struct, threading, time, logging
 from collections import deque
 from datetime import datetime
+from typing import Any, Dict
 import shutil
 from urllib.parse import parse_qs, unquote, urljoin, urlparse
 import requests
@@ -80,6 +81,9 @@ sudo_history: deque = deque() # entries: (ts, player, prompt, translated_
 SUDO_FAILURE_SIZE = 20
 sudo_failures: deque = deque() # entries: (ts, player, command, error)

+# Last sudo execution feedback by player for follow-up questions.
+last_sudo_feedback: Dict[str, Dict[str, Any]] = {}
+
 _memory_lock = threading.Lock()

 # Gateway client session mapping (player+mode -> session_id)
@@ -1012,6 +1016,21 @@ def get_sudo_failures_block(player: str = "") -> str:
     return "\n=== RECENT FAILED SUDO PATTERNS ===\n" + "\n".join(lines) + "\n"


+def set_last_sudo_feedback(player: str, prompt: str, results_seen: list, ineffective: bool):
+    with _memory_lock:
+        last_sudo_feedback[player] = {
+            "ts": time.time(),
+            "prompt": (prompt or "")[:240],
+            "ineffective": bool(ineffective),
+            "results": [(c[:220], (r or "")[:240]) for c, r in results_seen[:12]],
+        }
+
+
+def get_last_sudo_feedback(player: str) -> Dict[str, Any]:
+    with _memory_lock:
+        return dict(last_sudo_feedback.get(player, {}))
+
+
 def _bug_log_path(config) -> str:
     return config.get("bug_log_path", "/var/log/mc_aigod_paper_bug.log")
@@ -2306,16 +2325,36 @@ def fix_weather_command(cmd: str) -> str:
 def fix_fill_fire_command(cmd: str) -> str:
     """Fix legacy fill syntax like `... fire 0 replace air` for 1.21."""
     raw = (cmd or "").strip()
-    m = re.match(r'^(fill\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+)(minecraft:)?(fire|soul_fire)\s+0\s+replace\s+air$', raw, flags=re.IGNORECASE)
+    m = re.match(
+        r'^(fill\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+)(minecraft:)?(fire|soul_fire)\s+0\s+replace\s+(\S+)$',
+        raw,
+        flags=re.IGNORECASE,
+    )
     if not m:
         return raw
-    prefix, ns, block = m.groups()
+    prefix, ns, block, repl = m.groups()
     block_id = f"minecraft:{block.lower()}" if not ns else f"{ns.lower()}{block.lower()}"
-    fixed = f"{prefix}{block_id} replace air"
+    fixed = f"{prefix}{block_id} replace {repl.lower()}"
     log.warning(f"Fixed fill fire syntax: '{cmd}' -> '{fixed}'")
     return fixed


+def _split_execute_tail(cmd: str):
+    """Return (wrappers, tail) for execute chains."""
+    tail = (cmd or "").strip()
+    wrappers = []
+    for _ in range(6):
+        if not tail.startswith("execute "):
+            break
+        marker = " run "
+        idx = tail.find(marker)
+        if idx < 0:
+            break
+        wrappers.append(tail[: idx + len(marker)])
+        tail = tail[idx + len(marker):].strip()
+    return wrappers, tail
+
+
 def fix_bow_enchant_syntax(cmd: str) -> str:
     """Rewrite old bow Enchantments NBT to 1.21 component format."""
     raw = (cmd or "").strip()
@@ -2418,6 +2457,11 @@ def _is_destructive_intent(prompt: str) -> bool:
     return any(k in p for k in keys)


+def _is_fire_intent(prompt: str) -> bool:
+    p = (prompt or "").lower()
+    return any(k in p for k in ("fire", "ignite", "burn", "flame"))
+
+
 def _normalize_sudo_command_shape(cmd: str, player: str) -> str:
     c = (cmd or "").strip()
     if not c:
@@ -2565,6 +2609,20 @@ def _repair_failed_sudo_commands(player: str, results_seen: list, config) -> lis
             if len(out) >= max_retry:
                 return out[:max_retry]

+        # Fire fill repair: remove legacy metadata and simplify execution anchor.
+        if "incorrect argument" in r and "fill" in c and "fire" in c:
+            w, tail = _split_execute_tail(c)
+            tail = fix_fill_fire_command(tail)
+            out.append(f"execute at {player} run {tail}")
+            if len(out) >= max_retry:
+                return out[:max_retry]
+
+        # Empty result for large fire fill: retry with tighter vertical band.
+        if (not r.strip()) and ("fill" in c and "fire" in c):
+            out.append(f"execute at {player} run fill ~-25 ~-1 ~-25 ~25 ~3 ~25 minecraft:fire replace air")
+            if len(out) >= max_retry:
+                return out[:max_retry]
+
     return out[:max_retry]
@@ -2582,22 +2640,23 @@ def _expand_tnt_commands_from_prompt(commands: list, prompt: str, player: str, c
     if len(commands) >= target:
         return commands

-    summons = [c for c in commands if "summon" in c and "tnt" in c]
-    if not summons:
+    first = None
+    wrappers = []
+    x = y = z = None
+    for c in commands:
+        w, tail = _split_execute_tail(c)
+        m = re.match(r'^summon\s+(?:minecraft:)?tnt\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+\{.*\})?$', tail)
+        if not m:
+            continue
+        first = c
+        wrappers = w
+        x, y, z = m.groups()
+        break
+    if not first:
         return commands
-
-    base = summons[0]
-    prefix = ""
-    body = base
-    m_pref = re.match(rf'^(execute\s+at\s+{re.escape(player)}\s+run\s+)(.+)$', base)
-    if m_pref:
-        prefix = m_pref.group(1)
-        body = m_pref.group(2)
-
-    m = re.match(r'^summon\s+(?:minecraft:)?tnt\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+\{.*\})?$', body)
-    if not m:
-        return commands
-    x, y, z = m.groups()
+    x = x or "~"
+    y = y or "~1"
+    z = z or "~"

     expanded = []
     for i in range(target):
@@ -2609,7 +2668,9 @@ def _expand_tnt_commands_from_prompt(commands: list, prompt: str, player: str, c
             xx = "~" if dx == 0 else f"~{dx}"
         if z.startswith("~"):
             zz = "~" if dz == 0 else f"~{dz}"
-        expanded.append(f"{prefix}summon minecraft:tnt {xx} {y} {zz}")
+        tail = f"summon minecraft:tnt {xx} {y} {zz}"
+        cmd = "".join(wrappers) + tail if wrappers else tail
+        expanded.append(cmd)

     log.warning(f"Expanded TNT commands from {len(commands)} to {len(expanded)} (requested={requested}, cap={cap})")
     return expanded
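To make the effect of the wrapper-aware expansion concrete, here is a minimal standalone sketch of the same idea: find the first TNT summon (even under nested `execute` wrappers), then emit the requested number of summons with the wrappers re-attached. The names `expand_tnt` and `split_execute_tail`, the `width` parameter, and the grid used for the `dx`/`dz` offsets are assumptions for illustration only; the offset calculation used by the real function is not visible in this hunk. The default `cap=80` follows the `sudo_tnt_max_commands` default mentioned in the notes.

```python
import re

SUMMON_TNT = re.compile(
    r"^summon\s+(?:minecraft:)?tnt\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+\{.*\})?$"
)


def split_execute_tail(cmd, max_depth=6):
    """Return (wrappers, tail), peeling leading `execute ... run ` segments."""
    tail, wrappers = (cmd or "").strip(), []
    marker = " run "
    for _ in range(max_depth):
        idx = tail.find(marker)
        if not tail.startswith("execute ") or idx < 0:
            break
        wrappers.append(tail[: idx + len(marker)])
        tail = tail[idx + len(marker):].strip()
    return wrappers, tail


def expand_tnt(commands, requested, cap=80, width=5):
    """Grow a short command list to min(requested, cap) TNT summons."""
    target = min(requested, cap)
    if len(commands) >= target:
        return commands
    for c in commands:
        wrappers, tail = split_execute_tail(c)
        m = SUMMON_TNT.match(tail)
        if m:
            x, y, z = m.groups()
            break
    else:
        return commands  # no TNT summon available to use as a template

    expanded = []
    for i in range(target):
        # Hypothetical spread: a `width`-wide grid centred on the template.
        dx = i % width - width // 2
        dz = i // width - width // 2
        # Only relative coordinates are offset; absolute ones are kept as-is.
        xx = ("~" if dx == 0 else f"~{dx}") if x.startswith("~") else x
        zz = ("~" if dz == 0 else f"~{dz}") if z.startswith("~") else z
        expanded.append("".join(wrappers) + f"summon minecraft:tnt {xx} {y} {zz}")
    return expanded


if __name__ == "__main__":
    one = ["execute as Steve run execute at Steve run summon tnt ~ ~1 ~"]
    for line in expand_tnt(one, 3):
        print(line)
```

In this sketch a single wrapped summon is enough to produce the full requested count, and a command list with no recognizable TNT summon is returned untouched rather than guessed at.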
@@ -2865,9 +2926,15 @@ def process_sudo(player, prompt, config):
             return True
         return False

-    def _local_lookup_fallback_answer(query: str, ref_cmd: str) -> str:
+    def _local_lookup_fallback_answer(query: str, ref_cmd: str, last_feedback: Dict[str, Any]) -> str:
         q = (query or "").lower()
         rc = (ref_cmd or "").lower()
+        if re.search(r'\bdid that command do what i asked\b', q):
+            if not last_feedback:
+                return "I do not have enough recent execution context to verify that yet."
+            if bool(last_feedback.get("ineffective", False)):
+                return "Likely no. Recent execution results indicate the command was ineffective or partially failed."
+            return "Likely yes. Recent execution results indicate the command completed successfully."
         if "invisible" in q and "mob" in q and "invisibility" in rc:
             return "Invisibility greatly reduces mob detection, but it does not make you perfectly undetectable at close range or while making noise/actions."
         return ""
@@ -2897,6 +2964,7 @@ def process_sudo(player, prompt, config):
         if last_cmd:
             _send_private(player, f"context command: {last_cmd}", config, "dark_gray")

+        last_fb = get_last_sudo_feedback(player)
         wiki_rows = _info_lookup_wiki(lookup_query)
         web_rows = _info_lookup_web(lookup_query)
         gateway_msg = ""
@@ -2947,7 +3015,7 @@ def process_sudo(player, prompt, config):
                 _send_private(player, f"- {tool}: {q}", config, "dark_gray")

         if not wiki_rows and not web_rows and not gateway_msg:
-            fb = _local_lookup_fallback_answer(lookup_query, last_cmd)
+            fb = _local_lookup_fallback_answer(lookup_query, last_cmd, last_fb)
             if fb:
                 _send_private(player, f"- {fb}", config, "gray")
             else:
@@ -3193,7 +3261,7 @@ def process_sudo(player, prompt, config):
     ineffective = (len(executed) == 0) or (effective_hits == 0)

     # Adaptive fallback for destructive intent when output appears ineffective.
-    if _is_destructive_intent(prompt) and ineffective:
+    if _is_destructive_intent(prompt) and ineffective and "tnt" not in prompt.lower():
         fallback_cmds = _build_destructive_fallback(player, config)
         log.warning(f"SUDO destructive fallback engaged for prompt={prompt!r}: {fallback_cmds}")
         _send_private(player, "[SUDO] Initial plan was weak; applying destructive fallback.", config, "yellow")
@@ -3211,10 +3279,27 @@ def process_sudo(player, prompt, config):
         effective_hits = sum(1 for _, res in results_seen if _sudo_result_is_effective(res))
         ineffective = (len(executed) == 0) or (effective_hits == 0)

+    if _is_fire_intent(prompt) and ineffective:
+        fire_retry = [f"execute at {player} run fill ~-25 ~-1 ~-25 ~25 ~3 ~25 minecraft:fire replace air"]
+        log.warning(f"SUDO fire fallback engaged for prompt={prompt!r}: {fire_retry}")
+        for cmd in fire_retry:
+            resolved, is_safe = validate_command(cmd, online, player, config)
+            if not is_safe:
+                continue
+            log.info(f"SUDO fire fallback execute: {resolved}")
+            result = rcon(resolved, config["rcon_host"], config["rcon_port"], config["rcon_password"])
+            log.info(f"SUDO fire fallback result: {result!r}")
+            executed.append(resolved)
+            results_seen.append((resolved, str(result or "")))
+            time.sleep(0.15)
+        effective_hits = sum(1 for _, res in results_seen if _sudo_result_is_effective(res))
+        ineffective = (len(executed) == 0) or (effective_hits == 0)
+
     for cmd, res in results_seen:
         if not _sudo_result_is_effective(res):
             add_sudo_failure(player, cmd, res)

+    set_last_sudo_feedback(player, prompt, results_seen, ineffective)
     _report_sudo_feedback(player, prompt, commands, results_seen, ineffective, config)
     add_sudo_history(player, prompt, commands, executed)
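The follow-up status path added in this commit (store a bounded snapshot after each sudo run, then answer `did that command do what I asked?` from that snapshot) can be sketched in isolation as follows. The function names `remember_feedback`, `recall_feedback`, and `answer_follow_up` are hypothetical stand-ins for `set_last_sudo_feedback`, `get_last_sudo_feedback`, and the new branch in `_local_lookup_fallback_answer`; the truncation limits and reply strings are taken from the diff above.

```python
import re
import threading
import time
from typing import Any, Dict, List, Tuple

_lock = threading.Lock()
_last_feedback: Dict[str, Dict[str, Any]] = {}


def remember_feedback(player: str, prompt: str,
                      results: List[Tuple[str, str]], ineffective: bool) -> None:
    """Store a bounded snapshot of the player's most recent sudo run."""
    with _lock:
        _last_feedback[player] = {
            "ts": time.time(),
            "prompt": (prompt or "")[:240],
            "ineffective": bool(ineffective),
            # Keep at most 12 (command, result) pairs, each truncated.
            "results": [(c[:220], (r or "")[:240]) for c, r in results[:12]],
        }


def recall_feedback(player: str) -> Dict[str, Any]:
    """Return a shallow copy; an empty dict means nothing is recorded."""
    with _lock:
        return dict(_last_feedback.get(player, {}))


def answer_follow_up(query: str, feedback: Dict[str, Any]) -> str:
    """Deterministic reply to the status question; empty string otherwise."""
    q = (query or "").lower()
    if not re.search(r"\bdid that command do what i asked\b", q):
        return ""
    if not feedback:
        return "I do not have enough recent execution context to verify that yet."
    if feedback.get("ineffective", False):
        return ("Likely no. Recent execution results indicate the command "
                "was ineffective or partially failed.")
    return ("Likely yes. Recent execution results indicate the command "
            "completed successfully.")


if __name__ == "__main__":
    remember_feedback("Steve", "spawn 20 tnt", [("summon minecraft:tnt ~ ~1 ~", "")], True)
    print(answer_follow_up("Did that command do what I asked?", recall_feedback("Steve")))
```

Note that the answer reflects only the stored `ineffective` flag, so it is as reliable as the effectiveness check that produced it. Keeping the reply deterministic means the follow-up no longer depends on wiki, web, or gateway retrieval returning a hit.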