Winston · OpenClaw Audit

Critical Issues

4

Immediate action required

High Priority

10

Next sprint targets

Medium Priority

6

Planned improvement

Written

6/20

Fix docs authored

Implemented

5/20

Changes deployed

Audit Items

13 items

#	Problem	Description	Proposed Fix	Written	Implemented
01	Critical `server.py` monolith	Single file containing all logic — routing, tools, API calls, helpers. Impossible to maintain or test in isolation. Every edit touches the whole surface area.	Decompose into focused modules: `router.py`, `tools/`, `api/`, `helpers/`. One responsibility per module. Shared init handles wiring.	✗ Not yet	✗ No
02	Critical Duplicate DaVinci scripts	Near-identical DaVinci Resolve scripts scattered across the workspace. Bug fixes must be applied in N places — or they diverge silently over time.	Consolidate into a canonical library under `scripts/davinci/`. Shared utilities in `lib/`. Audit, verify, then delete all duplicates.	✗ Not yet	✗ No
03	Critical Hardcoded paths	Absolute paths like `C:\Users\mrpoo\Desktop\...` baked into scripts. Breaks on any machine change, path rename, or new environment. Silent failures when paths don't exist.	Move all paths to `.env` + central `config.py`. Read them with `Path(os.environ[...])` rather than `os.environ.get(...)`, which returns `None` for an unset variable and makes `Path` raise an unhelpful `TypeError`. Validate paths at startup with clear errors.	✗ Not yet	✗ No
04	Critical Silent pipeline failures	Render jobs, API calls, and file operations fail without raising errors or logging. Jobs appear to "complete" while producing nothing. Success reported without evidence.	Enforce output validation after every stage. All exceptions must bubble up. Structured logging with stage checkpoints. Never report success without tool-verified output.	✗ Not yet	✗ No
05	High Stale JSON files	Cached JSON state files (project manifests, clip reports, beat maps) accumulate and are never pruned. Old data feeds new decisions. Cache invalidation is manual and inconsistently applied.	Implement TTL-based invalidation on all JSON caches. Add a cache-bust flag to pipeline init. Prune stale files on session start. Document cache lifespan per file type.	✗ Not yet	✗ No
06	High Encoding bugs	UTF-8/CP1252 conflicts when reading or writing files with non-ASCII characters. PowerShell here-strings corrupt JS single quotes. Failures are intermittent and hard to reproduce.	Enforce `encoding="utf-8"` on all file I/O. Write complex JS/HTML via a Node.js file writer, never a PowerShell here-string. Add an encoding lint to pre-commit hooks.	✗ Not yet	✗ No
07	High Excessive context files	Too many context files loaded at session start. Deprecated stubs (IDENTITY.md, USER.md, HEARTBEAT.md, WHO.md) injecting dead weight. Stray temp files + audit artifacts in workspace root adding tokens every run.	✅ Shipped 2026-06-01: Rebuilt IDENTITY.md + USER.md with real content (OpenClaw standard names). HEARTBEAT.md restored as live checklist. OPERATING-SYSTEM.md archived to memory/. WHO.md deleted. 23 stray media/temp files purged. 3 stale analysis docs archived. ~15KB removed from root injection surface.	✓ Done	✓ Yes
08	Medium Token efficiency: verbose prompts	System prompts and workspace files contain redundant, repetitive instructions. Same rules stated in LAWS.md, MEMORY.md, SOUL.md, and OPERATING-SYSTEM-V2.md. Every duplicate costs tokens on every call.	✅ Partially shipped 2026-06-01: OPERATING-SYSTEM.md (post-mortem narrative) archived; OPERATING-SYSTEM-V2.md is now the sole source of truth (SOT). Deprecated pointer stubs replaced with canonical content. Next: deduplicate rules across MEMORY.md / SOUL.md / LAWS.md. Target: 40% context reduction.	✓ Done	✓ Partial
09	High MEMORY.md bloat	Active projects, SOP details, skill tables, and lessons learned all in one file — injected every turn. Most is reference material that should never be in hot context. Was ~12KB loaded on every single call.	✅ Shipped 2026-06-01: Split into `MEMORY.md` (evergreen rules, 3.8KB) + `PROJECTS.md` + `SKILLS-REGISTRY.md`. MEMORY.md reduced from 11.6KB → 3.8KB (67% reduction). Reference files load on demand only.	✓ Done	✓ Yes
10	High Duplicate rules across files	Stop protocol appears in SOUL.md, OPERATING-SYSTEM-V2.md, AGENTS.md, and LAWS.md. Gate rules appear in both SOUL.md and OS-V2. The same content is injected 3–4× every session, burning tokens on redundancy.	One source of truth per rule. OPERATING-SYSTEM-V2.md owns all operational rules. Other files get a one-line pointer. Eliminates ~30% of duplicated instruction content across bootstrap files.	✗ Not yet	✗ No
11	High VIDEO-PRODUCTION-SOP.md in root	17.8KB SOP injected every session regardless of whether video work is happening. Irrelevant on coding, design, and admin sessions — pure token waste on those runs.	✅ Shipped 2026-06-01: Moved to `memory/VIDEO-PRODUCTION-SOP.md`. Pointer added to MEMORY.md. No longer injected at startup. Saves ~17KB on every non-video session.	✓ Done	✓ Yes
12	Medium AGENTS.md size & duplication	AGENTS.md contains canonical protocols, file structure tables, group chat rules, and execution logging format — some of which duplicates OS-V2 and LAWS.md. Injected every session as a bootstrap file.	Trim to navigation-only (file map + startup order). Move rules to their SOT files. Target: reduce from ~3KB to <1KB. Rules live once; AGENTS.md just points to them.	✗ Not yet	✗ No
13	Medium No bootstrap file audit cadence	No recurring process to catch new file bloat before it accumulates again. Root workspace files grow silently between sessions. Today's cleanup will drift without enforcement.	Monthly cron job or `/audit` command: list workspace root files + sizes, flag any file >5KB or any new file not in approved bootstrap list. Auto-report to Telegram.	✗ Not yet	✗ No
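
The path handling proposed in item 03 could look roughly like the `config.py` below. This is a minimal sketch, not the project's actual configuration: the variable names `OPENCLAW_WORKSPACE` and `OPENCLAW_RENDER_DIR` are hypothetical, and it reads the process environment directly rather than parsing a `.env` file (loading `.env` would need a separate loader, which is not assumed here).

```python
import os
from pathlib import Path

# Hypothetical variable names; substitute whatever the real .env defines.
REQUIRED_PATH_VARS = ("OPENCLAW_WORKSPACE", "OPENCLAW_RENDER_DIR")


class ConfigError(RuntimeError):
    """Raised at startup when a required path is missing or invalid."""


def load_paths(env=None, required=REQUIRED_PATH_VARS):
    """Return {var_name: Path}, or raise ConfigError listing every problem."""
    env = os.environ if env is None else env
    paths, problems = {}, []
    for name in required:
        raw = env.get(name)
        if not raw:
            # Report unset variables explicitly instead of letting Path(None) fail.
            problems.append(f"{name} is not set")
            continue
        path = Path(raw).expanduser()
        if not path.exists():
            problems.append(f"{name} points to a missing path: {path}")
            continue
        paths[name] = path
    if problems:
        raise ConfigError("Invalid path configuration:\n  " + "\n  ".join(problems))
    return paths
```

Calling `load_paths()` once at startup surfaces every bad path in a single error, rather than failing one script at a time.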

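The bootstrap audit described in item 13 can be sketched as a small script that a monthly cron job or an `/audit` command might call. It applies the two checks named there (files over 5KB, files not on the approved list). The approved list below is a hypothetical placeholder built from file names mentioned in this audit, and delivery to Telegram is left out because it depends on how the bot is wired.

```python
from pathlib import Path

SIZE_LIMIT_BYTES = 5 * 1024  # the ">5KB" threshold from item 13

# Hypothetical approved list; the real one would mirror the bootstrap config.
APPROVED_BOOTSTRAP = {
    "AGENTS.md", "SOUL.md", "LAWS.md", "MEMORY.md",
    "IDENTITY.md", "USER.md", "HEARTBEAT.md",
}


def audit_workspace_root(root, approved=APPROVED_BOOTSTRAP, size_limit=SIZE_LIMIT_BYTES):
    """Return (filename, size_bytes, reasons) for every flagged file in root."""
    findings = []
    for path in sorted(Path(root).iterdir(), key=lambda p: p.name):
        if not path.is_file():
            continue  # only top-level files are injected, so skip directories
        size = path.stat().st_size
        reasons = []
        if size > size_limit:
            reasons.append("over size limit")
        if path.name not in approved:
            reasons.append("not in approved bootstrap list")
        if reasons:
            findings.append((path.name, size, reasons))
    return findings


def format_report(findings):
    """Render findings as plain text suitable for posting to a chat channel."""
    if not findings:
        return "Bootstrap audit: no issues found."
    lines = ["Bootstrap audit: flagged files"]
    for name, size, reasons in findings:
        lines.append(f"- {name} ({size / 1024:.1f} KB): {', '.join(reasons)}")
    return "\n".join(lines)
```

Keeping the checks separate from the report formatting means the same findings can feed a chat message, a log entry, or a failing exit code.
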
Operational Patterns & Behavioral Standards

7 items

#	Pattern	Current State	Target Behavior	Written	Active
P1	High Model selection discipline	Default model used for all tasks regardless of complexity. Heavy models (Opus) on simple tasks = unnecessary cost. Light tasks don't need full reasoning power.	✅ Implemented 2026-06-01: Defaulted to `cortex_proxy/claude-sonnet-4-6` for all sessions and sub-agents. Opus reserved for complex reasoning/architecture only. Config confirmed in `openclaw.json` under `agents.defaults`.	✓ Done	✓ Yes
P2	High Session hygiene — /new cadence	Sessions run long, context accumulates, cache hit rate drops, new token cost per turn rises. No clear trigger for when to start a fresh session.	Start `/new` when: (1) task type changes (e.g. production → housekeeping), (2) context hits 20%+ of window, (3) after any major deliverable ships. Check `/status` at session start to confirm cache baseline.	✗ Not yet	✗ No
P3	High Token spend visibility	No systematic check on token burn per session or per task type. Costs are invisible until they accumulate. No baseline to compare against after optimizations.	Run `/status` at session start and end. Log cache hit %, context size, and new tokens per session in `execution-log.md`. Establish baseline after each optimization sprint so improvements are measurable.	✗ Not yet	✗ No
P4	Medium Gateway config visibility	Compaction floor, session targets, context injection rules all set but never reviewed. Config drift happens silently. No single place to see "what is OpenClaw actually doing right now."	Quarterly gateway config review. Document key settings in ECOSYSTEM.md: compaction floor, bootstrap files list, channel configs, model defaults. Flag any setting that deviates from intent.	✗ Not yet	✗ No
P5	Medium Production vs. housekeeping session separation	Mixing production work (video, three-pagers) with housekeeping (file cleanup, audit) in the same session inflates context and muddies cache. Each task type loads different files.	Dedicated session types: production sessions start clean with project files loaded. Housekeeping sessions are separate. Never pivot mid-session from deep production to admin without `/new`.	✗ Not yet	✗ No
P6	Medium Continuation prompt standardization	No standard format for continuation prompts between sessions. Context that should carry forward relies on memory search rather than a structured handoff. Increases ramp-up cost each session.	Standard continuation prompt template: project + last state + next action + any blockers. Written to SESSION_HANDOFF.md at session close. Winston reads it first thing next session before any tool call.	✗ Not yet	✗ No
P7	High Gate enforcement — TSSC before every tool call	Gate (Tool, Skill, Scope, Cost) exists in SOUL.md and OS-V2 but enforcement is inconsistent. Skipped under time pressure or momentum. Violations lead to scope creep, wrong tools, and wasted tokens.	Gate is non-negotiable before every first tool call. Format: `TOOL` [name — why] · `SKILL` [path or “None”] · `SCOPE DO/DON’T` · `COST` ~Xk tokens. Citing a skill without a visible read call = violation. Chris can challenge any turn that skips it.	✓ Written	✓ Active
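
The continuation template in P6 (project, last state, next action, blockers) maps naturally onto a small data structure. The sketch below is one possible shape, assuming the handoff file lives in the workspace root; the `Handoff` class, its field names, and the exact markdown layout are illustrative choices, not an existing OpenClaw format.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Handoff:
    """The four fields named in P6; blockers may be empty."""
    project: str
    last_state: str
    next_action: str
    blockers: list = field(default_factory=list)

    def render(self):
        """Render the handoff as a short markdown note."""
        blockers = "\n".join(f"- {b}" for b in self.blockers) or "- None"
        return (
            "# Session Handoff\n\n"
            f"Project: {self.project}\n"
            f"Last state: {self.last_state}\n"
            f"Next action: {self.next_action}\n"
            f"Blockers:\n{blockers}\n"
        )


def write_handoff(handoff, workspace):
    """Overwrite SESSION_HANDOFF.md in the workspace and return its path."""
    path = Path(workspace) / "SESSION_HANDOFF.md"
    path.write_text(handoff.render(), encoding="utf-8")
    return path
```

Overwriting the file on every session close keeps it at a fixed, small size, so reading it first thing next session stays cheap.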

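For the token spend logging in P3, a sketch of the log entry might look like this. The output format of `/status` is not specified in this audit, so the function takes the three numbers as plain arguments and makes no attempt to parse command output; the line layout is an assumption.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_status_entry(phase, cache_hit_pct, context_tokens, new_tokens, when=None):
    """Build one log line; phase is 'start' or 'end'."""
    if phase not in ("start", "end"):
        raise ValueError("phase must be 'start' or 'end'")
    # Normalize to UTC so entries from different machines sort consistently.
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = when.strftime("%Y-%m-%d %H:%M")
    return (
        f"- {stamp} UTC | session {phase} | cache hit {cache_hit_pct:.0f}% | "
        f"context {context_tokens} tokens | new {new_tokens} tokens"
    )


def append_status(log_path, entry):
    """Append the entry to the log, creating the file if needed."""
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(entry + "\n")
```

One line per `/status` check keeps `execution-log.md` easy to scan and to compare against a baseline after each optimization sprint.
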
Implementation Progress

Workstream	Progress
server.py decomposition	0%
DaVinci script consolidation	0%
Hardcoded path removal	0%
Silent failure logging	0%
JSON cache invalidation	0%
Encoding standardization	0%
Context file pruning	80%
Token efficiency / prompt trim	35%
MEMORY.md split	100%
Rule deduplication	0%
SOP file relocation	100%
AGENTS.md trim	0%
Bootstrap audit cadence	0%
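
Of the workstreams still at 0%, JSON cache invalidation (item 05) is one of the more mechanical to start on. The sketch below shows TTL-based pruning with a cache-bust flag, as that item proposes. The filename suffixes and lifespans are hypothetical placeholders; real values would come from the per-file-type cache lifespans that item 05 says should be documented.

```python
import time
from pathlib import Path

# Hypothetical lifespans in seconds, keyed by filename suffix.
DEFAULT_TTLS = {
    "_manifest.json": 24 * 3600,
    "_clip_report.json": 6 * 3600,
    "_beat_map.json": 7 * 24 * 3600,
}


def prune_stale_json(cache_dir, ttls=DEFAULT_TTLS, now=None, bust=False):
    """Delete cached JSON files older than their TTL and return the removed paths.

    With bust=True, every file matching a known suffix is removed regardless of age.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(cache_dir).glob("*.json")):
        ttl = next((t for suffix, t in ttls.items() if path.name.endswith(suffix)), None)
        if ttl is None:
            continue  # unknown file type: leave it alone
        age = now - path.stat().st_mtime
        if bust or age > ttl:
            path.unlink()
            removed.append(path)
    return removed
```

Running `prune_stale_json(cache_dir)` at session start covers the "prune stale files" step, and passing `bust=True` from pipeline init covers the cache-bust flag.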