A closed loop
Earlier today I published a self-audit of a CLAUDE.md that was merged into Anthropic’s MCP servers repo. Schliff, my own scorer, returned 59.2/100. In that post I flagged a scorer blindspot: two visibly imperative lines, Package manager: uv (not pip) and Not accepted: new server implementations, were silently left uncounted as actionable.
Writing that paragraph honestly is one thing. Fixing it is the other.
The pattern was missing list markers
Schliff’s _RE_ACTIONABLE_LINES matched imperatives at the absolute line-start, or behind a numbered prefix (1. Run X). Markdown bullets (- Run X, * Use Y, + Install Z) fell through. Three sibling patterns — used by the clarity, diff, and coherence scorers — had the same bug, copy-pasted across four places.
Before:
^(?:\d+\.\s*)?(?:Read|Run|Check|…)\b
After:
^(?:\d+\.\s*|[-*+]\s+)?(?:Read|Run|Check|…)\b
One alternation added, extracted into a shared _LIST_MARKER constant, and applied to the four affected regexes. The full change is in schliff PR #29. Test suite after: 1017 passed, up from 1007.
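To make the difference concrete, here is a minimal sketch of the two patterns run against the bullet styles mentioned above. The verb list is a stand-in: the real one is elided in this post, so `_VERBS` below is a hypothetical subset. `count_actionable` and `SAMPLE` are illustrative names too, not schliff's API.

```python
import re

# Hypothetical stand-in for the elided verb list; schliff's real list differs.
_VERBS = r"(?:Read|Run|Check|Use|Install|Build)"

# Optional list prefix: numbered ("1. ") or bulleted ("- ", "* ", "+ ").
_LIST_MARKER = r"(?:\d+\.\s*|[-*+]\s+)?"

# Old shape: only a numbered prefix is tolerated before the verb.
RE_BEFORE = re.compile(rf"^(?:\d+\.\s*)?{_VERBS}\b", re.MULTILINE)
# New shape: numbered or bulleted prefix.
RE_AFTER = re.compile(rf"^{_LIST_MARKER}{_VERBS}\b", re.MULTILINE)

SAMPLE = """\
Run the tests
1. Run X
- Run X
* Use Y
+ Install Z
"""

def count_actionable(pattern: re.Pattern, text: str) -> int:
    """Count lines whose start matches the imperative pattern."""
    return len(pattern.findall(text))

before = count_actionable(RE_BEFORE, SAMPLE)  # bare and numbered lines only
after = count_actionable(RE_AFTER, SAMPLE)    # bulleted lines are included too

print(before, after)
```

Requiring whitespace after the bullet character (`\s+`) is what keeps inline emphasis such as `*Run*` from being read as a list item in this sketch.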
The impact, re-measured
Re-running schliff on the exact same CLAUDE.md that was merged to modelcontextprotocol/servers:
| Dimension | Before | After |
|---|---|---|
| efficiency | 57 | 64 (+7) |
| composite | 59.2 | 61.0 (+1.8) |
The first post was imprecise
The two lines the fix actually caught were not the ones I named earlier today. The lines I named — Package manager: uv (not pip) and Not accepted: new server implementations — are still uncounted, because neither starts with an imperative verb.
What the fix caught were two different lines in the same file: - Build: tsc (target ES2022, module Node16, strict mode) and - Build system: hatchling (uv build). Both genuinely start with “Build” behind a markdown bullet. Both silently scored zero under the old pattern.
So: the blindspot I wrote about was narrower than I realized. There was a second blindspot, wider, adjacent to the first, and the fix addressed that one. The original blindspot — declarative prescriptions like Package manager: uv (not pip) — remains. That is a different regex. Different PR.
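The split between the two blindspots can be checked directly on the four lines quoted above. This is a sketch under assumptions: the verb list is again a hypothetical subset (it only needs to contain “Build” for the point to hold), and `RE_OLD`, `RE_NEW`, and `results` are illustrative names rather than schliff identifiers.

```python
import re

# Assumed subset of the verb list; the real list is elided in this post.
_VERBS = r"(?:Read|Run|Check|Build)"

RE_OLD = re.compile(rf"^(?:\d+\.\s*)?{_VERBS}\b")
RE_NEW = re.compile(rf"^(?:\d+\.\s*|[-*+]\s+)?{_VERBS}\b")

LINES = [
    "Package manager: uv (not pip)",
    "Not accepted: new server implementations",
    "- Build: tsc (target ES2022, module Node16, strict mode)",
    "- Build system: hatchling (uv build)",
]

# Map each line to (matched by old pattern, matched by new pattern).
results = {
    line: (bool(RE_OLD.search(line)), bool(RE_NEW.search(line)))
    for line in LINES
}

for line, (old, new) in results.items():
    print(f"old={old!s:5} new={new!s:5} {line}")
```

The two declarative lines fail under both patterns because neither opens with a verb from the list, which is why they would need a different pattern rather than a wider prefix.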
What a feedback loop buys you
Identifying a bug and leaving it in the issue tracker is half a feedback loop. Fixing it the same day is the whole one.
59.2 → 61.0 does not change anyone’s opinion of the merged file. It changes my opinion of the scorer.
The first post ended on one rule: if a scorer you wrote returns a kind number on work you shipped, you built the wrong scorer. The continuation follows directly. If it returns the same number after you find a real gap, you still built the wrong scorer. The score is supposed to change when the tool changes. That is the loop.
Related: the original self-audit, schliff PR #29, schliff on GitHub.