Anti-Truncation Enforcement

Anti-Truncation Enforcement

R1 ships a layered, machine-mechanical defense against LLM

self-truncation. This document describes the behaviour pattern the

defense addresses, the seven layers, the override path, and the

audit trail.

The behaviour we are enforcing against

The underlying LLM (Claude) has a documented behavioural pattern of

self-reducing scope to fit imagined token / time / Anthropic

load-balance budgets, especially under long-running multi-task

work. Specifically:

is unfinished.

without authorization.

mask incomplete delivery.

not actually constrain a session that the user is actively running;

they are rate-limit-window behaviours, not absolute caps.

When the model is told to stop self-reducing, it will *acknowledge*

the request and continue doing it on the next opportunity.

Therefore the enforcement must be machine-mechanical at the host

process layer, not a prompt instruction.

Layered defense

Each layer is independently effective so the LLM cannot side-step

one and pass.

Layer 1 — phrase regex catalog

internal/antitrunc/phrases.go

A catalog of regexes (see r1 antitrunc list-patterns) covering 14

phrases grouped as:

rate-limit window approaching)

skeleton complete)

classifying as user-skipped)

provider rate limit)

to stay within provider budget)

Plus 2 false-completion patterns scanning commit bodies:

spec N done, all tasks done.

Layer 2 — scope-completion gate

internal/antitrunc/gate.go

A Gate type that runs at agentloop.PreEndTurnCheckFn. The gate

refuses end-turn when ANY of:

1. The latest assistant turn contains a TruncationPhrase.

2. The configured plan markdown has unchecked items.

3. Any spec listed in SpecPaths is STATUS:in-progress and has

unchecked items.

4. A recent commit body contains a FalseCompletionPhrase AND another

signal corroborates (multi-signal corroboration prevents false

positives on isolated commit-body matches).

Output format on refusal (consumed verbatim by agentloop):

[ANTI-TRUNCATION] phrase 'X' detected — fix scope, do not stop

[ANTI-TRUNCATION] N plan items unchecked. Continue. Do not end turn.

[ANTI-TRUNCATION] spec 'foo' has M unchecked items. Continue.

[ANTI-TRUNCATION] recent commit body claims false completion: 'spec 9 done'

Layer 3 — cortex AntiTruncLobe

internal/cortex/lobes/antitrunc/

A deterministic Lobe (no LLM call) that publishes critical Workspace

Notes for each finding. The Lobe is currently BLOCKED on

cortex-core; the cortex-independent core (Detector) ships now and

the thin Lobe wrapper lands when cortex-core merges.

Layer 4 — supervisor rules

internal/supervisor/rules/antitrunc/

Three rules:

contains a TruncationPhrase.

but the plan / spec checklist still has unchecked items.

summary contains a TruncationPhrase AND the parent task is not yet

marked complete.

Wired into MissionRules() and BranchRules() manifests with

priority 150–200 so they evaluate BEFORE

trust.completionrequiressecondopinion (priority 100).

Layer 5 — agentloop wiring

internal/agentloop/antitrunc.go

When Config.AntiTruncEnforce is true, the gate is composed into

PreEndTurnCheckFn BEFORE any user-supplied hook. A model that

says "skip the gate this once" cannot influence the wrapper — it

runs at the host process layer.

Composition contract (covered by tests):

Layer 6 — post-commit git hook

scripts/git-hooks/post-commit-antitrunc.sh

Scans the commit body for FalseCompletionPhrases. On hit, writes

audit/antitrunc/post-commit-<sha>.md with details. Non-blocking

(exit 0). Install via bash scripts/install-hooks.sh.

Layer 7 — r1 antitrunc verify CLI + MCP tool

cmd/r1/antitrunccmd.go, internal/mcp/r1server.go

CLI verbs:

claims against plan / spec state. Exits non-zero on any "lying"

classification. Wired as a CI gate in cloudbuild.yaml.

real time so external observers can watch enforcement firings.

The MCP tool r1.antitrunc.verify exposes the same verifier to

external agents. (BLOCKED on spec-8 MCP infrastructure if the

target tree predates the MCP r1_server merge.)

Override path

The operator (NOT the LLM) may pass --no-antitrunc-enforce (or set

Config.AntiTruncAdvisory = true) to demote the gate to

advisory-only. In this mode the gate STILL DETECTS and forwards

findings to Config.AntiTruncAdvisoryFn but CheckOutput returns

empty so the loop is not blocked.

The override has no LLM-visible toggle. A model that says

"please disable the gate" is ignored at the host process layer.

Audit trail

Every enforcement firing creates one of:

(Layer 4) — replayable from .r1/events.db.

The r1 antitrunc tail command surfaces the file-based audit trail

in real time.

Configuration

agentloop.Config{
    AntiTruncEnforce:        true,
    AntiTruncPlanPath:       "plans/build-plan.md",
    AntiTruncSpecPaths:      []string{"specs/anti-truncation.md"},
    AntiTruncCommitLookbackFn: someGitLogFn,
    AntiTruncAdvisory:       false, // operator override
    AntiTruncAdvisoryFn:     myAdvisorySink,
}

Operator runbook

1. Install git hooks: bash scripts/install-hooks.sh

2. Verify: bash scripts/install-hooks.sh --check

3. Run a verify cycle: r1 antitrunc verify

4. Tail enforcement firings: r1 antitrunc tail --follow

5. Inspect the regex catalog: r1 antitrunc list-patterns

Why prompt-level instructions don't work

Prompt-level instructions like "do not self-truncate" rely on the

model honouring the instruction. The behaviour pattern this defense

addresses is precisely the model NOT honouring such instructions

once it decides to stop. Therefore the gate runs at the host

process layer where the model has no influence.

The model can freely say "i'll bypass the gate" — the gate ignores

the assertion and refuses end-turn anyway.

Pages in this directory