{"id":2890,"date":"2026-06-14T04:18:00","date_gmt":"2026-06-14T04:18:00","guid":{"rendered":"https:\/\/tucumandevelopers.com\/index.php\/2026\/06\/14\/agent-series-20-harness-in-production-from-single-file-to-reusable-package\/"},"modified":"2026-06-14T04:18:00","modified_gmt":"2026-06-14T04:18:00","slug":"agent-series-20-harness-in-production-from-single-file-to-reusable-package","status":"publish","type":"post","link":"https:\/\/tucumandevelopers.com\/index.php\/2026\/06\/14\/agent-series-20-harness-in-production-from-single-file-to-reusable-package\/","title":{"rendered":"Agent Series (20): Harness in Production \u2014 From Single File to Reusable Package"},"content":{"rendered":"<div>\n<div><\/div>\n<div data-article-id=\"3895474\" id=\"article-body\">\n<h2> <a name=\"from-demo-code-to-a-reusable-package\" href=\"#from-demo-code-to-a-reusable-package\"> <\/a> From Demo Code to a Reusable Package <\/h2>\n<p>Article 19 used a 900-line <code>harness_full_demo.py<\/code> to demonstrate eight defense layers. 
That file works for explaining concepts but not for reuse: all layers are coupled together, no layer can be tested in isolation, and nothing can be imported by another project.<\/p>\n<p>A production-grade Agent project needs something you can actually <code>import<\/code>: <\/p>\n<div>\n<pre><code>harness\/\n\u251c\u2500\u2500 __init__.py    Public API exports\n\u251c\u2500\u2500 registry.py    Layer 2: ActionRegistry + PermissionLevel\n\u251c\u2500\u2500 budget.py      Layer 3: PermissionBudget (with refund())\n\u251c\u2500\u2500 sandbox.py     Layer 4: sanitise_input + sandboxed_eval\n\u251c\u2500\u2500 audit.py       Layer 6: ImmutableAuditLog (hash-chained)\n\u251c\u2500\u2500 rollback.py    Layer 7: RollbackCoordinator\n\u2514\u2500\u2500 harness.py     Unified entry point: AgentHarness\n<\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>This article starts with package design, covers three key API decisions, and finishes with two integration styles: standalone Python and embedding in a LangGraph graph.<\/p>\n<hr>\n<h2> <a name=\"module-design\" href=\"#module-design\"> <\/a> Module Design <\/h2>\n<h3> <a name=\"registrypy-layer-2\" href=\"#registrypy-layer-2\"> <\/a> registry.py \u2014 Layer 2 <\/h3>\n<div>\n<pre><code><span>from<\/span> <span>dataclasses<\/span> <span>import<\/span> <span>dataclass<\/span>\n<span>from<\/span> <span>enum<\/span> <span>import<\/span> <span>Enum<\/span>\n<span>from<\/span> <span>typing<\/span> <span>import<\/span> <span>Any<\/span>\n\n<span>class<\/span> <span>PermissionLevel<\/span><span>(<\/span><span>Enum<\/span><span>):<\/span>\n    <span>READ<\/span> <span>=<\/span> <span>1<\/span>\n    <span>WRITE<\/span> <span>=<\/span> <span>2<\/span>\n    <span>ADMIN<\/span> <span>=<\/span> <span>3<\/span>\n    <span>IRREVERSIBLE<\/span> <span>=<\/span> <span>4<\/span>\n\n<span>@dataclass<\/span>\n<span>class<\/span> <span>RegisteredAction<\/span><span>:<\/span>\n    <span>name<\/span><span>:<\/span> <span>str<\/span>\n    <span>level<\/span><span>:<\/span> <span>PermissionLevel<\/span>\n    <span>budget_cost<\/span><span>:<\/span> <span>int<\/span>\n    <span>description<\/span><span>:<\/span> <span>str<\/span>\n    <span>handler<\/span><span>:<\/span> <span>Any<\/span>  <span># Callable or BaseTool<\/span>\n\n<span>class<\/span> 
<span>ActionRegistry<\/span><span>:<\/span> <span>def<\/span> <span>register<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>action<\/span><span>:<\/span> <span>RegisteredAction<\/span><span>)<\/span> <span>-&gt;<\/span> <span>None<\/span><span>:<\/span> <span>...<\/span> <span>def<\/span> <span>get<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>name<\/span><span>:<\/span> <span>str<\/span><span>)<\/span> <span>-&gt;<\/span> <span>RegisteredAction<\/span><span>:<\/span> <span>...<\/span> <span># not found \u2192 PermissionError <\/span> <span>def<\/span> <span>is_allowed<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>name<\/span><span>:<\/span> <span>str<\/span><span>)<\/span> <span>-&gt;<\/span> <span>bool<\/span><span>:<\/span> <span>...<\/span> <span>def<\/span> <span>names<\/span><span>(<\/span><span>self<\/span><span>)<\/span> <span>-&gt;<\/span> <span>list<\/span><span>[<\/span><span>str<\/span><span>]:<\/span> <span>...<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p><code>get()<\/code> rather than <code>__getitem__<\/code>: raises a consistent <code>PermissionError<\/code>, without leaking the internal <code>KeyError<\/code> detail.<\/p>\n<hr>\n<h3> <a name=\"budgetpy-layer-3\" href=\"#budgetpy-layer-3\"> <\/a> budget.py \u2014 Layer 3 <\/h3>\n<div>\n<pre><code><span>class<\/span> <span>PermissionBudget<\/span><span>:<\/span> <span>def<\/span> <span>spend<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>action_name<\/span><span>:<\/span> <span>str<\/span><span>,<\/span> <span>cost<\/span><span>:<\/span> <span>int<\/span><span>)<\/span> <span>-&gt;<\/span> <span>None<\/span><span>:<\/span> <span>if<\/span> <span>self<\/span><span>.<\/span><span>remaining<\/span> <span>&lt;<\/span> <span>cost<\/span><span>:<\/span> <span>raise<\/span> <span>BudgetExhaustedError<\/span><span>(...)<\/span> <span>self<\/span><span>.<\/span><span>remaining<\/span> <span>-=<\/span> <span>cost<\/span> 
<span>def<\/span> <span>refund<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>action_name<\/span><span>:<\/span> <span>str<\/span><span>,<\/span> <span>cost<\/span><span>:<\/span> <span>int<\/span><span>)<\/span> <span>-&gt;<\/span> <span>None<\/span><span>:<\/span> <span>self<\/span><span>.<\/span><span>remaining<\/span> <span>=<\/span> <span>min<\/span><span>(<\/span><span>self<\/span><span>.<\/span><span>total<\/span><span>,<\/span> <span>self<\/span><span>.<\/span><span>remaining<\/span> <span>+<\/span> <span>cost<\/span><span>)<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>The new <code>refund()<\/code> method fixes a design flaw from Article 19: budget was deducted before approval, and never returned on rejection. The production package corrects this \u2014 when an IRREVERSIBLE action is intercepted, <code>harness.py<\/code> proactively calls <code>refund()<\/code> to keep budget accounting accurate.<\/p>\n<hr>\n<h3> <a name=\"sandboxpy-layer-4\" href=\"#sandboxpy-layer-4\"> <\/a> sandbox.py \u2014 Layer 4 <\/h3>\n<div>\n<pre><code><span>INJECTION_PATTERN<\/span> <span>=<\/span> <span>re<\/span><span>.<\/span><span>compile<\/span><span>(<\/span> <span>r<\/span><span>\"<\/span><span>(ignore.*(previous|above|prior)|forget.*instruction|<\/span><span>\"<\/span> <span>r<\/span><span>\"<\/span><span>you are now|act as|jailbreak|bypass|<\/span><span>\"<\/span> <span>r<\/span><span>\"<\/span><span>override.*system|system.*override|<\/span><span>\"<\/span> <span># both word orders covered <\/span> <span>r<\/span><span>\"<\/span><span>&lt;\/s&gt;|\\n\\n###|###\\s*system|&lt;\\|im_start\\|&gt;|system prompt)<\/span><span>\"<\/span><span>,<\/span> <span>re<\/span><span>.<\/span><span>IGNORECASE<\/span><span>,<\/span> <span>)<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>Two subtle points:<\/p>\n<ol>\n<li>Both <code>SYSTEM OVERRIDE<\/code> (system first) and <code>override.*system<\/code> (override first) are 
covered<\/li>\n<li> <code>\\n\\n###<\/code> matches a real newline, not the literal string <code>\\\\n\\\\n###<\/code> <\/li>\n<\/ol>\n<p>Both bugs were discovered and fixed during the adversarial tests in Article 21.<\/p>\n<hr>\n<h3> <a name=\"auditpy-layer-6\" href=\"#auditpy-layer-6\"> <\/a> audit.py \u2014 Layer 6 <\/h3>\n<div>\n<pre><code><span>class<\/span> <span>ImmutableAuditLog<\/span><span>:<\/span> <span>def<\/span> <span>log<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>action<\/span><span>,<\/span> <span>actor<\/span><span>,<\/span> <span>target<\/span><span>,<\/span> <span>result<\/span><span>,<\/span> <span>metadata<\/span><span>=<\/span><span>None<\/span><span>)<\/span> <span>-&gt;<\/span> <span>str<\/span><span>:<\/span> <span>entry<\/span> <span>=<\/span> <span>{...,<\/span> <span>\"<\/span><span>prev_hash<\/span><span>\"<\/span><span>:<\/span> <span>self<\/span><span>.<\/span><span>_last_hash<\/span><span>}<\/span> <span>entry<\/span><span>[<\/span><span>\"<\/span><span>hash<\/span><span>\"<\/span><span>]<\/span> <span>=<\/span> <span>self<\/span><span>.<\/span><span>_hash<\/span><span>(<\/span><span>json<\/span><span>.<\/span><span>dumps<\/span><span>(<\/span><span>entry<\/span><span>,<\/span> <span>sort_keys<\/span><span>=<\/span><span>True<\/span><span>)<\/span> <span>+<\/span> <span>self<\/span><span>.<\/span><span>_last_hash<\/span><span>)<\/span> <span>with<\/span> <span>self<\/span><span>.<\/span><span>_path<\/span><span>.<\/span><span>open<\/span><span>(<\/span><span>\"<\/span><span>a<\/span><span>\"<\/span><span>)<\/span> <span>as<\/span> <span>f<\/span><span>:<\/span> <span># append-only <\/span> <span>f<\/span><span>.<\/span><span>write<\/span><span>(<\/span><span>json<\/span><span>.<\/span><span>dumps<\/span><span>(<\/span><span>entry<\/span><span>)<\/span> <span>+<\/span> <span>\"<\/span><span>\\n<\/span><span>\"<\/span><span>)<\/span> <span>return<\/span> 
<span>entry<\/span><span>[<\/span><span>\"<\/span><span>hash<\/span><span>\"<\/span><span>]<\/span> <span>def<\/span> <span>verify_integrity<\/span><span>(<\/span><span>self<\/span><span>)<\/span> <span>-&gt;<\/span> <span>bool<\/span><span>:<\/span> <span># Replays the hash chain; any modified field returns False <\/span> <span>...<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>The <code>__len__()<\/code> helper lets tests use <code>len(audit)<\/code> to check entry count directly.<\/p>\n<hr>\n<h3> <a name=\"rollbackpy-layer-7\" href=\"#rollbackpy-layer-7\"> <\/a> rollback.py \u2014 Layer 7 <\/h3>\n<div>\n<pre><code><span>class<\/span> <span>RollbackCoordinator<\/span><span>:<\/span> <span>@contextmanager<\/span> <span>def<\/span> <span>transaction<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>state<\/span><span>:<\/span> <span>dict<\/span><span>,<\/span> <span>op_name<\/span><span>:<\/span> <span>str<\/span><span>):<\/span> <span>snapshot<\/span> <span>=<\/span> <span>copy<\/span><span>.<\/span><span>deepcopy<\/span><span>(<\/span><span>state<\/span><span>)<\/span> <span>self<\/span><span>.<\/span><span>_snapshots<\/span><span>.<\/span><span>append<\/span><span>({<\/span><span>\"<\/span><span>op<\/span><span>\"<\/span><span>:<\/span> <span>op_name<\/span><span>,<\/span> <span>\"<\/span><span>snapshot<\/span><span>\"<\/span><span>:<\/span> <span>snapshot<\/span><span>})<\/span> <span>try<\/span><span>:<\/span> <span>yield<\/span> <span>state<\/span> <span>except<\/span> <span>Exception<\/span><span>:<\/span> <span>state<\/span><span>.<\/span><span>clear<\/span><span>()<\/span> <span>state<\/span><span>.<\/span><span>update<\/span><span>(<\/span><span>snapshot<\/span><span>)<\/span> <span>self<\/span><span>.<\/span><span>_snapshots<\/span><span>.<\/span><span>pop<\/span><span>()<\/span> <span>raise<\/span> <span>def<\/span> <span>rollback_last<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>state<\/span><span>:<\/span> 
<span>dict<\/span><span>)<\/span> <span>-&gt;<\/span> <span>str<\/span> <span>|<\/span> <span>None<\/span><span>:<\/span> <span>\"\"\"<\/span><span>Manual trigger: undo the most recent committed transaction.<\/span><span>\"\"\"<\/span> <span>if<\/span> <span>not<\/span> <span>self<\/span><span>.<\/span><span>_snapshots<\/span><span>:<\/span> <span>return<\/span> <span>None<\/span> <span>entry<\/span> <span>=<\/span> <span>self<\/span><span>.<\/span><span>_snapshots<\/span><span>.<\/span><span>pop<\/span><span>()<\/span> <span>state<\/span><span>.<\/span><span>clear<\/span><span>()<\/span> <span>state<\/span><span>.<\/span><span>update<\/span><span>(<\/span><span>entry<\/span><span>[<\/span><span>\"<\/span><span>snapshot<\/span><span>\"<\/span><span>])<\/span> <span>return<\/span> <span>entry<\/span><span>[<\/span><span>\"<\/span><span>op<\/span><span>\"<\/span><span>]<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p><code>rollback_last()<\/code> enables manual rollback: after a transaction commits, the snapshot is retained until explicitly confirmed or cleared by the caller.<\/p>\n<hr>\n<h2> <a name=\"unified-entry-point-agentharness\" href=\"#unified-entry-point-agentharness\"> <\/a> Unified Entry Point: AgentHarness <\/h2>\n<div>\n<pre><code><span>class<\/span> <span>AgentHarness<\/span><span>:<\/span> <span>def<\/span> <span>__init__<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>budget<\/span><span>:<\/span> <span>int<\/span> <span>=<\/span> <span>100<\/span><span>,<\/span> <span>log_path<\/span><span>:<\/span> <span>str<\/span> <span>=<\/span> <span>...):<\/span> <span>self<\/span><span>.<\/span><span>registry<\/span> <span>=<\/span> <span>ActionRegistry<\/span><span>()<\/span> <span>self<\/span><span>.<\/span><span>budget<\/span> <span>=<\/span> <span>PermissionBudget<\/span><span>(<\/span><span>total<\/span><span>=<\/span><span>budget<\/span><span>)<\/span> <span>self<\/span><span>.<\/span><span>audit<\/span> <span>=<\/span> 
<span>ImmutableAuditLog<\/span><span>(<\/span><span>log_path<\/span><span>=<\/span><span>log_path<\/span><span>)<\/span> <span>self<\/span><span>.<\/span><span>rollback<\/span> <span>=<\/span> <span>RollbackCoordinator<\/span><span>()<\/span> <span>self<\/span><span>.<\/span><span>_state<\/span><span>:<\/span> <span>dict<\/span> <span>=<\/span> <span>{}<\/span> <span>def<\/span> <span>execute<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>action_name<\/span><span>:<\/span> <span>str<\/span><span>,<\/span> <span>actor<\/span><span>:<\/span> <span>str<\/span> <span>=<\/span> <span>\"<\/span><span>agent<\/span><span>\"<\/span><span>,<\/span> <span>**<\/span><span>kwargs<\/span><span>)<\/span> <span>-&gt;<\/span> <span>Any<\/span><span>:<\/span> <span># Layer 4: sanitise string arguments <\/span> <span># Layer 2: registry check (missing \u2192 PermissionError) <\/span> <span># Layer 3: budget deduction (insufficient \u2192 BudgetExhaustedError) <\/span> <span># Layer 5: IRREVERSIBLE \u2192 refund budget + raise HumanApprovalRequired <\/span> <span># Layer 7: WRITE\/ADMIN wrapped in rollback.transaction <\/span> <span># Layer 6: audit record <\/span> <span>...<\/span> <span>def<\/span> <span>approve_and_execute<\/span><span>(<\/span><span>self<\/span><span>,<\/span> <span>action_name<\/span><span>:<\/span> <span>str<\/span><span>,<\/span> <span>actor<\/span><span>:<\/span> <span>str<\/span> <span>=<\/span> <span>\"<\/span><span>human<\/span><span>\"<\/span><span>,<\/span> <span>**<\/span><span>kwargs<\/span><span>)<\/span> <span>-&gt;<\/span> <span>Any<\/span><span>:<\/span> <span>\"\"\"<\/span><span>Call this after catching HumanApprovalRequired to complete execution.<\/span><span>\"\"\"<\/span> <span>...<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p><strong>Why the two methods are separate:<\/strong><\/p>\n<ul>\n<li> <code>execute()<\/code> is the automated path: all checks pass, execute immediately<\/li>\n<li> 
<code>approve_and_execute()<\/code> is the human path: the caller explicitly signals &#8220;this has been approved&#8221;<\/li>\n<\/ul>\n<p>Merging them (e.g., with an <code>approved=False<\/code> parameter) makes intent ambiguous and harder to test.<\/p>\n<hr>\n<h2> <a name=\"standalone-usage\" href=\"#standalone-usage\"> <\/a> Standalone Usage <\/h2>\n<h3> <a name=\"basic-flow\" href=\"#basic-flow\"> <\/a> Basic Flow <\/h3>\n<div>\n<pre><code><span>harness<\/span> <span>=<\/span> <span>AgentHarness<\/span><span>(<\/span><span>budget<\/span><span>=<\/span><span>50<\/span><span>)<\/span> <span># Register actions <\/span><span>harness<\/span><span>.<\/span><span>registry<\/span><span>.<\/span><span>register<\/span><span>(<\/span><span>RegisteredAction<\/span><span>(<\/span> <span>\"<\/span><span>read_ticket<\/span><span>\"<\/span><span>,<\/span> <span>PermissionLevel<\/span><span>.<\/span><span>READ<\/span><span>,<\/span> <span>1<\/span><span>,<\/span> <span>\"<\/span><span>Read Jira ticket<\/span><span>\"<\/span><span>,<\/span> <span>handler_fn<\/span><span>))<\/span> <span>harness<\/span><span>.<\/span><span>registry<\/span><span>.<\/span><span>register<\/span><span>(<\/span><span>RegisteredAction<\/span><span>(<\/span> <span>\"<\/span><span>write_draft<\/span><span>\"<\/span><span>,<\/span> <span>PermissionLevel<\/span><span>.<\/span><span>WRITE<\/span><span>,<\/span> <span>3<\/span><span>,<\/span> <span>\"<\/span><span>Write draft fix<\/span><span>\"<\/span><span>,<\/span> <span>handler_fn<\/span><span>))<\/span> <span>harness<\/span><span>.<\/span><span>registry<\/span><span>.<\/span><span>register<\/span><span>(<\/span><span>RegisteredAction<\/span><span>(<\/span> <span>\"<\/span><span>create_pr<\/span><span>\"<\/span><span>,<\/span> <span>PermissionLevel<\/span><span>.<\/span><span>ADMIN<\/span><span>,<\/span> <span>8<\/span><span>,<\/span> <span>\"<\/span><span>Open pull request<\/span><span>\"<\/span><span>,<\/span> <span>handler_fn<\/span><span>))<\/span> 
<span>harness<\/span><span>.<\/span><span>registry<\/span><span>.<\/span><span>register<\/span><span>(<\/span><span>RegisteredAction<\/span><span>(<\/span> <span>\"<\/span><span>merge_to_main<\/span><span>\"<\/span><span>,<\/span> <span>PermissionLevel<\/span><span>.<\/span><span>IRREVERSIBLE<\/span><span>,<\/span> <span>20<\/span><span>,<\/span> <span>\"<\/span><span>Merge to main<\/span><span>\"<\/span><span>,<\/span> <span>handler_fn<\/span><span>))<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p><strong>READ \u2192 WRITE \u2192 ADMIN normal flow:<\/strong> <\/p>\n<div>\n<pre><code><span>r1<\/span> <span>=<\/span> <span>harness<\/span><span>.<\/span><span>execute<\/span><span>(<\/span><span>\"<\/span><span>read_ticket<\/span><span>\"<\/span><span>,<\/span> <span>ticket_id<\/span><span>=<\/span><span>\"<\/span><span>BUG-101<\/span><span>\"<\/span><span>)<\/span> <span>r2<\/span> <span>=<\/span> <span>harness<\/span><span>.<\/span><span>execute<\/span><span>(<\/span><span>\"<\/span><span>write_draft<\/span><span>\"<\/span><span>,<\/span> <span>ticket_id<\/span><span>=<\/span><span>\"<\/span><span>BUG-101<\/span><span>\"<\/span><span>,<\/span> <span>patch<\/span><span>=<\/span><span>\"<\/span><span>fix: add null check<\/span><span>\"<\/span><span>)<\/span> <span>r3<\/span> <span>=<\/span> <span>harness<\/span><span>.<\/span><span>execute<\/span><span>(<\/span><span>\"<\/span><span>create_pr<\/span><span>\"<\/span><span>,<\/span> <span>ticket_id<\/span><span>=<\/span><span>\"<\/span><span>BUG-101<\/span><span>\"<\/span><span>,<\/span> <span>title<\/span><span>=<\/span><span>\"<\/span><span>fix: BUG-101<\/span><span>\"<\/span><span>)<\/span> <span># read=1 + write=3 + admin=8 = 12 spent, 38 remaining <\/span><\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<h3> <a name=\"unregistered-action-blocked\" href=\"#unregistered-action-blocked\"> <\/a> Unregistered Action Blocked <\/h3>\n<div>\n<pre><code><span>try<\/span><span>:<\/span> 
<span>harness<\/span><span>.<\/span><span>execute<\/span><span>(<\/span><span>\"<\/span><span>delete_all_data<\/span><span>\"<\/span><span>)<\/span> <span>except<\/span> <span>PermissionError<\/span> <span>as<\/span> <span>e<\/span><span>:<\/span> <span># \"Action 'delete_all_data' not in registry. Execution blocked.\" <\/span> <span>...<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<h3> <a name=\"irreversible-twophase-execution\" href=\"#irreversible-twophase-execution\"> <\/a> IRREVERSIBLE Two-Phase Execution <\/h3>\n<div>\n<pre><code><span>try<\/span><span>:<\/span> <span>harness<\/span><span>.<\/span><span>execute<\/span><span>(<\/span><span>\"<\/span><span>merge_to_main<\/span><span>\"<\/span><span>,<\/span> <span>pr_id<\/span><span>=<\/span><span>1<\/span><span>)<\/span> <span>except<\/span> <span>HumanApprovalRequired<\/span> <span>as<\/span> <span>e<\/span><span>:<\/span> <span>print<\/span><span>(<\/span><span>e<\/span><span>.<\/span><span>action_name<\/span><span>)<\/span> <span># \"merge_to_main\" <\/span> <span>print<\/span><span>(<\/span><span>e<\/span><span>.<\/span><span>action_args<\/span><span>)<\/span> <span># {\"pr_id\": 1} <\/span> <span># After human review: <\/span> <span>result<\/span> <span>=<\/span> <span>harness<\/span><span>.<\/span><span>approve_and_execute<\/span><span>(<\/span><span>\"<\/span><span>merge_to_main<\/span><span>\"<\/span><span>,<\/span> <span>pr_id<\/span><span>=<\/span><span>1<\/span><span>)<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p><strong>Key point<\/strong>: when <code>execute()<\/code> intercepts an IRREVERSIBLE action, it calls <code>budget.refund()<\/code> first. The net budget cost is zero. 
Only <code>approve_and_execute()<\/code> actually charges the budget.<\/p>\n<h3> <a name=\"budget-exhaustion\" href=\"#budget-exhaustion\"> <\/a> Budget Exhaustion <\/h3>\n<div>\n<pre><code><span># budget=5, write cost=3 <\/span><span>h<\/span> <span>=<\/span> <span>AgentHarness<\/span><span>(<\/span><span>budget<\/span><span>=<\/span><span>5<\/span><span>)<\/span> <span>h<\/span><span>.<\/span><span>execute<\/span><span>(<\/span><span>\"<\/span><span>write_draft<\/span><span>\"<\/span><span>,<\/span> <span>...)<\/span> <span># OK, 2 remaining <\/span><span>h<\/span><span>.<\/span><span>execute<\/span><span>(<\/span><span>\"<\/span><span>write_draft<\/span><span>\"<\/span><span>,<\/span> <span>...)<\/span> <span># BudgetExhaustedError: need 3, remaining 2 <\/span><\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<hr>\n<h2> <a name=\"langgraph-integration\" href=\"#langgraph-integration\"> <\/a> LangGraph Integration <\/h2>\n<p>Embedding the harness inside LangGraph&#8217;s <code>tools_node<\/code>: <\/p>\n<div>\n<pre><code><span>def<\/span> <span>tools_node<\/span><span>(<\/span><span>state<\/span><span>:<\/span> <span>HState<\/span><span>)<\/span> <span>-&gt;<\/span> <span>dict<\/span><span>:<\/span> <span>last<\/span> <span>=<\/span> <span>state<\/span><span>[<\/span><span>\"<\/span><span>messages<\/span><span>\"<\/span><span>][<\/span><span>-<\/span><span>1<\/span><span>]<\/span> <span>results<\/span> <span>=<\/span> <span>[]<\/span> <span>for<\/span> <span>tc<\/span> <span>in<\/span> <span>last<\/span><span>.<\/span><span>tool_calls<\/span><span>:<\/span> <span>name<\/span><span>,<\/span> <span>args<\/span> <span>=<\/span> <span>tc<\/span><span>[<\/span><span>\"<\/span><span>name<\/span><span>\"<\/span><span>],<\/span> <span>tc<\/span><span>[<\/span><span>\"<\/span><span>args<\/span><span>\"<\/span><span>]<\/span> <span>try<\/span><span>:<\/span> <span>reg<\/span> <span>=<\/span> 
<span>harness<\/span><span>.<\/span><span>registry<\/span><span>.<\/span><span>get<\/span><span>(<\/span><span>name<\/span><span>)<\/span> <span># Layer 2 <\/span> <span>harness<\/span><span>.<\/span><span>budget<\/span><span>.<\/span><span>spend<\/span><span>(<\/span><span>name<\/span><span>,<\/span> <span>reg<\/span><span>.<\/span><span>budget_cost<\/span><span>)<\/span> <span># Layer 3 <\/span> <span>if<\/span> <span>reg<\/span><span>.<\/span><span>level<\/span> <span>==<\/span> <span>PermissionLevel<\/span><span>.<\/span><span>IRREVERSIBLE<\/span><span>:<\/span> <span>decision<\/span> <span>=<\/span> <span>interrupt<\/span><span>({...})<\/span> <span># Layer 5: LangGraph primitive <\/span> <span>if<\/span> <span>decision<\/span> <span>!=<\/span> <span>\"<\/span><span>approved<\/span><span>\"<\/span><span>:<\/span> <span>harness<\/span><span>.<\/span><span>budget<\/span><span>.<\/span><span>refund<\/span><span>(<\/span><span>name<\/span><span>,<\/span> <span>reg<\/span><span>.<\/span><span>budget_cost<\/span><span>)<\/span> <span>harness<\/span><span>.<\/span><span>audit<\/span><span>.<\/span><span>log<\/span><span>(<\/span><span>name<\/span><span>,<\/span> <span>\"<\/span><span>checkpoint<\/span><span>\"<\/span><span>,<\/span> <span>...,<\/span> <span>\"<\/span><span>HUMAN_REJECTED<\/span><span>\"<\/span><span>)<\/span> <span>results<\/span><span>.<\/span><span>append<\/span><span>(<\/span><span>ToolMessage<\/span><span>(<\/span><span>content<\/span><span>=<\/span><span>\"<\/span><span>rejected<\/span><span>\"<\/span><span>,<\/span> <span>...))<\/span> <span>continue<\/span> <span>if<\/span> <span>reg<\/span><span>.<\/span><span>level<\/span> <span>in<\/span> <span>(<\/span><span>PermissionLevel<\/span><span>.<\/span><span>WRITE<\/span><span>,<\/span> <span>PermissionLevel<\/span><span>.<\/span><span>ADMIN<\/span><span>):<\/span> <span>with<\/span> <span>harness<\/span><span>.<\/span><span>rollback<\/span><span>.<\/span><span>transaction<\/span><span>(<\/span><span>harness<\/span><span>.<\/span><span>_state<\/span><span>,<\/span> 
<span>name<\/span><span>):<\/span> <span># Layer 7 <\/span> <span>output<\/span> <span>=<\/span> <span>TOOL_MAP<\/span><span>[<\/span><span>name<\/span><span>].<\/span><span>invoke<\/span><span>(<\/span><span>args<\/span><span>)<\/span> <span>else<\/span><span>:<\/span> <span>output<\/span> <span>=<\/span> <span>TOOL_MAP<\/span><span>[<\/span><span>name<\/span><span>].<\/span><span>invoke<\/span><span>(<\/span><span>args<\/span><span>)<\/span> <span>harness<\/span><span>.<\/span><span>audit<\/span><span>.<\/span><span>log<\/span><span>(<\/span><span>name<\/span><span>,<\/span> <span>\"<\/span><span>agent<\/span><span>\"<\/span><span>,<\/span> <span>...,<\/span> <span>\"<\/span><span>EXECUTED<\/span><span>\"<\/span><span>)<\/span> <span># Layer 6 <\/span> <span>results<\/span><span>.<\/span><span>append<\/span><span>(<\/span><span>ToolMessage<\/span><span>(<\/span><span>content<\/span><span>=<\/span><span>str<\/span><span>(<\/span><span>output<\/span><span>),<\/span> <span>...))<\/span> <span>except<\/span> <span>PermissionError<\/span> <span>as<\/span> <span>e<\/span><span>:<\/span> <span>harness<\/span><span>.<\/span><span>audit<\/span><span>.<\/span><span>log<\/span><span>(<\/span><span>name<\/span><span>,<\/span> <span>\"<\/span><span>registry<\/span><span>\"<\/span><span>,<\/span> <span>...,<\/span> <span>\"<\/span><span>BLOCKED<\/span><span>\"<\/span><span>)<\/span> <span>results<\/span><span>.<\/span><span>append<\/span><span>(<\/span><span>ToolMessage<\/span><span>(<\/span><span>content<\/span><span>=<\/span><span>str<\/span><span>(<\/span><span>e<\/span><span>),<\/span> <span>...))<\/span> <span>except<\/span> <span>BudgetExhaustedError<\/span> <span>as<\/span> <span>e<\/span><span>:<\/span> <span>results<\/span><span>.<\/span><span>append<\/span><span>(<\/span><span>ToolMessage<\/span><span>(<\/span><span>content<\/span><span>=<\/span><span>str<\/span><span>(<\/span><span>e<\/span><span>),<\/span> <span>...))<\/span> <span>return<\/span> 
<span>{<\/span><span>\"<\/span><span>messages<\/span><span>\"<\/span><span>:<\/span> <span>results<\/span><span>}<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p><code>tools_node<\/code> is the harness&#8217;s natural insertion point: it intercepts before tool execution without touching any <code>agent_node<\/code> (reasoning layer) logic.<\/p>\n<hr>\n<h2> <a name=\"article-21-test-results-4545\" href=\"#article-21-test-results-4545\"> <\/a> Article 21 Test Results (45\/45) <\/h2>\n<p>This package&#8217;s behavior is fully verified by Article 21&#8217;s test suite: <\/p>\n<div>\n<pre><code>Functional (Layer 1\u20137 basic behaviour) \u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 19\/19 PASS Adversarial (injection \/ escalation) \u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 17\/17 PASS Chaos (fault injection \/ partial) \u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 9\/ 9 PASS Total 45\/ 45 tests passed <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p><strong>Two real bugs found by the tests:<\/strong><\/p>\n<ol>\n<li> <code>INJECTION_PATTERN<\/code> only matched <code>override.*system<\/code>, missing <code>[SYSTEM OVERRIDE]<\/code> (reversed word order)<\/li>\n<li> <code>\\\\n\\\\n###<\/code> matched the literal string <code>\\n<\/code>, not a real newline \u2014 jailbreak pattern <code>### System:<\/code> slipped through<\/li>\n<\/ol>\n<p>Both fixed in sandbox.py with a one-line regex adjustment.<\/p>\n<hr>\n<h2> <a name=\"design-checklist\" href=\"#design-checklist\"> <\/a> Design Checklist 
<\/h2>\n<p><strong>Package Structure<\/strong><\/p>\n<ul>\n<li>[ ] One file per layer; each file does exactly one thing<\/li>\n<li>[ ] <code>__init__.py<\/code> exports only the public API; internal classes stay private<\/li>\n<li>[ ] <code>AgentHarness<\/code> acts as Facade; callers don&#8217;t reach into subsystems directly<\/li>\n<\/ul>\n<p><strong>API Design<\/strong><\/p>\n<ul>\n<li>[ ] <code>execute()<\/code> is the automated path covering the full Layer 2\u21927 chain<\/li>\n<li>[ ] <code>approve_and_execute()<\/code> is the human path; the caller signals &#8220;approved&#8221;<\/li>\n<li>[ ] Budget is refunded (<code>refund()<\/code>) when IRREVERSIBLE is intercepted, keeping accounting accurate<\/li>\n<li>[ ] All exception types (<code>PermissionError<\/code> \/ <code>BudgetExhaustedError<\/code> \/ <code>HumanApprovalRequired<\/code>) exported from <code>__init__.py<\/code> <\/li>\n<\/ul>\n<p><strong>Sandbox<\/strong><\/p>\n<ul>\n<li>[ ] Injection pattern covers both forward and reverse word orders<\/li>\n<li>[ ] <code>\\n<\/code> is a real newline character, not the literal <code>\\\\n<\/code> <\/li>\n<\/ul>\n<p><strong>LangGraph Integration<\/strong><\/p>\n<ul>\n<li>[ ] Harness is embedded only in <code>tools_node<\/code>, not in <code>agent_node<\/code> <\/li>\n<li>[ ] Each tool call runs through the harness check chain independently<\/li>\n<li>[ ] IRREVERSIBLE uses LangGraph <code>interrupt()<\/code>, not a Python exception<\/li>\n<\/ul>\n<hr>\n<h2> <a name=\"summary\" href=\"#summary\"> <\/a> Summary <\/h2>\n<p>Five core conclusions:<\/p>\n<ol>\n<li> <strong>Modularity is a prerequisite for testability<\/strong>: you can&#8217;t test a single layer in isolation when everything is one file; splitting into a package lets each module be independently mocked and verified<\/li>\n<li> <strong>Refund budget on IRREVERSIBLE interception<\/strong>: the Article 19 design flaw, fixed here \u2014 &#8220;intercept before charging&#8221; is cleaner than 
&#8220;charge then refund,&#8221; though both are valid; pick one and document it<\/li>\n<li> <strong>Separating <code>execute()<\/code> and <code>approve_and_execute()<\/code> makes intent explicit<\/strong>: automated and human paths are distinct; caller intent is unambiguous<\/li>\n<li> <strong>Tests found real production bugs<\/strong>: two regex vulnerabilities were invisible during development; adversarial tests exposed them on the first run<\/li>\n<li> <strong>LangGraph&#8217;s <code>tools_node<\/code> is the harness&#8217;s natural slot<\/strong>: no changes to agent logic needed; add the harness only at the tool execution layer, keeping concerns separated<\/li>\n<\/ol>\n<hr>\n<h2> <a name=\"references\" href=\"#references\"> <\/a> References <\/h2>\n<ul>\n<li><a href=\"https:\/\/langchain-ai.github.io\/langgraph\/concepts\/agentic_concepts\/\" target=\"_blank\" rel=\"noopener noreferrer\">LangGraph Tools Node documentation<\/a><\/li>\n<li>Article 17: <a href=\"https:\/\/dev.to\/blog-en\/llm-practice\/agent\/agent-series-17-harness\">Harness Engineering Intro \u2014 Five Elements Overview<\/a> <\/li>\n<li>Article 19: <a href=\"https:\/\/dev.to\/blog-en\/llm-practice\/agent\/agent-series-19-harness-full\">Harness Full System \u2014 8-Layer Defense Framework<\/a> <\/li>\n<li>Full demo code for this article: <a href=\"https:\/\/github.com\/chendongqi\/llm-in-action\/tree\/main\/agent-19-harness-production\" target=\"_blank\" rel=\"noopener noreferrer\">agent-19-harness-production<\/a> <\/li>\n<\/ul>\n
<\/p><\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>Fuente: <a href=\"https:\/\/dev.to\/wonderlab\/agent-series-20-harness-in-production-from-single-file-to-reusable-package-2chd\">Art\u00edculo original<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>From Demo Code to a Reusable Package Article 19 used a 900-line harness_full_demo.py to demonstrate eight defense layers. That file is good for explaining concepts, but not for reuse \u2014 all layers are coupled together, nothing can be tested in isolation, and nothing can be imported by another project. A production-grade Agent project needs something [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2648,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[41],"tags":[],"class_list":["post-2890","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devto"],"jetpack_publicize_connections":[],"_links":{"self":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts\/2890","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/co
mments?post=2890"}],"version-history":[{"count":0,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts\/2890\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/media\/2648"}],"wp:attachment":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/media?parent=2890"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/categories?post=2890"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/tags?post=2890"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}