Let Your AP Clerks Do the Work That Needs a Human: ~/stubborncoder

CFO Living Examples, Part 2

A series exploring how AI agents change the daily reality of finance teams. Not theory: working systems you can try.

Where AP time actually goes

If you've ever sat next to an AP clerk you know the rhythm. Open invoice. Find the PO. Find the goods receipt. Check quantities, check prices. See a 7% variance. Open the email template. Fill in the PO number, the line, the amount, the correction request. Send. Next one.

None of that is judgment. It's pattern matching against structured data, and it's the part of the job that burns people out and causes mistakes. Worse, the invoices that actually matter get the same five minutes as everything else: the 7% variance on a strategic vendor, the cost center about to breach budget, the thing that looks like a duplicate but isn't.

The agent does the mechanical 80% end to end. Reads the invoice. Runs the 3-way match. Flags issues. Classifies them. Drafts the email or the memo. Routes the approval. Clean invoices go straight to the payment queue. Your clerks stop touching any of that and start reviewing what the agent put together.

How it makes decisions

The agent works in three layers, in order.

Layer 1: 3-way match. Invoice line vs PO line vs goods receipt. 2% price tolerance, exact on quantity.

Layer 2: Issue detection. Price mismatch, quantity mismatch, duplicate suspect, missing PO, missing receipt, budget threshold.

Layer 3: Responsibility. Vendor, internal, mixed, or informational. Drives who gets contacted.
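To make layer 1 concrete, here is a minimal sketch of a per-line 3-way match using the 2% price tolerance and exact quantity rule. The names (Line, three_way_match) and the choice to compare invoice, PO, and received quantities all at once are assumptions for illustration, not the demo's actual code.

```python
from dataclasses import dataclass
from decimal import Decimal

PRICE_TOLERANCE = Decimal("0.02")  # the 2% price tolerance from layer 1


@dataclass(frozen=True)
class Line:
    """One invoice or PO line (hypothetical shape)."""
    quantity: Decimal
    unit_price: Decimal


def three_way_match(invoice: Line, po: Line, received_qty: Decimal) -> dict:
    """Compare an invoice line against its PO line and goods receipt."""
    if po.unit_price == 0:
        raise ValueError("PO unit price must be non-zero to compute a variance")
    # Relative price variance against the PO price.
    price_variance = abs(invoice.unit_price - po.unit_price) / po.unit_price
    price_ok = price_variance <= PRICE_TOLERANCE
    # Quantity has no tolerance: all three documents must agree exactly.
    quantity_ok = invoice.quantity == po.quantity == received_qty
    return {
        "price_variance": price_variance,
        "price_ok": price_ok,
        "quantity_ok": quantity_ok,
        "matched": price_ok and quantity_ok,
    }
```

Decimal is used instead of float so that a variance sitting right at the tolerance is not decided by binary rounding noise.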

Notice that layers 1 and 2 are entirely deterministic. No model involved. The 3-way match is arithmetic, issue detection is rules against structured data. You don't need an LLM to check whether billed quantity exceeds received quantity, and you really don't want one: that's exactly the kind of step where a hallucination would be catastrophic. The model only kicks in at layer 3, where you actually need reasoning: weighing materiality, classifying responsibility, picking the right action from context.
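A sketch of what "rules against structured data" can look like for layer 2. The six issue names come from the list above; the InvoiceFacts fields and the duplicate heuristic behind seen_before are hypothetical, chosen only to keep the example self-contained.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class InvoiceFacts:
    """Structured inputs to the rules (hypothetical field names)."""
    invoice_price: Decimal
    invoice_qty: Decimal
    po_price: Optional[Decimal]       # None when no PO was found
    received_qty: Optional[Decimal]   # None when no goods receipt was found
    seen_before: bool                 # assumed upstream duplicate check
    budget_used_after: Decimal        # cost-center spend if this invoice is paid
    budget_limit: Decimal


def detect_issues(f: InvoiceFacts, tolerance: Decimal = Decimal("0.02")) -> list[str]:
    """Plain comparisons, no model call anywhere."""
    issues = []
    if f.po_price is None:
        issues.append("missing_po")
    elif abs(f.invoice_price - f.po_price) > tolerance * f.po_price:
        issues.append("price_mismatch")
    if f.received_qty is None:
        issues.append("missing_receipt")
    elif f.invoice_qty != f.received_qty:
        issues.append("quantity_mismatch")
    if f.seen_before:
        issues.append("duplicate_suspect")
    if f.budget_used_after > f.budget_limit:
        issues.append("budget_threshold")
    return issues
```

Every branch is a comparison a reviewer can read and an auditor can replay, which is the point of keeping the model out of this step.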

Which means, strictly speaking, this isn't an agent in the purist sense. It's a business process with agentic steps. A pipeline where the model is embedded at specific decision points, not in charge of the whole loop. That's not a limitation, it's the design. In finance, you want code handling what code handles well, with inference reserved for the steps that genuinely need judgment. The pipeline stays reliable. The model does what it's good at. The human signs it off.

Layer 3 is where it gets interesting. A price mismatch is the vendor's fault, so we ask them to fix it. A cost center blowing its budget is our fault, so we go to the CC owner and the vendor never hears about it. A missing PO could go either way: maybe procurement forgot, maybe the vendor shipped without one. Different paths, different emails, different people.
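The three examples above, written down as data. In the pipeline this call is made by the reasoning step, so treat the table below as an illustration of the outcome rather than the mechanism. The recipient labels and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Responsibility(Enum):
    VENDOR = "vendor"
    INTERNAL = "internal"
    MIXED = "mixed"
    INFORMATIONAL = "informational"


# Only the two clear-cut cases named in the text.
CLEAR_CUT = {
    "price_mismatch": Responsibility.VENDOR,
    "budget_threshold": Responsibility.INTERNAL,
}

# Hypothetical recipient labels for those two cases.
RECIPIENTS = {
    Responsibility.VENDOR: ["vendor"],
    Responsibility.INTERNAL: ["cost_center_owner"],
}


def default_responsibility(issue_type: str) -> Optional[Responsibility]:
    """Return the clear-cut owner, or None when context has to decide."""
    return CLEAR_CUT.get(issue_type)


def recipients_for(issue_type: str) -> Optional[list[str]]:
    """Who gets contacted, or None when the reasoning step must classify first."""
    owner = default_responsibility(issue_type)
    return None if owner is None else RECIPIENTS[owner]
```

A missing PO deliberately has no entry: recipients_for("missing_po") returns None, which is the signal that the case needs context before anyone is contacted.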

Most AP automation treats every problem the same. Figuring out who's actually responsible is the difference between a useful conversation with a vendor and an awkward one.

Then the reasoning layer picks one of five actions per invoice, with a reason and a materiality call: approve, request_correction (email the vendor), hold_pending_internal (escalate internally, don't tell the vendor), reject_to_vendor, or reject_internal. Safety overrides stop the obvious mistakes. It can't approve an invoice with blocking issues, and it can't send a rejection to a vendor before a human has looked at it.
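A minimal sketch of the five actions and the two safety overrides as a guard that runs after the model has proposed something. The action names are the ones listed above; the function name, its signature, and the choice to fall back to hold_pending_internal when an approval is blocked are assumptions.

```python
from enum import Enum


class Action(Enum):
    APPROVE = "approve"
    REQUEST_CORRECTION = "request_correction"
    HOLD_PENDING_INTERNAL = "hold_pending_internal"
    REJECT_TO_VENDOR = "reject_to_vendor"
    REJECT_INTERNAL = "reject_internal"


def apply_safety_overrides(
    proposed: Action, has_blocking_issues: bool, human_reviewed: bool
) -> tuple[Action, bool]:
    """Return (final_action, may_execute_now) for a model-proposed action."""
    if proposed is Action.APPROVE and has_blocking_issues:
        # Override 1: never approve with blocking issues.
        # Falling back to an internal hold is an assumption of this sketch.
        return Action.HOLD_PENDING_INTERNAL, True
    if proposed is Action.REJECT_TO_VENDOR and not human_reviewed:
        # Override 2: a vendor rejection waits until a human has looked at it.
        return Action.REJECT_TO_VENDOR, False
    return proposed, True
```

Because the guard is ordinary code, the model can propose whatever it likes; the two forbidden outcomes cannot get past this function.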

What it actually does for your team

It's not a dashboard that flags exceptions and hopes someone gets to them. It acts:

Autonomous actions

Runs a 3-way match on every invoice line. Green, yellow, red per line, inline in the UI.
Detects and reasons about issues with per-issue evidence: the detection logic, the data excerpt that triggered it, and the materiality call.
Drafts vendor correction emails that are concrete, not bracketed templates. PO numbers, line references, EUR amounts filled in.
Drafts internal memos for issues the vendor shouldn't hear about: budget overruns, duplicate-invoice investigations, procurement gaps.
Builds an approval routing tree per invoice. Tier 1 (cost center owner), Tier 2 (+ AP manager), Tier 3 (+ Finance director), rendered as a live diagram you can click through.
Clears clean invoices straight to "ready to pay" and hands them off to the payment agent, no human in the middle for the boring 80%.
Surfaces early-payment discount opportunities with the savings math and days remaining on the discount window.
Tracks cost avoided: the sum of overcharges the agent caught across price variance, quantity variance, and duplicate exposure, separate from discount opportunities.
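The savings math behind the early-payment discount item is simple enough to sketch. The terms structure (a percentage available for N days after the invoice date) is an assumption about how discount terms are stored, and all names here are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class DiscountTerms:
    discount_pct: Decimal   # e.g. Decimal("0.02") for a 2% discount
    discount_days: int      # days after the invoice date the discount stays open


def discount_opportunity(
    amount_eur: Decimal, invoice_date: date, terms: DiscountTerms, today: date
) -> Optional[dict]:
    """Return savings and days remaining, or None if the window has closed."""
    deadline = invoice_date + timedelta(days=terms.discount_days)
    days_remaining = (deadline - today).days
    if days_remaining < 0:
        return None
    savings = (amount_eur * terms.discount_pct).quantize(Decimal("0.01"))
    return {
        "savings_eur": savings,
        "days_remaining": days_remaining,
        "deadline": deadline,
    }
```

For a EUR 10,000.00 invoice dated 1 April with 2% available for 10 days, checked on 3 April, this returns savings of 200.00 with 8 days remaining.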

The audit question

Every CFO asks the same thing. Can I trace what it did and why?

Yes, and AP adds a nice twist. Every issue comes with its detection logic (for example, "price variance 7.14% exceeds 2% tolerance") and the evidence behind it: the PO line, the goods receipt, the invoice row. Ask why vendor X was contacted on April 3rd and you get the match result, the issue, the rationale, and the email that went out. The actual thing, not a summary. That's what holds up in an audit.
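Here is a sketch of how an issue can carry its own detection logic and evidence. The record shape and function name are hypothetical; the prices 14.00 and 15.00 are just one pair that produces the 7.14% figure quoted above.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def price_variance_issue(
    po_price: Decimal, invoice_price: Decimal, tolerance_pct: Decimal = Decimal("2")
) -> Optional[dict]:
    """Return an auditable issue record, or None when within tolerance."""
    variance_pct = abs(invoice_price - po_price) / po_price * 100
    # Compare on the unrounded value; round only for display.
    if variance_pct <= tolerance_pct:
        return None
    shown = variance_pct.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {
        "issue": "price_mismatch",
        "detection_logic": f"price variance {shown}% exceeds {tolerance_pct}% tolerance",
        "evidence": {
            "po_unit_price": str(po_price),
            "invoice_unit_price": str(invoice_price),
        },
    }


record = price_variance_issue(Decimal("14.00"), Decimal("15.00"))
print(record["detection_logic"])
# prints: price variance 7.14% exceeds 2% tolerance
```

The rule and the numbers that triggered it travel together in one record, so the audit trail does not depend on anyone remembering why a flag was raised.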

The inference bill

An AP cycle fires a lot of model calls. One assessment per invoice, plus an email or a memo when the action needs it, plus a routing tree, plus a CFO summary at the end. At thousands of invoices a month, calling a frontier API for every one of them stops being a rounding error.

The pipeline uses a small, fast model instead. Classifying an issue or drafting a short email doesn't need a reasoning model; a lean one does the job at a fraction of the cost.

And the pipeline is agnostic about where it runs. Bring your own model, deploy it wherever fits your setup. It just needs structured output back.
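"Structured output back" can be enforced at the boundary, whichever model sits behind it. A minimal sketch, assuming the model returns a JSON object with action, reason, and materiality fields; those field names are an assumption based on the description of the reasoning layer, not a documented schema.

```python
import json

ALLOWED_ACTIONS = {
    "approve",
    "request_correction",
    "hold_pending_internal",
    "reject_to_vendor",
    "reject_internal",
}
REQUIRED_FIELDS = {"action", "reason", "materiality"}  # hypothetical schema


def parse_assessment(raw: str) -> dict:
    """Validate a model response; raise ValueError on anything off-schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {data['action']!r}")
    return data
```

Anything that fails validation can be retried or routed to a human, so swapping the model never changes what the rest of the pipeline is allowed to see.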

Beyond AP

The loop behind this agent (detect, classify, decide, act, log) isn't specific to invoice verification. It works for anything where the hard part isn't spotting the exception but figuring out who owns it.
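One way to picture the reuse: the cycle stays fixed and the five steps are passed in. This is a sketch of the shape only; run_cycle and the toy expense rules below are hypothetical.

```python
from typing import Any, Callable, Iterable


def run_cycle(
    items: Iterable[Any],
    detect: Callable[[Any], list],
    classify: Callable[[Any, list], Any],
    decide: Callable[[list, Any], str],
    act: Callable[[Any, str], Any],
    log: Callable[[dict], None],
) -> None:
    """Run detect -> classify -> decide -> act -> log for every item."""
    for item in items:
        issues = detect(item)
        owner = classify(item, issues)
        decision = decide(issues, owner)
        outcome = act(item, decision)
        log({"item": item, "issues": issues, "owner": owner,
             "decision": decision, "outcome": outcome})


# Toy expense-audit rules (made up for illustration).
audit_log: list[dict] = []
run_cycle(
    items=[{"id": "E1", "amount": 40}, {"id": "E2", "amount": 900}],
    detect=lambda e: ["over_limit"] if e["amount"] > 500 else [],
    classify=lambda e, issues: "employee" if issues else None,
    decide=lambda issues, owner: "query_employee" if issues else "approve",
    act=lambda e, decision: f"{decision}:{e['id']}",
    log=audit_log.append,
)
```

Swapping the five callables is all it takes to point the same cycle at a different domain; the log step is not optional in any of them.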

Expense auditing. Master-data quality. Vendor onboarding. Intercompany matching. T&E policy. Same cycle, different rules.

What's next

A look at where this is going. Four pipelines covering the working-capital cycle inside the CFO's office:

Part 1: AR Collections. Decides when to collect.

Part 2: Invoice Verification (this one). Decides what to pay.

Part 3: Payment. Times each payment to capture early-payment discounts, then reports the flow back to the CFO team.

Part 4: EBS reader. Ingests bank statements and feeds transaction updates to the other pipelines.

Each one is useful on its own. Let's see whether they can be even more useful together.

Payment next. Then EBS.

See it work

The agent is running live with a demo dataset. You can watch it match real invoices, see which issues it labels vendor vs internal, read the emails and memos it drafts, and click through the approval diagrams.

Open the live demo at invoice-verification.stubborncoder.cloud →
