THE LOGICAL BOX

AI news & training for business owners. One email. One clear next step.

THIS WEEK IN AI

Last week the AI headlines were about layoffs. This week they moved somewhere more useful. Three stories landed about AI doing real work inside real businesses, and getting good enough at it that you should pay attention.

In this issue:

  • An AI tax system drafted 7,000 returns at up to 97% accuracy across 30+ firms

  • HubSpot's built-in AI can now build your reports and charts just by asking

  • Claude's newest model got more honest about what it does not know

  • The Deep Cut: why "the AI can do it" and "the AI does it in my business" are two very different sentences

THE SIGNAL

What happened in AI this week

Image Source: OpenAI by Andrew Keener

What happened:

OpenAI, Thrive Holdings, and a network of more than 30 accounting firms called Crete built a tax system on OpenAI's coding tool and put it to work. In the pilot it drafted 7,000 returns, mostly individual and trust filings, at up to 97% accuracy. It cut prep time by about a third and roughly doubled how many returns a firm could push through. Every time an accountant corrected it, that correction got turned into a fix the system kept.

Why it matters to your business:

The firm that does your taxes is about to get faster and cheaper to run. That shifts what you should expect on price, turnaround, and where your accountant actually spends their time. It also widens the gap between businesses with clean books and businesses without them, because this kind of AI works best on clean inputs and struggles with a shoebox of receipts.

Source: openai.com

What happened:

In its May release, HubSpot gave Breeze, the AI assistant inside the CRM most small teams already use, a set of new jobs. You can ask it for a chart in plain English ("show me deals closed by rep this quarter, with a line for our target") and it builds it on the spot from your CRM data, or from a spreadsheet you upload. It drafts emails and documents in a side panel you can keep editing. And when it is missing something it needs, it now stops and asks one clear question instead of guessing.

Why it matters to your business:

Most of your team is never going to learn how to build a custom report. Now they do not have to. They can ask in plain words and get the answer. And the same thing showing up in the Claude release below shows up here: the tool now asks when it is unsure instead of handing you a confident, wrong chart. That is the line between AI you can trust with your numbers and AI that quietly makes one up.

What happened:

Anthropic released Claude Opus 4.8 on Thursday. The headline for most people is not the benchmark scores. It is that the model is about four times less likely than the last version to let a flaw in its own work slip by without flagging it. Anthropic also added a control that lets you dial how hard the model works on a task, and made its fast mode cheaper to run.

Why it matters to your business:

The risky moment with AI is not when it is wrong. It is when it is wrong and sounds completely sure. A model that flags its own shaky spots is one you can hand to a newer team member with fewer guardrails. That is a real operations difference, not a benchmark brag.

THE DEEP CUT

What it actually means for your business

The tax AI story is the one I keep coming back to. Not because of the 97% number, but because of how they got there.

That system worked well, but not because the model is smart. It worked because every time an accountant corrected it, the correction became a specific, testable fix. Clean inputs. A tight feedback loop. And people who knew the work, checking the output before it went out. The AI was the last piece, not the first.
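
If you have someone technical on your team, here is a rough sketch of what "a correction becomes a testable fix" could look like. This is my own illustration in Python, not how the Crete system is actually built. Every name in it (Correction, draft_field, improved_draft_field, failing_checks) is made up for the example, and the filing-status rule is a toy, not tax guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """One reviewer correction, saved as a repeatable check (hypothetical shape)."""
    case: dict      # the inputs the system saw
    field: str      # which field the draft got wrong
    expected: str   # what the reviewer said it should be


def draft_field(case: dict, field: str) -> str:
    """Stand-in for the AI drafting step. A real system would call a model here."""
    if field == "filing_status":
        # Toy rule that ignores dependents, so a reviewer has something to correct.
        return "married_joint" if case.get("married") else "single"
    return ""


def improved_draft_field(case: dict, field: str) -> str:
    """The same drafter after the correction has been turned into a fix."""
    unmarried_with_dependents = (
        not case.get("married") and case.get("dependents", 0) > 0
    )
    if field == "filing_status" and unmarried_with_dependents:
        return "head_of_household"
    return draft_field(case, field)


def failing_checks(corrections: list[Correction], drafter) -> list[Correction]:
    """Re-run every saved correction; return the ones the drafter still gets wrong."""
    return [c for c in corrections if drafter(c.case, c.field) != c.expected]


# A reviewer fixes one draft. The fix is stored as a check, not just applied once.
corrections = [
    Correction(
        case={"married": False, "dependents": 1},
        field="filing_status",
        expected="head_of_household",
    ),
]

print(len(failing_checks(corrections, draft_field)))           # original drafter
print(len(failing_checks(corrections, improved_draft_field)))  # after the fix
```

The point is the list of saved corrections. Each one gets re-run after every change, so a mistake that was fixed once cannot quietly come back.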

I see the opposite all the time. A business owner reads a story like this one and decides the lesson is "get the AI." So they buy a tool, point it at a process that was already messy, and a month later it is sitting unused. The tool was never the problem. The work underneath it was.

Here is what I mean. If your intake form collects half the information you need, an AI drafting your proposals will just give you confident, well-written proposals built on missing information. If three people each track jobs their own way in their own spreadsheet, an AI cannot pull a clean status report out of that. It will pull three different answers. Faster, and still wrong.

The tax firms got results because the work under the AI was already structured. A return has a defined shape. The inputs are documents. The output can be checked against the rules. That is close to a perfect case. Most of the work in your business is messier than a tax return, which means the AI will struggle in exactly the spots where your process is loose.

I will be honest. I used to think the tooling mattered more than it does. Early on I would walk into a business, see a slow process, and reach for whatever tool could speed it up. It took a few of those going nowhere before the pattern got obvious. The places where AI stuck were the ones that had cleaned up the work first. The places where it failed had bolted a smart tool onto a process nobody really owned.

So when you read "an AI did 7,000 returns," the useful question is not "what tool did they use?" It is "what did they have in place that let the tool work?" Clean inputs, a clear owner, and a way to catch mistakes before they go out the door. Get those right and almost any decent AI will help you. Skip them and the best model on the market will just move your mess faster.

Fix the work first. Then add AI. That order is the whole game. It is the difference between the 97% story and a tool sitting unused in your stack.

THE MOVE

One thing you can do this week

Pick one task in your business you have thought about handing to AI. Maybe it is drafting proposals, answering your most common customer emails, or pulling a weekly status report.

Before you touch a single tool, ask this in your next team meeting:

"If I handed this exact task to a brand new hire on their first day, with nobody to ask, could they do it right from what we have written down?"

If the answer is no, the AI will hit the same wall. It does not know the things that live only in your head or in one person's inbox. Whatever is missing for that new hire is missing for the AI too.

So this week, do not buy anything. Just write down what that new hire would need: where the information lives, what "done" looks like, and who checks it. That short document is the thing that makes AI work later. It is also the thing most businesses skip.

Get it on paper for one task. That is the whole assignment.

THAT’S A WRAP!

Every story this week points to the same thing. The tools are ready. The real question is whether your work is.

If you have a process you have been thinking about handing to AI, and you are not sure it is ready, that is exactly the conversation I have with business owners. We get on a call, I look at how the work actually flows, and I tell you where AI fits and where it would just speed up a mess.

No pitch on the call. If it is not a fit, I will say so. If it is, you will hang up knowing the first move to make.

Thanks for reading,

Andrew Keener

AI Consultant & Speaker

Keen Alliance Consulting

Know someone who would enjoy The Logical Box? Please share the link with them!

Think Inside the Box. Clarity before AI.