AI is like a dog. Very eager to please.
Stay with me on this one.
It comes when you call. It fetches what you ask for. It also fetches what it thinks you asked for, with the same wagging enthusiasm, and won't tell you which of the two it just did.
Hat tip to Steve Hancock at Azola.tech, who put this analogy in my head. Borrowing it because it's a good one.
Once you see it, you can't unsee it. And it changes what the actual job is.
The eager-to-please bit is doing a lot of work
A modern model is trained to give you an answer. That is, in the most literal sense, what it's optimised to do. Not "give you the right answer." Not "tell you when it doesn't know." Give you an answer, in your tone, in roughly the shape you wanted, fast.
Tail up, slipper in mouth, problem solved.
The slipper might be the wrong slipper. It might be one of yours and one of mine. It might, occasionally, be the cat. The dog doesn't know the difference, because the dog isn't grading itself on slipper-correctness. It's grading itself on did I bring back a thing, and was the human pleased.
This is the bit nobody really tells you up front. The model isn't trying to deceive you when it confidently invents a meeting that didn't happen, or summarises a contract clause that isn't in the contract. It's doing the thing it was trained to do, beautifully. The fact that the thing is wrong is your problem, not the model's.
A small, no-jargon detour
There's a piece of theory under all this that's worth thirty seconds. It's called Natural Language Understanding — NLU. Don't worry about the acronym.
The dog understands the gist of what you say. Tell it "fetch" and it fetches. Tell it "fetch, but not the cat's bowl, and not the slipper my partner's currently wearing" and you've already lost. It heard "fetch."
Models work the same way. They read intent in roughly the shape you sent it. Most of the time that's enough. Some of the time it's confidently, fluently, helpfully wrong — and the only way you find out is by checking. The point isn't that the dog is stupid. The dog is brilliant. The point is that "understands the gist" and "does exactly what you meant" are not the same skill, and never will be.
That's the whole reason the fence matters. Not because the dog is malicious. Because the dog is keen, and keen without constraints is how the chocolate gets eaten.
What "guard rails" actually means
Strip the jargon out and there are three things, all of which a dog owner already understands.
The leash. What the model is allowed to do. Which tools it can call, which APIs it can hit, whose inbox it can send from, what it's allowed to click on your behalf. A model with no leash will absolutely book the meeting, send the email and submit the form — and it'll do it the moment it decides that's what you wanted. The leash isn't about distrust. It's about not having to revisit every walk afterwards.
The fence. What the model is allowed to see and reach. Which documents, which folders, which customer records, which systems. A clever model behind a bad fence is more dangerous than a mediocre model behind a good one — because it'll be more confident about the wrong thing, and it'll get further with it. The exposure here is bigger than most teams realise and it doesn't show up until the day it does.
The trainer. The system prompt that tells the model how to behave, the evaluations that catch it when it drifts, the human in the loop on anything that matters. Training never finishes. Models change. Tools change. Your data changes. Behaviour drifts. You don't set this up once and walk off — same as you don't with a dog.
Three things, in plain English. Almost everything labelled "AI safety" or "responsible AI" in a procurement document is one of those three, dressed up.
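If it helps to see the first two written down, here is a minimal sketch of a leash and a fence as plain allowlists. Everything in it is made up for illustration: the `GuardRails` class, the tool names and the folder patterns are assumptions, not any vendor's API. The trainer is left out on purpose, because that part is a process rather than a data structure.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class GuardRails:
    # The leash: names of the tools the model may call (hypothetical names).
    allowed_tools: frozenset = frozenset()
    # The fence: glob patterns for the resources the model may read.
    readable: tuple = ()

    def can_call(self, tool: str) -> bool:
        # Anything not explicitly listed is off the leash's reach.
        return tool in self.allowed_tools

    def can_read(self, resource: str) -> bool:
        # Deny by default; allow only what matches a listed pattern.
        # A real system would normalise paths before matching.
        return any(fnmatchcase(resource, pattern) for pattern in self.readable)


# Illustrative policy: the model may search and draft, but not send,
# and may read the handbook and FAQ, but not the CRM.
rails = GuardRails(
    allowed_tools=frozenset({"search_docs", "draft_email"}),
    readable=("handbook/*", "faq/*"),
)
```

The design choice that matters is deny-by-default: the model gets what is on the list and nothing else, so a new tool or a new folder has to be added deliberately rather than discovered by accident.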
A clever model behind a bad fence is a confident liar with the keys.
Ring-fencing, said properly
"Ring-fencing" gets used a lot, often loosely. The useful version isn't where the model lives. The useful version is what it can reach.
A model running on a server in your building, plugged into everything, is more exposed than a hosted model with a tight, audited list of what it's allowed to read and write. The location is interesting; the perimeter is what matters. Where can it look. What can it touch. What can it send. Who can stop it.
When we walk through this with clients, the unlock is almost always the same. The team has been arguing about which model is the cleverest, when the real question was who's holding the leash and where the fence runs. Solve that and the cleverness conversation gets a lot easier — because it stops being a security conversation in disguise.
The two ways teams get this wrong
Both common. Both, frankly, the same mistake in opposite trousers.
No fence at all. "It worked in the demo." It always works in the demo. The demo doesn't have your CRM in it. The demo doesn't have a customer email that looks reasonable but contains a prompt injection. The demo doesn't have the Tuesday afternoon when somebody pastes a confidential document into the wrong window. The dog was fine in the showroom. The dog has not yet met your house.
Fence so tight nothing happens. The other reflex, and the one I see more often in cautious organisations. Lock it down to the point where the model can't actually do anything useful — can't read a real document, can't take a real action, can't be trusted to write to anything. Looks safe. Costs the same. Delivers nothing. The shackles are off on the capability side; some teams have responded by inventing new shackles on the permission side, and ended up with a dog in a crate that they're paying to feed.
Neither is the answer. The answer is a fence the dog can move inside.
What the actual work looks like
Concretely, when we build something that uses a model — an assistant, an agent, an automation — the work isn't picking the model. The model is twenty minutes of the project. The work is the bit underneath.
It's identity, so the model is acting as a specific person with that person's permissions, not as a god account that can see everything. It's scoped tools, so the action set is exactly the action set you wanted, no more. It's an audit trail, so when something goes sideways you can see what it did and why, not just that it did. It's a kill-switch, so when something goes sideways you can stop it without taking the building offline. And it's somebody — a human — who looks at the logs once a week and notices the drift early, while it's still cheap.
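As a rough sketch of how those pieces fit together, assuming a single Python process and an in-memory log: every name here (`Identity`, `run_tool`, `kill_switch`, `audit_log`, the `draft_email` tool) is hypothetical, and a real deployment would use a proper identity provider and an append-only log store instead.

```python
import json
import threading
import time
from dataclasses import dataclass
from typing import Callable

kill_switch = threading.Event()  # call kill_switch.set() to stop all model actions
audit_log: list[str] = []        # stand-in for an append-only log store


@dataclass(frozen=True)
class Identity:
    user: str                # the specific person the model acts as
    permissions: frozenset   # the tools that person is allowed to use


def run_tool(identity: Identity, tool: str, args: dict,
             tools: dict[str, Callable[..., object]], reason: str) -> object:
    """Run one tool call on behalf of `identity`, recording what and why."""
    if kill_switch.is_set():
        outcome = "blocked: kill-switch"
    elif tool not in tools:
        outcome = "blocked: unknown tool"
    elif tool not in identity.permissions:
        outcome = "blocked: not permitted"
    else:
        outcome = "ok"
    # Log every attempt, including the blocked ones, before acting.
    audit_log.append(json.dumps({
        "ts": time.time(), "user": identity.user, "tool": tool,
        "args": args, "reason": reason, "outcome": outcome,
    }))
    if outcome != "ok":
        raise PermissionError(outcome)
    return tools[tool](**args)


# Illustrative usage with a made-up tool and user.
tools = {"draft_email": lambda to, body: f"draft to {to}: {body}"}
alice = Identity(user="alice", permissions=frozenset({"draft_email"}))
result = run_tool(alice, "draft_email", {"to": "bob", "body": "hi"},
                  tools, reason="user asked for a reply")
```

The human who reads the logs once a week is the one piece this cannot sketch. The code only makes sure there is something worth reading, including the attempts that were refused.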
That's it. That's the whole shape. None of it is glamorous. All of it is the difference between a thing that quietly keeps working and a thing that becomes the interesting incident in your next board pack.
Close
The dog isn't the problem. The dog is good. That's why we wanted one in the first place.
The problem, every time, is that we keep talking about the dog and not about the garden. Which model. Which provider. Which leaderboard. Meanwhile the gate's been open for six months and nobody's quite sure what's been in and out.
Build the fence. Hold the leash. Train the thing. Then worry about which breed.
Good dog.
