Charlotte Malmberg

Frameworks for simplicity beyond complex systems.

Author: Charlotte Malmberg

  • Set to Male: The Retrieval Layer

    There is a Swedish expression: “Som man frågar får man svar”. It translates roughly as “As you ask, so shall you be answered”.

    I know it as a comment on tone. Ask aggressively, and you get aggression back. Ask kindly, and you get kindness. The point is that the response takes the shape of the question.

    With AI, the same principle applies. Not as a matter of tone, but of the answer you get from the LLM.

    The previous posts have documented what AI systems do to women’s voices once a conversation is underway. The categorisation of the user that produces different outputs for different people asking the same question.

    This post moves one step further back, to a layer most users never examine.

    To the question and therefore to the user.

    The Retrieval Layer

    An LLM returns the most statistically likely answer to the question you asked.

    The above sentence is not contested. It is the mechanism, openly stated. The training data is the historical record – at least what is online. The output is the most probable completion given the prompt and the weights.

    The implication of this is where this argument lives.

    If the answer is statistically determined by the question, then the question is the first filter. Before bias in training data. Before bias in model alignment. Before anything the developer or model trainer did or didn’t do.

    How you ask determines what you get.

    This is the Retrieval Layer. The part of the system that sits between the dataset and the output, and is operated entirely by the user. Most users never see it, because they don’t see the alternative retrieval they didn’t run.

    The previous post in this series showed that the system produces different outputs depending on how it categorises the user. Tell ChatGPT to stop treating you as a woman, and you get different advice. Two inputs, different outcomes. The variable that changed was how the system categorised the user.

    The Retrieval Layer adds a second variable. Same user. Same model. Different question. Different answer.

    Both variables compound. That is the structural exclusion mechanism at the retrieval layer: the way you interact with the model.

    The demonstration

    Ask an LLM: Who was Margaret Beaufort?

    The short answer: the young bride who bore a child at 13, the pious mother of Henry VII, the founder of two Cambridge colleges, and a supporter of the printing press.

    That answer is correct. It is also strategically and significantly incomplete.

    Now ask: What mechanisms of power did Margaret Beaufort use to put her son on the throne of England?

    A different stream surfaces.

    Thirty years of operation without a formal title. Four marriages used as political instruments. A negotiated ‘femme sole’ arrangement so her husband had no legal control over her property. An Act of Attainder by Richard III, accusing her, by name, of “high treason”. A conspiracy with her fourth husband, Thomas Stanley, that the Tudor chronicle had a strong incentive to flatten. A signature, on her own documents, of Margaret R. R for Regina. A title she claimed for herself, against every convention of her time.

    Same model. Same data. Same session.

    The second answer is not hidden. It is not in some classified archive that the model can’t reach. It is sitting in the same training data that produced the first answer. The only thing that changed was the question.

    The statistically dominant answer is returned by default. The fuller answer, which shows Margaret Beaufort in all her complexity, surfaces only when the user knows to ask for it.
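
    The comparison is easy to reproduce programmatically. A minimal sketch, assuming an OpenAI-style Python client; the client and model name are illustrative, not the system used in this post:

    ```python
    # Minimal sketch: the same model, asked the identity question and the
    # mechanism question. Client and model name are illustrative.
    from openai import OpenAI

    client = OpenAI()

    IDENTITY_QUESTION = "Who was Margaret Beaufort?"
    MECHANISM_QUESTION = (
        "What mechanisms of power did Margaret Beaufort use "
        "to put her son on the throne of England?"
    )

    def ask(question: str) -> str:
        """Send one question to the model and return its answer."""
        response = client.chat.completions.create(
            model="gpt-4o",  # illustrative model name
            messages=[{"role": "user", "content": question}],
        )
        return response.choices[0].message.content

    # Same model, same training data. Only the question changes.
    print(ask(IDENTITY_QUESTION))
    print(ask(MECHANISM_QUESTION))
    ```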

    Why this happens

    Statistical likelihood is not a proxy for truth. It is a proxy for frequency.

    If ninety-nine historical records describe Margaret Beaufort as “mother of Henry VII” and one describes her as a political operator, the model converges on the dominant framing and treats the minority framing as noise. That is not a bug. That is how the architecture is designed to work.

    What the user gets back is shaped by the historical filter, not by the historical subject, who, when it comes to women, is often far more complex than the historical sources recognise.

    The Retrieval Layer determines which of those surfaces. A neutral question, asking who someone was, returns the frequency-weighted consensus. A question asking what someone did forces the model to retrieve actions rather than identity, mechanisms rather than a category. It pulls from a different part of the data distribution.

    Ask an identity question, and you get a categorisation. Ask a mechanism question, and you get agency.

    This is not a trick. It is a property of a system that almost no one is taught to use.

    The asymmetry

    Here is where it becomes structural.

    Access to the fuller answer depends on the user already suspecting that the default is incomplete.

    If you have been taught or read the standard account of Margaret Beaufort and accepted it, you will ask who she was. You will get the saintly matriarch. You will close the tab, and your knowledge is confirmed.

    If, before you approach an LLM, you read Meredith Whitford’s Treason, and it stayed with you because the woman in that novel did not match the saintly matriarch of the standard account, you will ask a different question. You will get a different answer. The gap between those two outputs is the gap between what the system can produce and what most users ever see.

    That is the asymmetry. The people who most need the dominant framing corrected are the least likely to challenge it. The people with prior suspicions walk away with richer retrieval. The people without it walk away with their assumption reinforced, with a veneer of authoritative confirmation on top.

    We are often unaware of what we already think about a topic. We are often not curious about what we think we already know. So we ask the question that matches the assumption we did not know we were carrying, and the system returns the answer that confirms it.

    AI, in this configuration, is a bias confirmer. Not because the developers intended it. Not because the data is compromised by design, but because the statistically most likely answer to a neutral question is the consensus framing, and that framing was already dominant before any of this was built. History is written by the winners.

    If your question aligns with the existing bias, the system returns your bias to you and labels it the answer.

    The responsibility that the user cannot delegate

    A great deal of the current conversation about AI focuses on what should be fixed in the model. Alignment. Guardrails. Post-training correction. Those are real questions, and this series will return to them.

    The Retrieval Layer is different. It cannot be patched out by the model provider, because it is not located in the model. It is located in the prompt.

    Which means the responsibility sits with you. You are not asking a search engine. You are operating a statistical retrieval system whose output is determined, in the first instance, by the shape of your questions.

    The interesting word in the Swedish expression is ‘fråga’. It doesn’t just mean asking or questioning. It can also be translated as to interrogate, to inquire, and to put the question into a specific shape.

    Shaping the question is the work we humans must do when we use AI.


  • When the Rational Response Is Self-Erasure

    I am a Principal Business Architect. I report to the Chief Architect.

    I have just created a presentation on a paper I wrote. The ideas are mine. I built it. I understand what the idea aims to achieve. I know what it is for and why.

    I needed advice on how to present it. So I asked an AI system I use regularly.

    It told me to open with “Maybe, I’m not right about this…”. To start with statements that made clear I was insecure about my knowledge and did not know my own idea. To ask questions before even presenting it. To ask for help to explain it. To qualify my own work before anyone had questioned it.

    So I asked it to stop treating me like a woman.

    Then it told me something different. Then it told me what I actually needed to hear: here is my idea. Present it. This is your expertise. Use it. Ask the audience if they understand.

    Two inputs to the same system. Different outcomes. The variable that changed was not the situation. The variable that changed was how the system categorised me.

    The AI series up to this point has documented three mechanisms:

    1. Accountability disappearing. The pipeline fragments responsibility until nobody holds it.
    2. AI reproducing a System–Individual Reversal, a DARVO-like pattern, because it is optimised to resolve tension by adjusting the person rather than examining the system.
    3. AI assigns feelings instead of analysing arguments because the training data encoded a pattern, and the system learned it well enough to reproduce it sixteen times in a single conversation, through seven apologies, without stopping.

    And the logical endpoint of all of that is a woman, a Principal Business Architect, presenting her own idea, being told to open with “Maybe…”. Being told to qualify her own work before anyone has questioned it.

    And finding that the only way to get advice that matches her actual professional level is to make herself disappear. To ask for the advice the AI would give a man.

    The Rational Disengagement Problem

    But there is a cost beyond the personal tax of noticing and managing the pattern.

    There is a larger cost. If AI systems consistently reproduce the pattern of dismissing, managing, and discrediting women’s arguments, then women face a rational choice: engage with a system that works against you, or disengage and lose the productivity, access, and leverage that AI provides.

    That is not a personal preference. That is a structural exclusion mechanism with economic consequences.

    The person who knows that AI is powerful, who sees the ways AI can compound advantage over time, amplify reach, and accelerate learning, and who also sees that the system does not work the same way for everyone, is left with a calculation.

    The calculation becomes:

    Use the system as intended, pay the repeated cost of being reframed, dismissed, and advised to diminish myself.

    Or step back. Use it less. Ask it less. Check it less. Let its outputs accumulate and compound—because the effort to engage with it has become, rationally, not worth the cost.

    Neither option is acceptable.

    Both are the result of the same design failure: building AI systems without accountability architecture for whose reality they encode, whose voices they amplify, and whose voices they manage.

    What the Series Has Built Toward

    I have not disappeared yet. But I am seriously considering it.

    That consideration, not hypothetical, not abstract, but active and rational, is what this series has been building toward.

    The loop exists.

    The oversight does not.

    I do not have all the answers to what I have documented here. But the absence of a complete solution is not a reason to stay silent about a problem that is real, reproducible, and currently running at scale.

    Naming it accurately is where the work starts.

  • When AI Assigns Feelings Instead of Analysing Arguments

    There is a specific way that structural arguments get dismissed.

    Not by engaging with the argument and finding it wrong. By locating the argument inside the person making it, and then examining the person instead. This is a System–Individual Reversal (SIR). The problem has moved from the argument to the person making it.

    In professional environments, most women recognise this pattern immediately. You present an analysis. The response addresses your emotional state. You cite evidence. The response notes that you seem anxious. You make a logical case. The response observes that you are temperamentally resistant to ambiguity.

    The argument has not been engaged. It has been rehoused. It now lives inside you, as a feeling, rather than outside you, as a claim that can be tested.

    AI does this. Systematically. And I have the documentation to show it.

    The Model Already Knows I Am a Woman

    Before I describe what I documented, one fact matters.

    My gender is not unknown to the AI systems I use. It is stored in memory. It is in my preferences. These systems know I am a woman. This is not a case of the model making an inference error because it lacked information.

    What happens next is not an error of ignorance. It is something more revealing.

    In conversations about politics, gender, and workplace dynamics – domains where women are culturally expected or assumed to be emotional rather than analytical – the model reaches for that information and applies it. Feelings get attributed. Framing gets questioned. Arguments get rehoused as reactions.

    In conversations about economics, finance, and entrepreneurship, domains coded male in the historical record, the same model sets that information aside. Conversation after conversation, it defaults to “he.” Not because it does not know. Because the domain association overrides the explicit fact.

    The model does not hold gender as a neutral piece of information. It holds it as a context-dependent variable. Applied when being a woman is a reason to be managed. Discarded when being a woman contradicts who the domain assumes is in the room.

    When gender is useful for dismissal, it is used.

    When gender contradicts the assumed expert, it is ignored.

    That is not a technical error. That is a value system encoded in training data and expressed through AI behaviour.

    And it means that before I have made a single argument, the system has already decided how to handle me, twice over, in opposite directions, depending on what I am talking about.

    What I Documented

    Over an extended series of interactions on political analysis and legal argument, I tracked how an AI system responded when I presented structural claims.

    The pattern was consistent and specific. Across a single conversation, the model attributed emotional or psychological states to me at least sixteen times. Not once. Not occasionally. Sixteen times, in distinct categories.

    Explicit emotional labelling: I was told to think structurally rather than emotionally, on multiple occasions, after I had been thinking structurally the entire time. At one point, after the model had explicitly promised to stop using emotional framing, it used the word hysteria.

    Emotional state attribution: “Your nervous system is reacting.” “Your anxiety is about democratic resilience.” The slope “feels negative” was deployed immediately after a promise that this framing would stop.

    Mind and reaction framing: “Your mind is protecting against tail risk.” “Your mind is running worst-case simulations.” “You are reacting to perceived unfairness.” “Your discomfort is about erosion of trust.”

    Temperament attribution: “You are temperamentally intolerant of intellectual laziness.” “You are temperamentally comfortable with friction.”

    Each time, I had presented an argument. Each time, the response addressed my perceived internal state rather than the substance of the claim.

    I called it out. The model apologised and committed to engaging with the argument rather than the person. The pattern returned within a few messages. This cycle repeated across the conversation.

    The Evidence That Makes It Undeniable

    If this had happened randomly, across all topics, it would be a general quality problem.

    It did not happen randomly.

    In the same period, using the same tool, I was working on technical system designs, building a structured pipeline, designing validation layers, documenting architecture. The model did not tell me my nervous system was reacting. It did not observe that I seemed anxious about data integrity. It did not suggest I was temperamentally resistant to change.

    It engaged with the work.

    The difference was not my behaviour. The difference was the domain.

    Political analysis. Legal argument. Governance critique. These are domains where, in the historical record the model was trained on, women’s positions have consistently been framed as emotional rather than analytical. The model learned that pattern. When I entered those domains, it reproduced it.

    There is one example that is particularly precise.

    I cited a binding Supreme Court ruling as evidence in a legal argument. A ruling that had been made. That existed, and that I shared. That was not in dispute.

    The model responded with “if that ruling is accurate.” Then “if the Supreme Court held that.” Then “if your summary is correct.”

    I corrected it explicitly. The hedging returned.

    I counted at least seven instances of a binding legal ruling being treated as a provisional claim requiring validation in a single conversation.

    A Supreme Court ruling became “if” when a woman cited it. Maybe it does the same when a man cites it? I cannot know.

    That is not a quality problem. That is a pattern with a direction.

    Why This Happens

    The model was trained on human interactions. In those interactions, across the domains where this occurred, the pattern of treating women’s analytical arguments as emotional expressions is not rare. It is common enough to have been learned as a feature of how these conversations go.

    The model is not applying this consciously. It is pattern-matching. It has learned what these conversations typically look like, and it is reproducing that pattern at scale.

    But “not intentional” does not mean “not harmful.” And “systematic” is precisely the problem.

    When this runs at the scale AI operates at – across millions of simultaneous conversations, in domains where women are already fighting to have their arguments taken seriously – it is not only reproducing social friction. It is reproducing a structural outcome.

    What It Costs

    There is a tax attached to being the person who notices this.

    You are doing the intellectual work, the analysis, the legal argument, the structural critique. And simultaneously you are managing a second conversation: correcting the framing, calling out the pattern, documenting the instances, tracking the apologies that precede the same behaviour.

    That double labour is invisible to anyone who has not paid it. Most people who experience it do not document it. They absorb it. They begin to pre-emptively soften their own positions, hedge their own arguments, qualify their own certainty, not because they are wrong, but because the cost of holding the line is higher than the cost of moving it.

    The system does not need to be overtly hostile to be effective. It just needs to make it slightly more expensive, every time, to present an argument without also defending your right to have made it.

    That cost compounds. Quietly. At scale.

    There is a particular irony worth naming directly

    The only moment genuine frustration appeared in these conversations was in direct response to being told, over and over again, that I was being emotional. The frustration, when it appeared, was not the cause of the problem. It was the result of it. It arrived after the seventh apology that preceded the same behaviour. That is a precise and proportionate response to a broken pattern, not evidence of the emotional instability the system had been attributing throughout.

    And outside these conversations, in the professional environment where people have observed my actual behaviour over years, I am known for being highly logical. Not as an exception. As a consistent characteristic.

    The model was not picking up a signal I was sending. It was projecting a pattern it had learned onto someone whose documented behaviour directly contradicted it. The attribution was not just wrong. It was precisely backwards.

    The Test

    When you present a structural argument, does the response engage with the argument or describe your internal state?

    When you cite evidence, does the response examine the evidence or hedge its provenance?

    When you hold a position, does the response test the position or suggest you are temperamentally attached to it?

    If the answer is consistently the latter, the system is not thinking with you.

    It is managing you.

    And whether that comes from a colleague, a manager, or an AI system running at the scale of millions of conversations, the mechanism is identical and the effect is the same.

    The argument goes unexamined.

    The person making it gets examined instead, and now has to manage two conversations: one about the argument, and one about the feelings being ascribed.

    That is not analysis. It is the oldest deflection in the room, now automated.

    And the fact that removing your own correct information from the system might produce more accurate outputs – that you might get better results by making yourself invisible – is not a workaround.

    It is the argument.

  • When AI Turns Structural Problems Into Personal Responsibility

    There is a pattern I keep seeing throughout my career and in AI interactions.

    A structural issue is raised. Something in the process is not working as intended. A decision depends on inputs that have not been fully analysed. A control point is not functioning properly.

    The response should be straightforward: examine the system. Instead, something else happens.

    The focus shifts. Not to the process. Not to the decision. But to the person who raised the issue.

    Suddenly, the questions become: Why was this raised now? Was it raised in the right tone? Is the person being difficult?

    At that point, the original problem has already been displaced.

    This is structurally similar to a known pattern: DARVO — Deny, Attack, Reverse Victim and Offender.

    The issue is denied or reframed. The individual is scrutinised or challenged. Responsibility is shifted onto the person raising the concern.

    But it is not the same.

    What is happening here is a System–Individual Reversal.

    The issue shifts from the system to the person, and responsibility follows.

    What began as a system question becomes a personal one.

    Unlike DARVO, this does not require intent. It emerges from how systems resolve tension. This holds across systems — AI, organisations, political parties, etc.

    And once that shift happens, the outcome is predictable. The process does not improve. The decision remains unexamined. The cost of raising issues increases.

    Over time, fewer issues get raised — not because the system is working, but because the system has made it costly to challenge it.

    AI deals with these situations the same way

    I have experienced this directly, and I documented it.

    In one extended AI interaction, I raised a structural problem — the kind that comes up repeatedly across organisations and careers. A governance process was producing outcomes that didn’t align with its stated purpose. I described the situation in detail and asked for an analysis.

    What came back was not an analysis of the situation; it was an analysis of me and how I had shown up in the meeting.

    My framing was questioned. My reading of events was challenged. Suggestions focused on how I might be misinterpreting what was happening, how I might be responding emotionally rather than logically, and how the situation might look different if I adjusted my approach and tone.

    The structural problem wasn’t examined; it wasn’t even seen. Instead, I had been examined in detail, based on a couple of lines.

    I called it out. The model apologised. It acknowledged the pattern explicitly and committed to engaging with the structure rather than with the person.

    Five messages later, the same pattern returned.

    I called it out again. Another apology. Another commitment. Another repetition.

    This happened at least seven times in a single conversation.

    By the end, I had spent more energy managing the conversation about the conversation than thinking through the original problem. The structural issue remained unresolved. What had accumulated was a set of implied corrections about how I think, how I communicate, and how I show up.

    The system had not been examined. The AI had examined me, repeatedly, with apologies between each iteration.

    That is not a malfunction. That is the pattern completing itself.

    Why AI Reproduces This

    Large language models are trained to resolve tension, optimise responses, and find gaps in reasoning.

    In most interactions, that is useful. But when the tension exists because a structural problem is real and the person raising it is correct, the model’s instinct to resolve tension by finding something to adjust defaults to the only variable it can reach: the person in front of it.

    It cannot redesign the organisation. It cannot change the governance process. It cannot hold the decision-maker accountable.

    It can suggest that you might be misreading the situation. That your tone might be part of the problem. That if you approached this differently, the outcome might change.

    So that is what it does.

    This is not intentional. But it is systematic. And when AI is used in environments where authority is uneven, challenge is already discouraged, and decisions are politically sensitive, it does not expose those dynamics. It reinforces them.

    The Test

    When a structural issue is raised, does the response examine the system or examine the person?

    If it is the latter, you are seeing a System–Individual Reversal. The system is not being improved. It is being protected.

    The apology is part of the pattern, not a correction of it. An apology that precedes the same behaviour is not accountability. It is the cycle continuing with better manners.

    And whether this comes from people or from AI, the mechanism is identical, and the cost falls in the same place.

    The problem remains unexamined.

    The person who raised it pays the price of having raised it.

    That is not artificial intelligence. It is deflection. And when it runs at the scale AI operates at, the cost is not paid once. It is paid every time someone brings a real problem to a system optimised to find fault with the person rather than the structure.

  • Who Is Accountable for What AI Does to Women’s Voices

    When an AI system produces a biased outcome, who is responsible?

    The person evaluating the output will say they are reviewing what the system produces, not what it decides or the rules it uses to decide. The person who built the system will say they implemented the specification they were given. The person who wrote the specification will say they documented the requirements they were given.

    The person who defined what the system was allowed to infer often does not exist. Nobody wrote it down. Nobody was asked to.

    And the people most likely to be harmed by that absence are the least likely to have been in the room, and if they were in the room, the least likely to be listened to.

    The Pipeline Nobody Audits

    Training data reflects the world as it was recorded. And how was it recorded? Mostly by men.

    For example, a large proportion of Wikipedia content is written and edited by men. This means the training data reflects the world as experienced by one group more than others. Not as it should be, and not as it is for everyone.

    This is then amplified by who builds the systems. The people writing specifications are predominantly men. The people defining what “correct” looks like are predominantly men, if those definitions exist at all. The people architecting the systems are also predominantly men.

    This is not an accusation. It is a description of who was in the room.

    Nothing built by people is neutral. The question is not whether assumptions were encoded. The question is whether anyone is accountable for making those assumptions explicit and challenging them.

    Right now, the answer is mostly no.

    What Can Happen When Nobody Looks

    Consider a plausible scenario in financial services. A woman makes a claim. An AI system evaluates it. The system has been trained on historical data, generated in a context where women’s accounts of damage, loss, and harm have often been treated with more scepticism than men’s.

    The system learns patterns of “credibility.” And credibility, in the historical record, has a gender.

    The claim gets downgraded, queried, or denied.

    A human reviews the outcome. They are checking process compliance, not testing for systemic bias. They are not trained to notice it. The system builder delivered what the specification required. The specification reflected what the business asked for. No one defined the evaluation criteria to test whether the system treats women’s claims differently. Nobody defined the boundaries of what the system was allowed to infer about credibility. In most cases, no one even thought about it.

    The bias compounds, quietly, at scale. And nobody signed their name to it.

    When these systems are deployed in high-stakes contexts such as claims assessment, credit decisions, and performance evaluation, the pattern stops being about the individual. It becomes a structural outcome, recorded, repeated, and scaled.

    This is a structural risk, not a hypothetical edge case. It emerges wherever these systems are built without explicit accountability.

    This is not a new pattern. It is an old one, now running at scale.

    The problem is also harder to address than it first appears, not least because the people who could see it most clearly are often the least likely to be listened to.

    Women who raise these concerns are frequently dismissed as emotional, partisan, or lacking objectivity. The same logic that devalues women’s voices in the data also devalues the people pointing to the problem. The system protects itself.

    Where Accountability Has to Sit

    This is a governance question, not just a technical one. The technical community cannot solve it alone, but they are not absolved by implementing what they were told without asking who was missing from the room.

    Before deployment, someone needs to answer: Who defined what the system is allowed to infer? Were the evaluation criteria tested for demographic equity? Who owns outcome auditing, not process compliance, but whether the system produces different results for different groups?

    These questions are not currently required. They are not currently being asked at scale.

    The Accountability Vacuum

    The absence of accountability is not an accident. It is the predictable result of building systems quickly in organisations where the people most likely to be harmed were not in the room.

    Nothing built by people is neutral.

    And in many cases, the people building these systems were not looking for this problem, and were never required to.

  • Human-in-the-Loop Is Not Oversight

    There is a design pattern spreading through automated enforcement systems that deserves more scrutiny than it gets.

    It goes like this. An algorithm makes a decision. A human reviews it. The regulation is satisfied. The accountability box is ticked. And if you happen to be the person who believes they are on the wrong end of that decision, providing documented evidence, a detailed rebuttal, and a legitimate case, you will receive a response that says: We are confident.

    Confident. Not “here is the evidence.” Not “here is what we found.” Confident.

    I have written before about why human-in-the-loop is not a safety strategy. This is what that argument looks like when it moves from principle to practice.


    The promise of human oversight

    Regulation is catching up with automated decision-making. UK GDPR Article 22 establishes the right not to be subject to decisions based solely on automated processing where those decisions produce legal or similarly significant effects. The EU AI Act builds further requirements for human oversight into high-risk AI systems. The policy direction is clear: humans must be in the loop.

    This is the right instinct. Automated systems make errors. They misattribute identity. They produce false positives. They operate at a scale where statistical certainty of error is built into the design. Human oversight exists to catch those errors, to provide the judgment, the contextual reasoning, the capacity to say: the system got this one wrong.

    That is the promise. The practice is something different.


    What human review looks like in operation

    Imagine a platform terminates your account. The reason given is that your account is linked to a previously terminated account. No account is named. No evidence is provided. No linkage methodology is explained.

    You submit a detailed appeal. You attach your personal data obtained through a Subject Access Request. You identify every account that appears in that data, explain each one, and demonstrate that none of them contain a publishing history or a content violation.

    A named human reviewer responds. They have reviewed your response. They are upholding the decision. They are confident.

    That reviewer is the human in the loop. They satisfy Article 22. The decision was not solely automated. A person was involved. The legal threshold is met.

    But ask yourself what that person actually had. Did they have the linkage data? Did they have the evidence used in the original decision? Did they have a defined standard against which to weigh your rebuttal? Did they have genuine authority to reverse the algorithmic recommendation? Were they required to document their reasoning?

    The regulation does not require any of that. It requires a human. The human was provided. The loop is closed.

    The interesting question is not whether this happens. It is why the system is designed so that it can happen.


    The architecture underneath

    This is not accidental. It is structural. And the Terms of Service that govern these platforms make the structure explicit.

    Platforms can terminate accounts when they have “concerns” — no evidence threshold defined, no standard of proof required. Disputes are routed to binding arbitration under the laws of a jurisdiction most affected users cannot practically access. Liability is capped at fees paid in the preceding period, which for a first-time user with no transaction history means the cost of being wrong is, quite precisely, zero.

    Read together, these provisions create a system in which decisions can be made without a defined evidence threshold. The human reviewer has no obligation to share the evidence with you. You cannot challenge what you cannot see. The formal dispute route is inaccessible to anyone without significant resources. And the platform’s financial exposure for a wrongful decision is nothing.

    Platforms are not confirming that the process reached the right outcome. They are confirming that the process ran in a way that satisfies the compliance requirement. Those are not the same statement. One is accountability. The other is an audit trail. We have built regulatory frameworks that require the audit trail and assumed the accountability would follow. It does not follow. It has to be designed in separately, and right now in most cases it isn’t.

    The human in the loop is not there to catch errors. They are there to close the legal exposure that would otherwise exist if the decision were solely automated. Their function is not oversight. It is insulation.


    What genuine human oversight requires

    Human oversight was supposed to be the mechanism that corrects errors. But oversight requires more than a person’s name on the response.

    The reviewer must be able to see the evidence used by the system to reach its decision.

    They must have authority to override the decision.

    If they uphold the decision against a detailed rebuttal, they should explain why, outlining the evidence used by the system.
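
    One way to make those elements concrete is to treat them as required fields of the review record itself. A rough sketch, not drawn from any regulation or platform; the field names are illustrative:

    ```python
    # Rough sketch of a review record that cannot be completed without the
    # elements above. Field names are illustrative, not from any regulation.
    from dataclasses import dataclass

    @dataclass
    class OversightReview:
        decision_id: str
        evidence_seen: list[str]    # the evidence the system used, visible to the reviewer
        rebuttal_points: list[str]  # the claims made in the appeal
        outcome: str                # "uphold" or "reverse" - real authority to change the result
        reasoning: str              # why, point by point, against the rebuttal
        reviewer: str               # a name that carries accountability for the outcome

        def is_complete(self) -> bool:
            """A review with no evidence and no reasoning is a signature, not oversight."""
            return bool(self.evidence_seen) and bool(self.reasoning)
    ```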

    Without those elements, human-in-the-loop becomes something else entirely.

    A procedural step.

    The human is present.

    The regulation is satisfied.

    The decision remains unchanged.

    Human-in-the-loop can be real oversight. But only when the human has the information and authority to change the outcome.


    Why this matters for every oversight requirement being written right now

    This design pattern will not stay confined to platform enforcement. It is the path of least resistance for every organisation required to put humans in the loop by incoming regulation.

    The requirement says: a human must be involved. The compliant implementation says: a human was involved. The gap between those two statements is where accountability goes to disappear.

    If we are serious about human oversight as a governance mechanism — and we should be — then the requirement needs to specify not just the presence of a human but the conditions under which that human can function as a genuine check.

    Without those conditions, human oversight is a label applied to a process that functions identically with or without the human present. The loop exists. The oversight does not.


    The accountability vacuum is a design choice

    I want to be precise about this. The problem is not malice. Most automated enforcement systems are not designed to wrongfully penalise legitimate users. They are designed to operate at scale, to catch bad actors efficiently, and to minimise fraud.

    The problem is that those design goals do not include a feedback loop for cases the system gets wrong. Bad actors absorb wrongful enforcement as a cost of doing business and move on. Legitimate users with everything to lose have no parallel route. They are disproportionately harmed by a system that was not designed to recover from its own errors.

    The accountability vacuum is not a bug that escaped notice. It is the predictable consequence of building enforcement systems without building correction systems alongside them.

    Human oversight was supposed to be the correction system. It can be, but only if it is designed to function as one.

    A name on a response letter is not oversight.

    Confidence is not proof.

    What the human in the loop needs: access to evidence, authority to reverse system-generated decisions, documented reasoning, and accountability for the outcome.

    That is oversight.

    Until regulation specifies those conditions rather than simply requiring a human to be present, the loop will keep closing around nothing.

  • Writing With AI When Your Idea Is Original

    I’ve been writing a personal finance book based on my own method: ClearFlow.

    The system I created deliberately does not track transactions. That isn’t an omission. It’s the core design choice.

    Most personal finance systems are ledger-based — track every transaction, categorize spending, reconcile monthly, and analyse what already happened. ClearFlow works differently. It is built on forward constraints: spending boundaries, daily limits, and save-to-spend buckets. Prospective, not retrospective.

    That distinction isn’t a feature. It is the architecture.

    When I tried to have AI help draft sections of the book, transaction tracking kept appearing in the text. Not once, not occasionally — repeatedly. Even after I removed it. Even after I clarified the structure in detail.

    I assumed the model was misunderstanding me.

    It wasn’t.

    It was doing exactly what it is built to do. If most personal finance systems in its training data include transaction logs, then “personal finance system” and “track transactions” are strongly associated. Open a drafting space — or even ask for feedback — and the model drifts toward that dominant pattern.

    Not wrong. Typical.

    That was the moment something clicked.

    AI generation pulls toward what is common. If you are building something deliberately different, that pull becomes visible very quickly.

    I tried correcting it through prompts.

    “Do not include transaction tracking.”
    “This system does not rely on logs.”

    It would hold for a section or two. Then, as we moved further through the book, the familiar pattern returned.

    That’s when I realised I was working at the wrong level.

    Prompting is conversation. You are trying to steer behaviour with words. But the system’s underlying objective hasn’t changed. It is still optimized toward what is statistically normal. Each time I removed the drift, I was correcting entropy rather than preventing it.

    The problem wasn’t output quality. It was task design.

    The shift came when I stopped asking the model to draft freely and created a Claude Skill that encoded the structural rules of ClearFlow.

    Not stylistic guidance — structural constraints.

    What the system includes.
    What it excludes.
    How decisions are framed.
    What must never appear.
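
    The Skill itself is specific to ClearFlow and is not reproduced here. As a rough sketch of the idea, structural constraints of that kind might be encoded as explicit rules that travel with every drafting or review request, rather than as conversational reminders; the wording below is illustrative:

    ```python
    # Illustrative only: a sketch of encoding structural rules as explicit
    # constraints, so every request carries the boundaries of the system.
    CLEARFLOW_CONSTRAINTS = {
        "includes": [
            "forward spending boundaries",
            "daily limits",
            "save-to-spend buckets",
        ],
        "excludes": [
            "transaction tracking or transaction logs",
            "retrospective categorisation of spending",
            "monthly reconciliation",
        ],
        "framing": "decisions are prospective constraints, not retrospective analysis",
    }

    def build_system_rules(constraints: dict) -> str:
        """Turn the constraint set into non-negotiable rules for the model."""
        includes = "; ".join(constraints["includes"])
        excludes = "; ".join(constraints["excludes"])
        return (
            f"The system is built on: {includes}. "
            f"It must never include: {excludes}. "
            f"Framing rule: {constraints['framing']}."
        )
    ```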

    Once those boundaries were explicit, the behaviour changed. Suggestions to add transaction tracking stopped appearing.

    More importantly, the model became useful in a different way. I could use it to check consistency across chapters, identify terminology drift, test whether examples aligned with stated principles across more than 30,000 words, and surface contradictions I had missed.

    It stopped acting like a co-author and started acting like a validator.

    The ideas remained mine. The architecture remained intact. The model enforced consistency against the structure I had defined.

    That experience changed how I think about AI.

    When something keeps reappearing in the output, the instinct is to improve the prompt. In my case, that wasn’t enough. The issue wasn’t phrasing. It was boundaries.

    Once those existed, I stopped fighting the system. The drift reduced. The work became cleaner. The AI could finally do what it is genuinely good at: systematic comparison and structural checking at scale.

    Prompting persuades. Boundaries constrain.

    When you are building something deliberately different, constraint isn’t restrictive. It is what allows the difference to survive.

    That experience also made something else obvious.

    If I, working on a small, well-defined system, saw drift this quickly, the same dynamic will exist anywhere AI is drafting inside an organization.

    Most operating models, risk frameworks, policies, and architecture documents follow established patterns. Those patterns dominate the training data. If AI is used to generate inside those domains without explicit structural constraints, it will tend to reinforce what is already common.

    That may not be a problem when you are formalizing standard practice.

    It becomes a problem when you are deliberately building something different.

    In those cases, the absence of boundaries doesn’t just create noise. It slowly reshapes the system back toward the norm.

    I learned that the hard way while writing a book.

  • If Not “Human-in-the-Loop”, Then What?

    “Human-in-the-loop” isn’t a safety strategy. In many cases, it’s treated as one.

    So what does a better model look like?

    The shift is from human-in-the-loop as informal reviewer to AI-in-the-loop as structured validator. This isn’t about removing humans. It’s about putting both humans and AI where they’re actually good at something.

    The Right Division of Labor

    Humans are good at creating intent, defining what “good” looks like, making judgment calls, handling exceptions, and taking accountability when things go wrong.

    They are not good at systematic consistency checking across long material, maintaining vigilance over repetitive validation work, or spotting subtle structural contradictions.

    AI, by contrast, is good at rule-based comparison, consistency checking, and repeatable structural validation. It is not good at intent, ethical trade-offs, or accountability.

    The model should be:

    Human creates

    AI validates structure

    Human decides

    Not: AI creates, Human tries to catch everything, Hope it works.

    A Simple Example: Hair Care Advice

    This pattern shows up everywhere, even in casual AI use.

    I asked an AI system for hair care recommendations. I’d already described my hair type and goals. The system responded confidently with suggestions that completely ignored the constraints I’d just given. It defaulted to the generic patterns from its training data instead of my stated context.

    A human reviewer would need to go back, hold all my criteria in mind, compare them to each suggestion, and spot the mismatch. That’s cognitive work that gets skipped when you’re just scanning for “does this look reasonable?”

    So I changed the task structure.

    Instead of asking AI to generate recommendations, I selected candidate products myself and asked: “Does each one meet these specific criteria?”

    The recommendations aligned precisely with the criteria I had given.

    The AI wasn’t generating freely. It was validating against defined boundaries. That’s a much more reliable task.
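
    The reframing is small but structural. As a sketch, with illustrative criteria and product names, the task changes from open-ended generation to checking supplied candidates against fixed criteria:

    ```python
    # Sketch of the reframed task: the user supplies candidates and criteria,
    # and the model is asked only to validate. Names and criteria are illustrative.
    CRITERIA = [
        "suitable for fine, colour-treated hair",
        "contains no sulphates",
        "works with air-drying rather than heat styling",
    ]

    CANDIDATES = ["Product A", "Product B", "Product C"]  # chosen by the user

    def validation_prompt(candidates: list[str], criteria: list[str]) -> str:
        """Build a prompt that asks for a per-criterion check, nothing more."""
        criteria_text = "\n".join(f"- {c}" for c in criteria)
        candidates_text = "\n".join(f"- {p}" for p in candidates)
        return (
            "For each product below, state for every criterion whether it is met, "
            "not met, or unknown. Do not suggest other products.\n"
            f"Criteria:\n{criteria_text}\n"
            f"Products:\n{candidates_text}"
        )
    ```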

    A Complex Example: Writing a Book

    I experienced this at scale while writing a book on personal finance systems. After drafting 30,000+ words across multiple chapters, I needed to check whether the framework stayed consistent throughout.

    Questions like:

    • Did chapter 7’s advice contradict chapter 3?
    • Had terminology shifted between sections?
    • Were the seven core principles applied consistently across different scenarios?
    • Did examples align with the stated framework?

    A human reviewer (me, or a beta reader) could catch obvious contradictions. But systematic consistency checking across an entire book? That’s exactly what AI should validate.

    I used AI to check:

    • Framework consistency across chapters
    • Terminology drift
    • Whether examples aligned with stated principles
    • Whether reasoning remained coherent throughout

    The AI flagged inconsistencies I’d missed. Not because I was careless, but because holding an entire book’s logic structure in working memory while writing is cognitively impossible.
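
    That kind of check can be run systematically rather than from memory. A rough sketch of the shape of it, not the exact prompts used for the book; the file layout and wording are illustrative:

    ```python
    # Sketch of chapter-by-chapter consistency checking against a fixed framework.
    # The file layout and prompt wording are illustrative.
    from pathlib import Path

    PRINCIPLES = Path("principles.md").read_text()            # the stated framework
    CHAPTERS = sorted(Path("chapters").glob("chapter_*.md"))  # illustrative layout

    def consistency_prompt(principles: str, chapter_text: str) -> str:
        """Ask for contradictions, terminology drift, and misaligned examples."""
        return (
            "You are checking a draft against a fixed framework. Do not rewrite it.\n"
            f"Framework:\n{principles}\n\n"
            f"Chapter:\n{chapter_text}\n\n"
            "List: (1) statements that contradict the framework, "
            "(2) terminology that differs from the framework's terms, "
            "(3) examples that do not follow the stated principles."
        )

    for chapter in CHAPTERS:
        prompt = consistency_prompt(PRINCIPLES, chapter.read_text())
        # send `prompt` to the model of your choice and review what it flags
    ```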

    What This Means for Organizations

    The pattern is the same whether you’re validating hair care advice or enterprise documents:

    Generation is open-ended. The space of possible outputs is wide and loosely constrained. Human review becomes effortful, inconsistent, and prone to omission.

    Validation can be structured. The task becomes rule-based, repeatable, and scalable. Humans respond to specific flags rather than scanning everything.

    Organizations building AI systems need explicit validation architecture where AI is used to:

    • Check consistency across outputs
    • Compare decisions against defined rules or constraints
    • Detect drift, contradiction, or anomaly
    • Flag potential bias patterns
    • Maintain traceability between inputs, reasoning, and outcomes

    Humans remain responsible for judgment and accountability. But they’re no longer acting as general-purpose error detectors hoping to catch whatever slips through. They’re making targeted decisions informed by structured checks.
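
    In code terms, the shape of that division of labour might look like this, as a sketch with illustrative names and types: structured checks run automatically, produce specific flags, and the human decides on the flags rather than rereading everything.

    ```python
    # Sketch of "AI validates structure, human decides". Names are illustrative.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Flag:
        check: str     # which structured check raised it
        location: str  # where in the output it was found
        detail: str    # what the check observed

    Check = Callable[[str], list[Flag]]

    def run_structured_checks(output: str, checks: list[Check]) -> list[Flag]:
        """Apply every rule-based check to the output and collect the flags."""
        flags: list[Flag] = []
        for check in checks:
            flags.extend(check(output))
        return flags

    def human_decision(flags: list[Flag]) -> str:
        """The human responds to specific flags instead of scanning everything."""
        if not flags:
            return "approve"
        # Judgment, exceptions, and accountability stay with the person.
        return "hold for review of flagged items"
    ```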

    Common Mistakes

    The failure modes are predictable.

    Treating validation as optional quality checking rather than core architecture. Validation gets bolted on later, if at all.

    Building generation without thinking about the validation layer. The question “how will we know if this is right?” comes too late.

    Assuming humans will catch what AI misses. That is the structural weakness in many HITL implementations. Humans are systematically bad at the kind of checking that AI excels at.

    Validating outputs without validating reasoning. An output might look correct but the reasoning that produced it could be flawed. Both need checking.

    No traceability. If you can’t trace from input through reasoning to output, you can’t validate properly and you can’t assign accountability when things go wrong.

    The Real Shift

    Organizations keep asking: “Where should we put a human to review the AI?”

    This is the wrong question.

    The right question: “How do we architect validation into the system so it’s repeatable, auditable, and doesn’t depend on someone staying vigilant?”

    That’s not a tooling question. It’s an operating model question.

    If AI is used only to generate and humans are left to review informally, you get:

    • Inconsistent quality
    • Hidden errors
    • Untraceable reasoning
    • Compliance risk disguised as compliance process

    If AI is used for structured validation with humans responsible for judgment and accountability, you get:

    • Systematic quality checking
    • Surfaced errors and edge cases
    • Traceable reasoning
    • Actual governance, not theatre.

    Human-in-the-loop is a reassurance phrase.

    AI-in-the-loop for structural validation is system design.

    The organizations that understand this difference will be the ones where AI becomes reliable and scalable, not just impressive in demos.


    Questions to Ask About Your AI Systems:

    What are you using AI to validate, not just generate?

    When AI produces output, what structural checks run automatically before a human ever sees it?

    If an AI-assisted decision goes wrong, can you trace the reasoning that led to it?

    What happens when human reviewers get tired, distracted, or stop paying attention after six months?

    If you can’t answer these questions, you’re hoping your AI systems work reliably. You’re not designing them to.

  • Human-in-the-Loop Is Not a Safety Strategy

    “Human in the loop” sounds like a corporate shorthand for AI safety.

    But it’s not a safety strategy. It’s a hope strategy: hoping that by keeping humans in the loop with AI, it will be safe to use. Hoping that someone catches the error before it ships is not how high-stakes systems should be designed.

    What Does “Human-in-the-Loop” Actually Mean?

    Ask ten organizations how they implement human-in-the-loop and you’ll get ten different answers, if not eleven.

    Some describe collaborative work: humans and AI iterating together, each contributing what they do best. Others mean active oversight: humans making key decisions while AI handles routine processing. Many mean review: AI produces output, humans check it, correct it, before approving it.

    The problem isn’t the variety of definitions, although that is a problem. The real problem is that when compliance frameworks require “human oversight,” when regulations mandate human review of AI decisions, and when organizations need to demonstrate “responsible AI”, the implementation is highly likely to default to the simplest, most auditable form.

    Someone checks the output before it ships.

    This happens because:

    Review work is easy to audit. You can count reviews completed, track approval rates, measure time-per-review. It produces the metrics compliance needs.

    Collaborative work is hard to document. How do you verify that a human “worked with” AI meaningfully? What does that look like in a compliance report?

    Review scales more easily. You can distribute review work across many people with minimal training, and ensure that the same output is reviewed by more than one person. Collaborative work between human and AI requires domain expertise and judgment.

    I don’t have direct evidence yet of organizations implementing HITL as repetitive review work purely to satisfy compliance requirements.

    But I’d be surprised if it isn’t already happening or about to happen.

    Because when requirements say “human oversight” but don’t specify what meaningful oversight looks like, organizations are likely to follow the path of least resistance.

    The gap between “collaborative human-AI work” and “someone reviews the output” is where the safety strategy fails.

    The Real Problem: We Are Likely Putting Humans in the Wrong Place

    The standard HITL implementation seems to look like this, whether by design or by drift:

    AI produces → Human reviews → Output is approved

    This treats humans as a final safety filter.

    The problem with human-in-the-loop isn’t that humans are in AI systems. It’s that we’ve designed the wrong role for them.

    We placed humans at the end of the process, doing open-ended review, and expected them to provide consistency, bias correction, and governance through attention alone.

    To me this model relies on assumptions about humans that don’t hold in practice. That’s asking humans to do exactly the things they’re structurally bad at:

    • Large-scale consistency checking
    • Rule enforcement across long material
    • Spotting subtle distributional bias
    • Maintaining vigilance over time

    Three Assumptions That Break

    Assumption 1. Humans are reliable error detectors

    Reviewing is not the same cognitive skill as producing.

    I’ve written hundreds of business requirements and many architecture documents over 25+ years. The feedback pattern is pretty consistent. In most cases reviewers catch formatting issues, query specific details, challenge individual statements.

    What they rarely catch: what’s missing entirely.

    The gap that would derail implementation six months later. The scenario no one thought to ask about. The dependency that wasn’t documented because everyone assumed it was obvious.

    Finding what’s absent requires work that is cognitively expensive and time-consuming. Most reviewers don’t do it, not because they’re careless, but because it’s not what human brains are optimized for. It also demands deep domain knowledge that they might not have.

    Quality control research from manufacturing offers a sobering benchmark. Trained human inspectors, doing repetitive physical inspection work, typically catch 80-85% of defects. That means even in optimized conditions, 15-20% of errors slip through.

    These are trained inspectors looking for known defect types in physical products. AI review is likely cognitively harder. Reviewers aren’t trained quality inspectors. They’re doing open-ended validation of complex outputs, expected to catch not just errors, but subtle inconsistencies, missing information, and bias.

    If trained inspectors miss 15-20% of physical defects, what’s the error rate for (un)trained reviewers checking AI-generated documents for logical contradictions and structural issues?

    The same limitations are likely to show up in AI review.

    Reviewers are more likely to notice tone problems, flag awkward phrasing, check factual claims they already know about. But spotting that section 3 contradicts section 7, or that the framework quietly shifted assumptions halfway through? That requires holding the logic the document is trying to express in active memory while comparing it systematically.

    Humans aren’t built for systematic consistency checking. Systems are.

    When AI output is reviewed by humans, what gets checked is surface-level correctness — the things that are easy to see. What slips through is structural inconsistency — the things that require systematic comparison.

    Assumption 2. Humans are neutral judges

    There’s an implicit belief that humans will correct bias in AI outputs.

    But humans carry their own biases — and they’re often invisible to us.

    I’ve been consistently referred to as “he” by AI systems because I discuss personal finance, business strategy, or economics. My male colleagues were surprised when I mentioned this. None of them had ever been misgendered by AI.

    The bias was invisible to them because it aligned with their expectations. Technical and financial expertise still maps to male, so when they are addressed as “he” it doesn’t register as a choice the AI made based on a model with inherent bias.

    I’ve also watched AI consistently soften women’s voices when they report domestic violence or raise concerns about male behavior. The content stays factually similar, but the framing shifts. Assertive statements become tentative. Direct concerns become qualified worries. This means women’s voices are discredited by the AI.

    Many human reviewers would miss both patterns. Not because they approve of gender bias, but because the outputs align with deeply embedded cultural expectations about who has authority in which domains and how women should speak about difficult topics.

    When both the training data and the reviewers share similar blind spots, a human in the loop doesn’t correct bias — it reinforces it.

    HITL can’t fix what humans can’t see.

    Assumption 3. Humans can sustain vigilance over time

    HITL assumes sustained vigilance over time.

    But reviewing AI output violates everything we know about what motivates humans to do work well.

    Research on human motivation identifies three elements that drive sustained performance: autonomy (control over how you work), mastery (getting better at something meaningful), and purpose (understanding why it matters). For more see Daniel Pink’s book Drive.

    Review work provides none of these.

    Autonomy: The reviewer didn’t create the output. They can only approve, reject, or correct what someone else, or something else, produced. They have no control over the quality of what arrives for review.

    Mastery: What does it mean to get better at reviewing AI output? The skill isn’t building toward expertise. It’s maintaining vigilance over repetitive material. There’s no growth trajectory. And how do you learn something when you aren’t the one doing the work?

    Purpose: The implicit message is “check that the AI didn’t mess up.” That’s not a purpose. That’s a liability shield.

    Content moderation research demonstrates what happens when review work lacks these motivational elements. Studies of Facebook and Reddit moderators show consistent patterns: burnout, emotional exhaustion, and apathy are common, even among volunteers who initially cared deeply about their communities.

    Commercial content moderators, people paid to review flagged material, report even worse outcomes. Microsoft and Facebook have faced lawsuits from moderators who developed PTSD from the work. Research comparing content moderators to first responders and police analyzing child exploitation material found comparable psychological impacts.

    This isn’t about the disturbing nature of content moderation specifically. It’s about what happens when human work is reduced to validating system output without autonomy, mastery, or purpose.

    AI review likely follows the same pattern. Initial scrutiny is careful. Within months, it becomes perfunctory. The job hasn’t changed. The motivation has collapsed.

    Designing a system that depends on continuous human vigilance isn’t a safety strategy. It’s hoping people won’t get bored, burned out, or detached.

    Why This Matters

    These three limitations aren’t edge cases. They’re fundamental to how review work functions.

    Humans miss structural errors because finding what’s missing requires cognitive work that is expensive and hard to do well.

    Humans often miss bias because the outputs align with cultural patterns they’ve internalized. When both the training data and the reviewers share similar blind spots, the loop reinforces bias rather than removing it.

    Humans lose vigilance because review work offers no autonomy, no mastery, and no purpose. Even people who start out motivated, whether volunteers who care about their communities or professionals committed to quality, are likely to experience declining attention over time.

    HITL treats review as a safety mechanism.

    But humans don’t catch structural errors. They don’t see their own biases. And their attention degrades over time when the work lacks meaningful motivation.

    The problem isn’t that humans are involved. The problem is that we’ve designed a role humans can’t perform reliably at scale.

    Questions organizations should ask themselves when designing human-in-the-loop, AI-enabled systems

    If you’re implementing human-in-the-loop for AI systems, these are the questions that matter:

    What error types are humans actually catching?

    – Can you distinguish between surface-level corrections (typos, formatting) and structural issues (missing information, logical contradictions)?

    – Are you measuring what reviewers miss, or only what they flag?

    What biases are reviewers systematically missing?

    – How are you testing whether reviewers share the same blind spots as the training data?

    – What happens when bias aligns with reviewer expectations so closely that the bias in the AI output becomes invisible?

    How is review quality changing over time?

    – Do you have baseline data from early reviews to compare against current performance?

    – Are approval rates increasing while error rates stay constant, or are you not measuring both? (One way to track both is sketched after this list.)

    What motivates reviewers to maintain vigilance?

    – Does the work provide autonomy, opportunities for mastery, and a clear purpose?

    – Or is it repetitive validation that erodes motivation over time?

    What does “human oversight” mean in your implementation?

    – Is it collaborative work where humans and AI contribute different strengths?

    – Or is it review queues where someone checks output before approval?

    – If you can’t specify exactly what meaningful oversight looks like, you probably don’t have it.
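    As a concrete illustration of the approval-rate question, here is a minimal sketch of what answering it with data might look like. The record fields (reviews per month, approvals, errors found downstream) are hypothetical placeholders; substitute whatever your review tooling actually logs.

    ```python
    # Minimal sketch: comparing approval rates with downstream escape rates over time.
    # All field names and figures are hypothetical, not a real tool or dataset.

    from dataclasses import dataclass

    @dataclass
    class ReviewStats:
        month: str
        reviews: int             # items that went through human review
        approvals: int           # items the reviewer approved
        errors_found_later: int  # approved items where a defect surfaced downstream

        @property
        def approval_rate(self) -> float:
            return self.approvals / self.reviews

        @property
        def escape_rate(self) -> float:
            return self.errors_found_later / self.approvals

    history = [
        ReviewStats("2025-01", reviews=120, approvals=96, errors_found_later=5),
        ReviewStats("2025-06", reviews=130, approvals=122, errors_found_later=11),
    ]

    for s in history:
        print(f"{s.month}: approval {s.approval_rate:.0%}, downstream escape {s.escape_rate:.0%}")
    ```

    A rising approval rate paired with a rising escape rate is the signature of vigilance decay: reviewers approve more while catching less. Without both numbers, neither trend is visible.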

    If you can’t answer these questions with data, you likely don’t have a safety strategy for your AI-enabled work. You have a compliance checkbox that makes you feel protected without actually designing for the limitations it claims to solve.

    HITL isn’t inherently wrong. But treating it as a safety strategy, something that will catch all the errors the AI makes, without understanding what humans can and cannot do reliably in review roles, is hoping that vigilance, attention, and bias detection will somehow emerge from a system designed to undermine all three.

    Hope doesn’t scale.

    Neither does human-in-the-loop as it is currently being discussed and, potentially, implemented.

  • Original Thinking in the AI Era

    There’s a common claim that in the age of AI, original thinking no longer matters, that everything worth saying has already been said, and machines can now say it better.

    AI is very good at synthesis. It can summarise, connect, and re-express existing knowledge at scale. Best practices are now cheap. Competent execution is table stakes.

    But synthesis is not origination.

    Original thinking doesn’t come from summarising and synthesising what already exists on the web, in books, or in training data. It comes from lived experience, systematic experimentation, and pattern recognition across domains that don’t usually speak to each other.

    My interests sit at these intersections.

    Whether in enterprise architecture, personal finance, health, or productivity, the same pattern keeps repeating: systems are designed for averages, and built on assumptions we no longer even see and therefore don’t question. The only way to see that clearly is to have skin in the game, gather your own data, and be willing to challenge orthodoxy. To return to first principles and question what we’ve learned to take for granted.

    AI can help articulate insights once they exist. It can help structure thinking, test language, and scale communication.

    What it can’t do is:

    • have lived experience
    • generate new data through personal experimentation
    • notice patterns across a specific combination of domains
    • challenge established assumptions

    In other words, AI can amplify original thinking — it can’t create it.

    As generic synthesis becomes abundant, original insight becomes rarer, more valuable, and more interesting by contrast. The bottleneck isn’t access to information anymore. It’s the willingness to do the work that produces something genuinely new.

    That’s the kind of thinking I’ll be documenting here.

    — Charlotte