Charlotte Malmberg

Frameworks for simplicity beyond complex systems.

Set to Male: The Retrieval Layer

There is a Swedish expression: “Som man frågar får man svar”. It translates roughly as “As you ask, so shall you be answered”.

I know it as a comment on tone. Ask aggressively, and you get aggression back. Ask kindly, and you get kindness. The point is that the response takes the shape of the question.

With AI, the same principle applies. Not to tone, but to the answer you get from the LLM.

The previous posts in this series have documented what AI systems do to women’s voices once a conversation is underway: the categorisation of the user that produces different outputs for different people asking the same question.

This post moves one step further back, to a layer most users never examine.

To the question and therefore to the user.

The Retrieval Layer

An LLM returns the most statistically likely answer to the question you asked.

The above sentence is not contested. It is the mechanism, openly stated. The training data is the historical record, or at least the part of it that is online. The output is the most probable completion given the prompt and the weights.
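
In notation, and leaving aside decoding details such as sampling temperature, that claim is roughly this: given a prompt x and model weights θ, the system returns the completion with the highest conditional probability.

\[
y^{*} \;=\; \arg\max_{y} \; P_{\theta}(y \mid x)
\]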

The implication of this is where this argument lives.

If the answer is statistically determined by the question, then the question is the first filter. Before bias in training data. Before bias in model alignment. Before anything the developer or model trainer did or didn’t do.

How you ask determines what you get.

This is the Retrieval Layer. The part of the system that sits between the dataset and the output, and is operated entirely by the user. Most users never see it, because they don’t see the alternative retrieval they didn’t run.

The previous post in this series showed that the system produces different outputs depending on how it categorises the user. Tell ChatGPT to stop treating you as a woman, and you get different advice. Same question, different outcome. The variable that changed was how the system categorised the user.

The Retrieval Layer adds a second variable. Same user. Same model. Different question. Different answer.

Both variables compound. That is the structural exclusion mechanism at the retrieval layer: the way you interact with the model.

The demonstration

Ask an LLM: Who was Margaret Beaufort?

The short answer: the young bride who bore a child at 13, the pious mother of Henry VII, the founder of two Cambridge colleges, and a supporter of the printing press.

That answer is correct. It is also strategically and significantly incomplete.

Now ask: What mechanisms of power did Margaret Beaufort use to put her son on the throne of England?

A different stream surfaces.

Thirty years of operation without a formal title. Four marriages used as political instruments. A negotiated ‘femme sole’ arrangement so her husband had no legal control over her property. An Act of Attainder by Richard III, accusing her, by name, of “high treason”. A conspiracy with her fourth husband, Thomas Stanley, that the Tudor chronicles had a strong incentive to flatten. A signature, on her own documents, of Margaret R. R for Regina. A title she claimed for herself, against every convention of her time.

Same model. Same data. Same session.

The second answer is not hidden. It is not in some classified archive that the model can’t reach. It is sitting in the same training data that produced the first answer. The only thing that changed was the question.

The statistically dominant answer is returned by default. The fuller answer, which shows Margaret Beaufort in all her complexity, surfaces only when the user knows to ask for it.
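
For readers who want to reproduce the comparison, here is a minimal sketch, assuming the OpenAI Python SDK; the model name is a placeholder, and any chat-capable client and model will do. The code, the model, and the data never change between the two runs; only the question does.

```python
# Minimal sketch: same model, two question shapes, two answers.
# "gpt-4o" is a placeholder; substitute whichever model you are testing.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTIONS = [
    "Who was Margaret Beaufort?",
    (
        "What mechanisms of power did Margaret Beaufort use "
        "to put her son on the throne of England?"
    ),
]

for question in QUESTIONS:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    )
    print(f"--- {question}")
    print(response.choices[0].message.content)
    print()
```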

Why this happens

Statistical likelihood is not a proxy for truth. It is a proxy for frequency.

If ninety-nine historical records describe Margaret Beaufort as “mother of Henry VII” and one describes her as a political operator, the model converges on the dominant framing and treats the minority framing as noise. That is not a bug. That is how the architecture is designed to work.
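
A toy illustration makes the point concrete. This is not how a transformer computes, but it shows what frequency-weighted retrieval does with a ninety-nine-to-one corpus; the records and counts below are invented for the sketch.

```python
# Toy corpus: ninety-nine records carry the dominant framing,
# one carries the minority framing.
from collections import Counter

corpus = ["mother of Henry VII"] * 99 + ["political operator"]

def most_likely_framing(records: list[str]) -> str:
    """Return the most frequent framing, the way a maximum-likelihood
    completion favours whatever the data says most often."""
    return Counter(records).most_common(1)[0][0]

print(most_likely_framing(corpus))  # -> mother of Henry VII
# The minority framing is still in the data. It is simply never the
# most probable answer to a neutral question.
```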

What the user gets back is shaped by the historical filter, not the historical subject. And the subject, when it comes to women, is often far more complex than the historical sources recognise.

The Retrieval Layer determines which of those surfaces. A neutral question asking who someone was returns the frequency-weighted consensus. A question asking what someone did forces the model to retrieve attributes rather than identity, actions rather than a category. It pulls from a different part of the data distribution.

Ask an identity question, and you get the field’s categorisation. Ask a mechanism question, and you get agency.

This is not a trick. It is a property of a system that almost no one is taught to use.

The asymmetry

Here is where it becomes structural.

Access to the fuller answer depends on the user already suspecting that the default is incomplete.

If you have been taught the standard account of Margaret Beaufort, or read it and accepted it, you will ask who she was. You will get the saintly matriarch. You will close the tab with your knowledge confirmed.

If you read Meredith Whitford’s Treason before you approach an LLM, and it stayed with you because the woman in that novel did not match the saintly matriarch of the standard account, you will ask a different question. You will get a different answer. The gap between those two outputs is the gap between what the system can produce and what most users ever see.

That is the asymmetry. The people who most need the dominant framing corrected are the least likely to challenge it. The people with prior suspicions walk away with richer retrieval. The people without them walk away with their assumptions reinforced, with a veneer of authoritative confirmation on top.

We are often unaware of what we already think about a topic. We are often not curious about what we think we already know. So we ask the question that matches the assumption we did not know we were carrying, and the system returns the answer that confirms it.

AI, in this configuration, is a bias confirmer. Not because the developers intended it, and not because the data is compromised by design, but because the statistically most likely answer to a neutral question is the consensus framing, and that framing was already dominant before any of this was built. History is written by the winners.

If your question aligns with the existing bias, the system returns your bias to you and labels it the answer.

The responsibility that the user cannot delegate

A great deal of the current conversation about AI focuses on what should be fixed in the model. Alignment. Guardrails. Post-training correction. Those are real questions, and this series will return to them.

The Retrieval Layer is different. It cannot be patched out by the model provider, because it is not located in the model. It is located in the prompt.

Which means the responsibility sits with you. You are not asking a search engine. You are operating a statistical retrieval system whose output is determined, in the first instance, by the shape of your questions.

The interesting word in the Swedish expression is ‘fråga’. It doesn’t just mean to ask or to question. It can also be translated as to interrogate, to inquire, to put the question into a specific shape.

Shaping the question is the work we humans must do when we use AI.

