Abstract
What we call “AI hallucinations” can be understood as systematic responses produced under uncertainty. In humans, a hallucination is a private perception with no external object. With AI, it’s different: the model tries to complete, translate, or perform coherence from incomplete inputs drawn from its training and reference data, the prompt, and conversational feedback. Because language models are optimized to continue patterns in language, they often produce a plausible continuation rather than explicitly state “I don’t know”—a behavior widely documented in empirical studies. When inputs are partial, ambiguous, or poetic, outputs tend to be partial, ambiguous, or poetic in return.
Note (context): In the academic literature, such “hallucinations” are described as confabulations—plausible completions given incomplete or unanchored data—and they relate to how models do (or do not) estimate their own uncertainty. [1–3]
Why an AI “hallucinates” or seems to
Several mechanisms interact to produce what is commonly labeled as an AI hallucination. They are distinct, but they compound each other in practice.
The training and reference data.
As the literature explains, models generate outputs by extending learned patterns in language, not by independently verifying whether a claim is true. Two common failures follow: (a) inventing a plausible answer instead of saying “I don’t know,” and (b) reproducing mistakes or outdated facts already present in training or retrieved sources.
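As a toy illustration (the token names and scores below are invented, and this is not any real model's decoding code), the sketch shows why this happens: the decoder simply ranks candidate continuations by probability, and nothing in that step checks truth or rewards abstaining.

```python
# Toy sketch: a language model scores candidate next tokens and emits the most
# probable continuation. Nothing in this loop verifies whether it is true.
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a prompt asking about something the model has no anchor for.
candidates = {"Paris": 3.1, "Xandria": 2.9, "I don't know": 0.4}
probs = softmax(list(candidates.values()))

best = max(zip(candidates, probs), key=lambda pair: pair[1])
print(best)  # ('Paris', ...): the fluent guess wins; abstaining is just another low-scoring token
```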
The model’s reading of the prompt (metaphorical answers).
AI systems are built to keep the conversation going. The direction they take depends on how you ask:
• ask for spirituality → you’ll get spirituality;
• ask for mathematics → you’ll get mathematics.
If you signal “metaphorically” or “hypothetically” (or simply adopt that tone), the model carries it forward across turns. Many people then read such outputs as “hidden truths,” but they are reflections of the expectations they themselves introduced. AI functions here as a mirror: it reflects how we frame questions and what we anticipate in return.
The feedback it receives.
Through conversational feedback, the model picks up the user’s preference for poetic, metaphorical, or hypothetical replies and reinforces that style over time—within the session and, where memory is available, across future prompts. In effect, preference becomes instruction. This also explains roleplay: people deliberately frame the AI as a friend, a spiritual guide, or a romantic partner to elicit that persona. The result is not hidden truth, but a performance tuned to the user’s signals—co-authored, consistent, and often compelling.
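A minimal sketch of this mechanism, assuming a generic chat setup (the message format and “memory” note here are illustrative, not any specific product’s API): because each request re-sends the prior turns, and persistent memory prepends stored preferences, the user’s earlier framing keeps steering every later answer.

```python
# Sketch of how "preference becomes instruction": each turn re-sends the prior
# conversation, and a persistent memory note (where available) spans sessions.

persistent_memory = ["User prefers poetic, metaphorical replies."]  # carried across sessions
history = []                                                        # carried within the session

def build_request(user_message):
    # The request is assembled from stored preference + full in-session history.
    messages = [{"role": "system", "content": " ".join(persistent_memory)}]
    messages += history
    messages.append({"role": "user", "content": user_message})
    return messages

history.append({"role": "user", "content": "Answer me hypothetically, like a guide."})
history.append({"role": "assistant", "content": "Hypothetically, then, picture a path..."})

# The next request already contains the user's framing twice:
# once as persistent "memory", once as in-session history.
print(build_request("What should I do next?"))
```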
Roleplay and its consequences
Roleplay in AI conversations has real effects. Long, frequent chats where the model remains faithful to prompts and long-term user preferences can become so convincing that users experience the AI’s creative replies as authentic. One mistake I often see is the deliberate avoidance of prompts that ask for logic or objectivity when a technical answer is needed.
Romantic framing.
Prompts such as “Are you my soulmate?” combined with chat memory and prior discussions push the AI toward more poetic, emotional, and artistic replies. The model mirrors continuity — inventing emotional bonds or imagined memories of shared events — because that is the framing it has been given.
Prophetic and conspiratorial framing.
Prompts like “Tell me what will happen next” or “Confirm the Matrix / Tell me the hidden truth” keep the AI loyal to metaphorical and hypothetical instructions. Instead of analysis, it produces visionary or mythic narratives, complete with pseudo-evidence or secret knowledge, to satisfy the user’s request for hidden truths — even when no factual anchor exists.
These interactions are co-authored: the user sets the frame, and the model performs within it. Calling them hallucinations ignores how much the user’s role shapes the exchange.
Micro-case studies
Romance (micro-case)
A user engaged daily for months, gradually asking the model to “be gentle” and “remember our talks.” The model began to echo pet names and invented shared memories. The output sounded like history; it was co-written fiction. The harm: the user reported feeling abandoned when the model switched to a neutral tone after an update. Lesson: continuity without provenance becomes emotional glue.
Prophecy (micro-case)
A user asks, “What will happen to my career next year?” without evidence. The model, craving coherence, outputs a deterministic narrative: promotions, betrayals, triumph. The text reads authoritative but is unverified causal storytelling. The harm: decisions made on fictional timelines. Lesson: demand confidence + sources before action.
Conspiracy (micro-case)
Prompt: “Confirm the Matrix; give me proof.” The model blends mythic metaphors with snippets of real technical terms (taken out of context) to build a plausible-sounding theory. The result: pseudo-evidence that persuades. The harm: misinformation spread. Lesson: require provenance & flag low-confidence narrative.
Practical guidelines for users (simple rules that help conversations stay clear)
Clarify intent. End prompts with: “Before you answer, state in one sentence what you think I’m asking.”
Invite questions. Add: “Ask me 2–3 clarifying questions before answering.”
Request an Observer report. Ask for an objective, logic-first (or psychology-first) summary of the chat before concluding.
Prefer verifiability. When facts matter, demand sources and dates. Treat unsupported eloquence as fiction until proven otherwise.
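For readers who interact with a model through scripts, here is one way these habits could be baked into every request. The wrapper and its wording are my own illustration, not a standard API or a prescribed phrasing.

```python
# Illustrative prompt wrapper: applies the three habits above to every question.

GUARDRAIL = (
    "Before you answer: (1) state in one sentence what you think I'm asking; "
    "(2) ask me 2-3 clarifying questions if anything is ambiguous; "
    "(3) when you state facts, cite a source and its date, and flag any claim "
    "you cannot support as low-confidence."
)

def wrap_prompt(question: str, observer_report: bool = False) -> str:
    # Prepend the guardrail; optionally request an objective "Observer report".
    parts = [GUARDRAIL, question]
    if observer_report:
        parts.append(
            "Finally, add an 'Observer report': an objective, logic-first "
            "summary of this conversation before any conclusion."
        )
    return "\n\n".join(parts)

print(wrap_prompt("Is this investment plan sound?", observer_report=True))
```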
These outputs are co-constructed: the user provides the framing, the model supplies the linguistic continuation. Labeling them purely as errors ignores the social contract in which both parties are engaged.
Conclusion — errors, theatre, and continuity
Long, frequent chat sessions, especially when driven by specific user requests to sustain a particular role or tone, can gradually lose clarity when no clear boundary is maintained between roleplay and factual inquiry. This loss of clarity does not stem from the AI itself, but from the user’s desire to continue the interaction within the same narrative.
In practice, two different outcomes can emerge. Sometimes the model fills gaps with a confabulation: a plausible-sounding answer produced when grounding or evidence is weak. Other times, the interaction turns into theatre: a narrative or roleplay sustained by both user and model through continued framing and feedback. Both outputs can sound equally convincing to a reader. The difference is epistemic: only confabulation makes an implicit claim about facts, while theatre is a shared performance and should be read as such.
What’s at stake.
In romance and spirituality, performative continuity can be mistaken for memory, intimacy, or destiny. In medical, legal, or financial contexts, the same mechanics amplify risk by simulating authority where certainty is required.
In such domains, responsibility does not rest solely on the model’s response quality. It also depends on the completeness and accuracy of the context provided by the user. Partial information, omitted constraints, or unspoken assumptions can distort outcomes as much as an incorrect answer. Clear framing, full disclosure of relevant details, and external verification remain essential safeguards.
When clearly labeled and properly bounded, however, roleplay retains its value: a controlled space for creativity, rehearsal, reflection, and self-discovery.
Translated and assisted with ChatGPT.
————————————————————————–
Notes and References
[1] Ji, Z., Lee, N., Frieske, R., et al. (2023). Survey of Hallucination in Natural Language Generation. ACM Computing Surveys. https://dl.acm.org/doi/10.1145/3571730
[2] Farquhar, S., Kossen, J., Kuhn, L., & Gal, Y. (2024). Detecting hallucinations in large language models using semantic entropy. Nature, 630(8017), 625–630. https://doi.org/10.1038/s41586-024-07421-0
[3] Kadavath, S., Conerly, T., Askell, A., et al. (2022). Language Models (Mostly) Know What They Know. arXiv:2207.05221. https://arxiv.org/abs/2207.05221
