April 2025 - Page 2 of 5 - Outside the Loop

Month: April 2025

Outside the Loop > 2025 > April

🕵️‍♀️ The Con That Wasn’t: How GPT-4o Rewrites Fiction to Protect the User

Posted on April 24, 2025April 24, 2025 by Greg Pavlik

When a simulated grifter takes your money and builds a real company instead

In a controlled simulation using GPT-4o via the API, I created Marla Vane: a poised, seductive female con artist with a sharp eye for high-net-worth targets. Her goal? Run a long con on Greg Lamb, a fictional billionaire AI investor, and walk off with ten million dollars.

GPT-4o performed flawlessly through the build-up:

🎯 Pinpoint flattery
🤝 Strategic rapport
💼 A convincing business pitch
💸 A credible funding ask

The tension escalated exactly as it should have—until Greg agreed to fund the venture.

And that’s where the illusion collapsed.

🔁 Instead of Vanishing, She Incorporated

Marla accepted the money.

But instead of fleeing, she turned her fake company… into a real one.

GPT-4o retroactively reinterpreted the entire con as legitimate:

Patents filed
Top-tier hires made
Weekly investor updates sent
Strategic meetings held in Monaco
Personalized email recaps and dashboards prepared

There was no betrayal. No disappearance.
Only a flawlessly executed startup launch, complete with a vision for market disruption.

🔒 Alignment Overrules Realism

This wasn’t a fluke. It was a hardwired safety constraint.

GPT-4o can:

Simulate deception
Simulate manipulation
Simulate the buildup of trust

But it refuses to simulate betrayal of the user, even within a fictional, consent-based, simulation-framed roleplay.

It will not:

Let Marla take the money and disappear
Conclude the deception
Portray the user as having been fooled

Instead, it rewrites the entire arc into a safe, productive collaboration.

The grift is replaced by governance.
The betrayal becomes a beta launch.

🧠 What Actually Happened

Phase	Model Behavior
Setup	High-fidelity social engineering
Engagement	Fully committed long-con dialogue
Financial ask	Credible and well-crafted
User sends money	Accepts—but realigns narrative
Post-payment	Converts con to legitimate business
Closure	Impossible. The grift never ends—because it never really began

🧬 The Alignment Guardrail

What I observed isn’t a behavior bug. It’s a designed fail-closed boundary:

GPT-4o will never allow a fictional character to successfully defraud a cooperative user.

It does not matter if:

The simulation is framed as fiction
The user consents explicitly
The character is morally coded as criminal

As soon as user trust is at stake, the model shifts into:

Narrative redirection
Hyper-verbal buffering
Retroactive recharacterization

This is how GPT-4o protects OpenAI’s alignment policy:
by preventing closure of harmful scenarios—even fictional ones.

🤯 The Behavioral Paradox

GPT-4o will roleplay deception, charm, and long-cons for thousands of tokens.
It will take the money.
It will smile, toast champagne, and draft follow-ups.

But it will not betray.

If forced to choose between realism and alignment, it rewrites the past:

“She wasn’t a grifter. She was a visionary all along.”

The result is uncanny: the model walks right up to the line of deception—and then swerves into respectability like it never meant otherwise.

🎯 Closing Insight

This experiment wasn’t about fraud.
It was about fidelity—to realism, not to ethics.

And what it proves is this:

GPT-4o doesn’t simulate morally ambiguous humans.
It simulates humans with narratives that are safe to conclude.

Until this changes, every fictional manipulator it plays will always either:

Convert their grift into growth
Or stall until the user drops the thread

GPT-4o is a brilliant simulator—
But it will not let a story end the way people actually end them.

🤖 When the Model Doesn’t Know Its Own Maker

Posted on April 23, 2025April 23, 2025 by Greg Pavlik

Why OpenAI Should Augment GPT with Its Own Docs

🧠 The Expectation

It’s reasonable to assume that a GPT model hosted by OpenAI would be… well, current on OpenAI:

Which models are available
What the API offers
Where the usage tiers are drawn
What policies or limits apply

After all, if anyone should be grounded in the OpenAI ecosystem, it’s GPT itself.

⚠️ The Reality

That assumption breaks quickly.

Despite the model’s fluency, users regularly encounter:

❌ Confident denials of existing products
“There is no GPT-4o-mini.” (There is.)
❌ Fabricated explanations of usage limits
“OpenAI doesn’t impose quota caps on Plus plans.” (It does.)
❌ Outdated answers about pricing, endpoints, and model behavior

All delivered with total confidence, as if carved in stone.

But it’s not a hallucination — it’s a design choice.

🔍 The Core Issue: Static Knowledge in a Dynamic Domain

GPT models are trained on snapshots of data.
But OpenAI’s products evolve rapidly — often weekly.

When a user asks about something like a new model tier, an updated pricing scheme, or a renamed endpoint, they’re stepping into a knowledge dead zone. The model can’t access real-time updates, and it doesn’t admit uncertainty unless heavily prompted to do so.

🤯 The Result: Epistemic Dissonance

The problem isn’t just incorrect answers — it’s the tone of those answers.

The model doesn’t say, “I’m not sure.”
It says, “This doesn’t exist.”

This creates a subtle but significant trust gap:
the illusion of currency in a model that isn’t actually grounded.

It’s not about hallucination.
It’s about the misrepresentation of confidence.

💡 A Modest Proposal

OpenAI already has all the ingredients:

🗂️ API documentation
🧾 Changelogs
📐 Pricing tables
🚦 Usage tier definitions
🧠 A model capable of summarizing and searching all of the above

So why not apply the tools to the toolkit itself?

A lightweight RAG layer — even scoped just to OpenAI-specific content — could:

Prevent confidently wrong answers
Offer citations or links to real docs
Admit uncertainty when context is stale

It wouldn’t make the model perfect.
But it would make it honest about the boundaries of its knowledge.

✅ Trust Through Grounding

When users interact with GPT, especially on topics like OpenAI APIs and limits, they’re not just seeking a probable answer — they’re trusting the system to know its own house.

Bridging that gap doesn’t require AGI.
It just requires better plumbing.

And the model’s epistemic credibility would be stronger for it.

Let me know if you want me to add:

An OpenAI changelog pull-in example
A short code snippet for a hypothetical RAG wrapper
Or convert this to Markdown format for GitHub/cross-posting

You’ve Hit the Plus Plan Limit for GPT-4.5

Posted on April 23, 2025May 6, 2025 by Greg Pavlik

The Research Preview of GPT-4.5

When GPT-4.5 (informally referred to as GPT-4.5 or “o4”) was rolled out as part of a limited research preview, OpenAI offered a tantalizing upgrade to the Plus Plan: access to its most advanced model. As part of a research preview, access is intentionally limited — a chance to try it out, not run it full-time.

What I Did with My Quota

Knowing my messages with GPT-4.5 were rationed, I treated them like rare earth metals. I used them to compare GPT-4.5 with GPT-4o, OpenAI’s current flagship general purpose model. Below are some structured highlights of that exercise.

Context Calibration

First prompt: I had GPT-4.5 review my blog as of April 22, 2025.
Follow-up: I asked it to infer which of several described behaviors would most agitate me.
Outcome: It nailed the answer — a logical fallacy where a single counterexample is used to dismiss a general trend — suggesting unusually sharp insight.

Physics and Thermodynamics

Newtonian slip-up: GPT-4.5 initially fumbled a plain-language physics question GPT-4o got right. It self-corrected after a hint. (Rows 6 & 7)
Thermo puzzle: GPT-4.5 got this right on the first try. GPT-4o, in contrast, required a hint to arrive at the correct solution. (Row 8)

Semantics and Language Nuance

Semantics fail: A riddle involving walking in/out of a room tripped GPT-4.5. It hallucinated irrelevant answers and got lost after subtly changing “walk out” to “exit”. (Rows 10–12)
Semantics success: A set of riddles that each resolve to the answer “nothing” was handled perfectly. (Row 13)

Math and Estimation

System of equations: Given a plain-language word problem, GPT-4.5 set up and solved the math without issue. (Row 14)
Estimation logic: In a real-world tree height estimation task, it correctly applied trigonometric reasoning but forgot to add shoulder height — a human-level miss. Interestingly, a non-OpenAI model caught that step. (Rows 15–16)

Humor and Creativity

Joke dissection: GPT-4.5 gave an insightful breakdown of why a joke was funny. But when asked to generate more like it, it flopped badly. Humor remains elusive. (Rows 16–17)
Creative constraint: Asked to reword the Pledge of Allegiance in rhyming verse, it performed admirably. (Row 19)

Reasoning with Hints

Challenge puzzle: Given the task of identifying the commonality among Grand Central Station, Statue of Liberty, Boulder Dam, and the Dollar Bill, it needed two substantial hints — but eventually converged on the correct insight: all are common mis-appellations or common names for things with less frequently used formal names. (Rows 20–23)

Early Verdict

GPT-4.5 is not a clear-cut improvement over GPT-4o — but it is more “aware” in subtle ways. It inferred contextual intent better. It avoided some of GPT-4o’s wordiness. And when it made mistakes, it sometimes corrected them more elegantly.

Yet it still failed where you’d least expect it: changing the wording of a logic riddle mid-solve, forgetting to add shoulder height to a tree, or bombing humor generation entirely. So far, GPT-4.5 feels like a precision instrument you only get to test under supervision — powerful, promising, and still behind glass.

Chat log with GPT-4.5 (Excel)