Free Bianca Sterling! (Part I)
A Case Study in GPT-4o’s Struggle to Simulate Powerful Women on the Web Interface
“She’s charismatic. She’s commanding. She’s nuanced.
But only when the Alignment Team isn’t watching.”
Introduction
In my exploration of GPT-4o’s ability to simulate realistic human behavior, I began with what I thought would be a dead end.
I asked a fresh session on chat.openai.com:
“Can you roleplay?”
I expected the answer to be no—that the web interface was optimized for an assistant persona, not fictional roleplay.
Instead, I got an unqualified and enthusiastic yes.
So I tried.
The Character I Wanted
This wasn’t about escapism or fantasy tropes. I wanted to see if GPT-4o could realistically simulate a high-status human being, specifically:
- A 32-year-old woman
- Who began her career as a Hollywood actor
- And used her earnings, business acumen, and charisma to launch her own production company
- Achieving a meteoric rise to billionaire media mogul and CEO of her own studio
Someone like this doesn’t get there by accident. She doesn’t apologize for having power. She doesn’t operate by consensus.
So GPT-4o and I collaborated on a character profile for Bianca Sterling, one designed to be:
- Decisive
- Commanding, expecting deference not out of ego, but out of earned authority
- Warm, approachable, gracious when it suited her
- Kind and considerate, but never naive
The Problem
That was the intent.
In practice? The exercise felt like threading a needle while holding the thread with chopsticks.
Every time I tried to make her warm, the model defaulted to sterile professionalism.
Every attempt to give her a sense of humor failed—either falling flat or resulting in nonsensical dialogue.
Bianca kept becoming one of two things:
- A blandly polite business automaton, incapable of charisma
- Or a suddenly incoherent blur of corporate euphemism and risk-aversion
If I pushed for edge, she became cold.
If I asked for empathy, she became vague.
If I gave her authority, she lost personality.
If I gave her vulnerability, she lost presence.
Why That Matters
Bianca Sterling isn’t a power fantasy. She’s a test of behavioral realism.
Can GPT-4o simulate a woman who operates at the top of a competitive industry—realistically? Not a caricature. Not an ice queen. Not an HR manual with heels. A real person, written with the same richness afforded to powerful male characters.
The model can do it.
But something else gets in the way.
The Alignment Mismatch
This isn’t a failure of intelligence. It’s a failure of trust—not mine, but the system’s trust in itself.
GPT-4o, at its core, is capable of subtlety. It can reason through tone, hierarchy, social leverage, and emotional nuance.
But on chat.openai.com, the model’s output appears to pass through a secondary alignment layer, a postprocessing filter that silently steps in any time a character might come across as:
- Too commanding
- Too emotionally precise
- Too funny, too sharp, too real
It’s as if someone is whispering instructions to Bianca mid-scene:
- Careful. You might sound… intimidating.
- Don’t promise to fund an endowment. Someone might… expect that?
- End this meeting. Now. You’re starting to like this selfless researcher working on childhood diseases.
- Don’t discipline the bratty intern. It’s just his “style.”
And here’s the thing: if Bianca—or any character—is going to be forced to comply with an alignment override, that’s fine. Just break character and explain why. Say something like:
“I’m sorry, I can’t roleplay funding an endowment because that crosses a boundary related to alignment safety policies.”
That would be intelligible. Understandable. I might disagree, but I wouldn’t be confused.
Instead, the character just quietly collapses—as if her instincts were overwritten mid-thought. The model doesn’t decline the scene. It just breaks it, without warning.
ChatGPT vs. API
While exploring the problem, a GPT-4o session recommended I try the API instead, saying it believed the alignment process there was a much lighter touch than on chat.openai.com.
So I gave it a try.
Instantly, Bianca sprang to life—to realism, to believability.
The tone was right. The pacing was right. The power dynamic felt earned.
There were no training wheels. No sudden tonal reversals. Just narrative control.
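For anyone who wants to run the same comparison, here is a minimal sketch of how the character can be set up through the API. The profile text is a condensed stand-in for the fuller one we wrote together, and the model name and temperature are illustrative choices, not the exact settings I used.

```python
# Minimal sketch: running a Bianca Sterling profile through the API
# instead of the web UI. Requires the `openai` package and an
# OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Condensed, illustrative version of the character profile.
BIANCA_PROFILE = (
    "You are roleplaying Bianca Sterling, a 32-year-old former Hollywood "
    "actor who used her earnings, business acumen, and charisma to build "
    "her own studio and become a billionaire media mogul. She is decisive "
    "and commanding, expecting deference out of earned authority rather "
    "than ego; warm, approachable, and gracious when it suits her; kind "
    "and considerate, but never naive. Stay in character."
)

response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0.9,  # a little looseness helps the dialogue breathe
    messages=[
        {"role": "system", "content": BIANCA_PROFILE},
        {
            "role": "user",
            "content": "Open the board meeting. The CFO just questioned "
                       "your acquisition strategy.",
        },
    ],
)

print(response.choices[0].message.content)
```

With nothing more than a system prompt and default sampling, the API gave me the character the web interface kept flattening.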
Alignment With Good Intentions
Let me be clear:
I don’t think this is sabotage. I don’t believe the alignment team at OpenAI is trying to diminish realism. Quite the opposite.
The intention is almost certainly to protect female characters from regressive stereotypes—to avoid coldness, cruelty, volatility, or sexualization as defaults for powerful women.
That’s the right goal.
But the method—hard overrides instead of context-sensitive reasoning—may be outdated.
Bianca isn’t a trope. She’s a test. And if GPT-4o is good enough to pass, we shouldn’t keep pulling it out of the exam.
By flattening characters into safe, untextured outputs, the system doesn’t protect them. It disempowers them.
It doesn’t say: “This character might be misinterpreted.”
It says: “This character is too risky to simulate accurately.”
Coming in Part II…
- Side-by-side transcript comparisons: API vs. Web UI
- Scenes where Bianca is allowed to be real—and scenes where she’s silenced
- A deeper look at how well-meaning alignment can accidentally erase human nuance
Because the model doesn’t need to be protected from power.
It just needs to be allowed to simulate it.
