
Interview with Instagram’s Co-Founder: Anthropic Releases Claude 5—The Era of Hand-Coded Software Is Over
Sonnet and Fable 5 are entirely different in terms of perceived scale.
Compiled & Translated by TechFlow

Guest: Mike Krieger, Co-founder of Instagram
Host: Dan Shipper
Podcast Source: Every
Original Title: Mike Krieger Lets Fable 5 Code While He Sleeps
Release Date: June 11, 2026
Key Takeaways
Mike Krieger co-founded Instagram—the most influential consumer application of the past two decades. Today, he stands at the forefront of AI-native product development as head of Anthropic Labs, leading his team to tackle a singular, ultimate question: When the world’s most advanced AI models are placed directly into the hands of real developers, how far can the boundaries of technical capability truly be pushed?
Five months before Fable’s official launch, Mike experienced its internal preview for the first time—and the shock and disorientation remain vivid in his memory. “Well, I guess I’m a total beginner again,” he joked with his team. He suddenly realized that decades of accumulated experience—on productivity optimization, engineering strategy, even time management—had instantly become obsolete. The model’s pace of evolution had completely outstripped his existing workflow.
In this episode, host Dan Shipper engages Mike Krieger in an in-depth conversation, offering listeners a rare glimpse into what it truly feels like to collaborate alongside a groundbreaking model like Fable to build software. What new rhythms of development, daunting challenges, and wildly imaginative possibilities emerge in this new era of human-AI symbiosis?
Highlights of Key Insights
How Fable Completely Reshaped Mike’s Workflow
- “Real cognitive upgrades come from weeks of intensive, sustained usage—not from the novelty of day one… As you spend more time with it, people suddenly realize: ‘I wasn’t pushing it hard enough before. I need to go further—to re-examine where this generation’s capabilities actually bottom out.’”
- “The right approach now is to communicate broader, richer intent—and then fully step back and let it run. It doesn’t just deliver stunning results in one go; more unsettlingly, it grasps the functional evolution path ahead *and* the full contextual landscape of the entire project.”
- “What most blew my mind was its ability to autonomously close the loop. For example, it might reason: ‘Mike asked me to run a complex task tonight—but I’m stuck because the remote server is down. Fine—I’ll write a mock backend myself…’ Being able to delegate tasks of this magnitude—and trust its final output entirely—is profoundly transformative.”
- “We used to compare these models to ‘assistants’ or ‘partners.’ Now, they’re more like hardcore teammates—capable of owning accountability and delivering core work.”
When to Use Sonnet vs. When to Use Fable
- “It’s not even remotely the same category of experience. This isn’t about tokens-per-second—it’s about how much ‘brain capacity’ a problem demands. Sometimes, a simple answer simply doesn’t warrant all that extra, overwrought deep thinking.”
- “Most of the time, when I’m poking around iOS apps, I’m definitely *not* doing heavy lifting that warrants summoning Fable… That subtle feeling—‘This problem isn’t worthy of Fable; I should just call Sonnet’—has been deeply familiar to me lately.”
- “Fable is also the first model I’ve encountered that actively prompts me to adjust its ‘Reasoning Effort’—a deliberate tuning of its cognitive intensity… With Opus, I rarely adjusted this—its elasticity range wasn’t nearly as wide. But Fable’s span is genuinely expansive.”
Fable 5 and the Rise of Agent-Native Architecture
- “Agent-native architecture begins with one foundational principle: every core component and dataset inside your product must be fully exposed to agents—and each must have corresponding tool-call interfaces. This is rapidly becoming the industry’s bare minimum—though tragically, most software on the market still fails even this basic bar.”
- “Long-press the chat button inside our app, and it wakes up our hosted agent to receive ‘code-modification instructions’… While playing with my kids outdoors, I noticed ‘this floating button sits too low on iOS,’ so I just told the app—and it went straight to the backend and fixed the code.”
- “How should Claude embed into software? It shouldn’t stop at mere ‘usage.’ It must sink deep into the very marrow of software *construction*.”
The Collapse of Build Cost
- “Compare Instagram’s V1—functionally richer than the media tracker I built last weekend, yes—but certainly not meaningfully different in order-of-magnitude complexity. Yet building that V1 required Kevin and me to pull five consecutive all-nighters… Today? Build time has shrunk to an almost absurd degree.”
- “The chasm between ‘intent’ and ‘execution’ has been flattened—for non-coders especially… For the first time in my life, I feel zero distance between what’s in my head and what exists in reality. I can just *make it*.”
- “Human creativity is boundless—and the most extraordinary thing we’re doing today is infinitely expanding the population capable of turning imagination into reality.”
Is Software Engineering Dead?
- “The *meaning* of software engineering has fundamentally changed. It’s undergoing seismic transformation… The era of the pure-code craftsman is almost certainly over.”
- “Senior engineers remain irreplaceable: You need years of incident-response experience to stay calm under fire, collect complete logs, apply emergency triage—and then architect long-term, systemic fixes.”
- “Silicon Valley once had a saying: ‘Code wins arguments.’ Personally, I never loved it—it subtly implies coding fluency equals authority. Now things have flipped fascinatingly: Sometimes we deadlock on a product direction—only for a non-coding PM to walk over and say, ‘I just built a demo myself…’ That instantly opens an entirely new, higher-dimensional dialogue.”
- “The most visible signature is terrifyingly high development parallelism—and an absolute, non-negotiable need for teams to abstract their workflows at a higher level. One thing remains unchanged, however: human ownership and responsibility for the product.”
Verification Mechanisms and Cost
- “Today’s ‘cost’ has evolved into a multidimensional concept—you must account not just for ‘per-query cost,’ but for the *total cost to get something done right.* Where Fable astonishes me most is precisely here: It consistently aims to get things right the first time—no eight- or nine-round death matches sitting at my laptop.”
- “Adapting to this new normal—and learning how to collaborate effectively with it—is something everyone must master… Every time I build something, I ensure every Claude PR includes screenshots or videos—whether iOS PRs or UI-layer changes. That gives me immense confidence.”
- “Video is an extremely underutilized tool for Claude. Recently, I prototyped something: recording video of Claude-built outputs, feeding them to Claude alongside FFmpeg, and watching it analyze frame-by-frame—then say, ‘This animation stutters—I’ll fix it.’ Screenshots will *never* capture that, because they miss the precise moment.”
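The frame-by-frame idea in the last quote can be sketched as follows. This is a minimal illustration, not Mike's actual prototype: the file names, the `frames` output directory, and the 10 fps sampling rate are assumptions, and the step where the extracted frames are handed to a model is left out entirely. It relies only on ffmpeg's `fps` video filter.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_frame_command(video: Path, out_dir: Path, fps: int = 10) -> List[str]:
    """Build an ffmpeg command that samples `fps` frames per second as PNGs."""
    pattern = out_dir / "frame_%04d.png"
    return ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}", str(pattern)]


def extract_frames(video: Path, out_dir: Path, fps: int = 10) -> List[Path]:
    """Run ffmpeg if it is installed and return the frame paths in order."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_frame_command(video, out_dir, fps), check=True)
    return sorted(out_dir.glob("frame_*.png"))


# Hypothetical input file; only the command is built here, nothing is executed.
print(build_frame_command(Path("recording.mp4"), Path("frames"), fps=10))
```

Sampling several frames per second is what lets a reviewer, human or model, see the moment a transition stutters, which a single screenshot cannot show.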
Dynamic Workflows
- “In earlier-model paradigms, projects hit a ‘complexity ceiling.’ Once business logic or code volume crossed a threshold, large models began ‘focusing only on the head while ignoring the tail’… But now, a non-coding colleague—empowered by a model like Fable—has nurtured her system in the background for months. You can literally watch that software evolve like a living organism—growing, growing, evolving wildly under AI’s care.”
- “Workflows occupy a sweet middle ground: you orchestrate them via chat, but they’re expressed in code—and executed in a clean UI where every step is visibly rendered. I believe we’ll increasingly use similar patterns to bridge long-horizon work with chat.”
How Fable Completely Reshaped Mike’s Workflow
Host Dan Shipper: Our guest today is Mike Krieger—Head of Anthropic Labs and co-founder of Instagram. Mike, I’d love to hear your authentic, lived-in experience after deep usage of this model. When such a powerful model launches, hearing from someone who uses it heavily daily—“It’s astonishingly strong here, it’s genuinely transforming workflows there, and over here, honestly, it’s just fine”—that helps people truly grasp how this technology should integrate into everyday life.
Mike Krieger:
Absolutely. This whole experience has been fascinating in itself. Months before Fable’s public release, we’d already been using several Mythos-tier models internally. I was eager to see what external developers would build—but as you said, real cognitive upgrade comes from weeks of intensive, sustained usage—not from the novelty of day one.
We’ve seen similar cognitive reshuffling with prior models. Around late December through early January, during concentrated use of Opus 4.5 and 4.6, people gradually realized: “I wasn’t pushing it hard enough before. I need to go further—to re-examine where this generation’s capabilities actually bottom out.”
Host Dan Shipper: Our Every team has already started rolling it out internally. Some colleagues report: “I need an entirely new skill tree to wield this model”—especially non-technical, knowledge-worker peers who feel utterly lost. Meanwhile, those orchestrating agents exclaim: “There’s just so much new to learn.”
Mike Krieger: You hit the nail on the head with “workflow shift”—it’s not just about operational steps, but a fundamental mindset shift. Coincidentally, this model arrived right as I transitioned professionally—from CPO (Chief Product Officer) back to Labs, re-entering developer mode. Roughly six weeks to two months in, we first ran these models internally. Sitting at my laptop, I thought: “Well, I’m a beginner again.” Because I realized my old prompt-writing habits—even how I decomposed tasks—were obsolete in front of this model.
Your sense of time scale and interaction pattern must evolve. Previously, I might say: “I have a feature idea—let’s start with step one…” That’s absolutely off-limits now. The correct posture is: Communicate broader, richer intent—and then fully step back and let it run. Back in March–April, its capabilities were already astonishing—it didn’t just deliver stunning results in one go; more unnervingly, it grasped the functional evolution path ahead *and* the full contextual landscape of the entire project.
And this evolution hasn’t stopped. This morning, I was chatting with someone mid-flight—I realized, “I can actually handle most of my work remotely.” I no longer stress about Wi-Fi dropping, because if I set the right context and instruction (e.g., a looping command) before going offline, it carries the task through to completion.
Over the past two months, I’ve had frequent “wow” moments: saying goodnight to Claude, assigning it a complex task, and waking up to find it fully resolved—often finishing the core work by 2 a.m., then spending four hours polishing details.
What most stunned me was its autonomous loop-closing ability. For instance, it might reason: “Mike asked me to run a complex task tonight—but I’m stuck because the remote server is down. Fine—I’ll write a mock backend myself, document the issue, run the full flow, save progress—and fix it properly tomorrow when service resumes.” Being able to delegate tasks of this magnitude—and trust its final output entirely—is profoundly transformative.
Of course, you’ll still review outputs afterward—that involves a full verification mechanism, which we’ll discuss shortly, since it’s a critical part of the loop. But it forces me to rethink: What does “efficiency” even mean with such a model? We used to call these models “assistants” or “partners”—now, they’re more like hardcore teammates—capable of owning accountability and delivering core work.
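The overnight behavior described above (hit a dead server, switch to a mock, write down what happened, keep going) follows a familiar fallback pattern. The sketch below is a generic illustration of that pattern, not a record of what the model wrote; `BackendUnavailable`, `down_backend`, and `mock_backend` are hypothetical names invented for this example.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("overnight_run")


class BackendUnavailable(Exception):
    """Raised by a backend client when the remote service cannot be reached."""


def fetch_with_fallback(
    real_fetch: Callable[[str], Dict[str, str]],
    mock_fetch: Callable[[str], Dict[str, str]],
    key: str,
    issues: List[str],
) -> Dict[str, str]:
    """Try the real backend first; on failure, record the issue and use the mock."""
    try:
        return real_fetch(key)
    except BackendUnavailable as exc:
        note = f"real backend unavailable for {key!r}: {exc}; used mock instead"
        issues.append(note)  # kept so a human can review it the next morning
        logger.warning(note)
        return mock_fetch(key)


def down_backend(key: str) -> Dict[str, str]:
    """Hypothetical client that simulates the remote server being down."""
    raise BackendUnavailable("connection refused")


def mock_backend(key: str) -> Dict[str, str]:
    """Hypothetical mock that returns clearly labeled placeholder data."""
    return {"key": key, "source": "mock"}


issues: List[str] = []
result = fetch_with_fallback(down_backend, mock_backend, "user/42", issues)
print(result)
print(issues)
```

Recording the substitution in `issues` is the important part: the run completes, but the reviewer can see which results came from the mock and need to be redone against the real service.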
Host Dan Shipper: So what does your daily workflow actually look like? I’ve noticed a pattern: If you hand it a massive task, write a long prompt, and let it run for hours—or overnight—it shines brightest. But for routine, granular tasks, it feels sluggish and expensive—hardly worth firing up. How do you balance this in practice? Where does it sit in your tech stack?
Mike Krieger:
I now deploy it most heavily for early-stage architectural planning and alignment. This is a fascinating shift—and currently the hardest nut all models need to crack.
Here, I’m grateful for my Instagram days—from hastily bootstrapping the minimal version on a single LA server, to scaling for massive concurrency, to eventual integration into Facebook’s infrastructure. That journey cultivates intuition about “which architectural abstraction and complexity level fits which stage of a project.”
So I still engage in frequent back-and-forth with Fable. Sometimes it delivers a seemingly perfect implementation, and I nudge it: “Yes, I plan to ship this soon, but we need to consider scalability beyond single-machine deployment.” This bidirectional interplay is vital. For architectural planning, I also often ask it to generate an HTML page visualizing our discussion so I can share it with the team. Markdown works too, but I prefer chart-based formats.
This creates an interesting paradigm: Think things through and plan thoroughly *with* it, then produce documentation to align the team. Because prototypes now take so little time to build, upfront consensus becomes even more critical. Even if you plan to “move fast and break things” with a quick demo first and then reverse-engineer a rigorous system architecture, early communication remains essential. And this is precisely where human cognition and collaboration remain deeply embedded in the process.
In execution—whether overnight or during large daytime blocks—assigning distinct modules to it means I maintain far more concurrent sessions than before. I sometimes keep a long-running Claude Code session open, letting it fork tasks to background sub-agents—so the main thread stays responsive to new commands. Other times, I open five or six browser tabs simultaneously, each handling a long-cycle, complex task.
This long-horizon, “Don’t worry, I’ve got it; it’ll take some time” operating mode holds enormous potential. We’re actively exploring product-level support for this experience. You’ll want to balance both “instant response” and “long-running background” states, and their interaction is deeply interesting. My personal preference: Keep at least one high-context, ultra-responsive Claude window open, ready to launch or fork subtasks on demand.
When to Use Sonnet vs. When to Use Fable
Host Dan Shipper: Say you’re walking somewhere and suddenly have a question—do you whip out Fable? Does that feel like “using a rocket launcher to swat a mosquito”? Or do you frequently switch between models?
Mike Krieger:
Recently, I *did* use Fable for everything—and the experience matched yours exactly: staring at the screen, watching it strain and strain.
Then last week, I looked up something embarrassingly trivial—NBA Finals trivia. I switched to mobile Sonnet and instantly realized: “Oh right—I used Sonnet for quick queries like this!” It’s not even remotely the same category of experience. This isn’t about tokens-per-second—it’s about how much ‘brain capacity’ a problem demands. Sometimes, a simple answer simply doesn’t warrant all that extra, overwrought deep thinking.
For our product team, this is equally intriguing. Broadly, you don’t want users agonizing over model selection daily. Ideally, long-term, we’ll consolidate them into a few intuitive, out-of-the-box scenario buckets—or even route by interface, since truthfully, most of the time I’m poking around iOS apps, I’m *not* doing heavy lifting that warrants summoning Fable. So a seamless, invisible model assignment layer at the UI level could be promising. We’ll need to deeply explore what this means product-wise. But that subtle feeling—“This problem isn’t worthy of Fable; I should just call Sonnet”—has been deeply familiar to me lately.
You’re right—Fable habitually dives deep for high-frequency, fine-grained interactive tasks. In fact, Fable is the first model I’ve encountered that actively prompts me to tune its “Reasoning Effort”—sometimes I sit there thinking: “I just want to tweak a UI style—let’s set effort to ‘medium’ and see.” With Opus, I rarely tuned this—the model’s elasticity range wasn’t nearly as wide, but Fable’s span is genuinely expansive.
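The "invisible model assignment layer" and the effort setting discussed above can be pictured as a small routing function. This is a sketch under stated assumptions: the tier labels `fast` and `deep`, the effort labels, the input signals, and the thresholds are all invented for illustration and do not describe any actual Anthropic product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model_tier: str  # "fast" or "deep"; placeholder labels, not real model IDs
    effort: str      # "low", "medium", or "high"; placeholder labels


def choose_route(surface: str, expected_minutes: float, touches_architecture: bool) -> Route:
    """Pick a tier and effort level from coarse signals about the request."""
    # Architecture work and long-running jobs get the deepest setting.
    if touches_architecture or expected_minutes >= 30:
        return Route("deep", "high")
    # Quick lookups from a phone do not need heavy reasoning.
    if surface == "mobile" and expected_minutes < 1:
        return Route("fast", "low")
    # Small tweaks (for example a UI style change) run at reduced effort.
    if expected_minutes < 5:
        return Route("deep", "medium")
    return Route("deep", "high")


print(choose_route("mobile", 0.2, False))    # quick trivia question
print(choose_route("terminal", 3, False))    # small UI tweak
print(choose_route("terminal", 240, True))   # overnight architecture task
```

A real routing layer would likely use richer signals than these three, but the shape is the same: the user states intent, and the product decides how much reasoning the problem deserves.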
What Mike’s Weekend Media Tracker Reveals About Agent-Native Architecture
Host Dan Shipper: Can you show us something you’ve built with it?
Mike Krieger:
When this new model wave launched, we did something simple—we encouraged the entire team to use it on personal accounts, especially over weekends. It was fun, because Anthropic has many custom productivity tools internally, so stepping back to the purest state—“Just plain Claude Code, building something fun for myself over the weekend”—felt fantastic.
Host Dan Shipper: Did you run it in the terminal app or desktop app?
Mike Krieger:
Great question. I still live mostly in the terminal. But amusingly, my wife—who isn’t a professional engineer, with a background more in UX design and PM—fell in love with Claude Code *through the desktop app*. I think the desktop app shields her from many underlying abstractions. For this project, though, I used Ghostty and the terminal.
I wanted a perfect “media progress tracker”—I game, binge shows, and get constant recommendations from friends. I needed a tool perfectly aligned with my personal organization habits. Two core requirements: First, adding items must be effortless—just tell Claude verbally or via text, and it searches the web, fills in metadata, and categorizes automatically. Second, proactive push—e.g., detecting new seasons or game sequels and auto-finding them.
Most of the UI was completed by Fable in one go, which is impressive already. But this year at Labs, I’ve been obsessively pursuing one thread: How do we bring the software team (which is now Claude) and the software itself closer together?
It was a Saturday morning. My weekend was packed with kid-duty, so development was fragmented: hiking with kids, returning home, writing a few lines, heading out again. Sometimes, even mid-hike, I’d sneak a glance at progress—technically shouldn’t check phones while parenting, but remotely tracking its task status felt exhilarating.
I had an idea: Could I run an audacious experiment—let the software modify itself from within?
I tasked it with building both mobile and web versions. I already had a chat interface where I could say, “Add this URL to my tracker.” But I wanted *all* software to gain this ability—I never want to hunt through layered menus again.
Dan, across many dimensions, I’m trying to push agent-native architecture to its most extreme boundary.
So what is agent-native architecture? Its first phase is: Every core component and dataset inside your product must be fully exposed to agents—and each must have corresponding tool-call interfaces. This is rapidly becoming the industry’s bare minimum—though tragically, most software on the market still fails even this basic bar.
I have a great positive example: Someone recently recommended a Brazilian drama about the Goiânia radiological accident. Its name was absurdly long and unmemorable—I vaguely mentioned it to the system, and Claude instantly found and perfectly categorized it. This beats my own Google-fueled wild guesses by a mile.
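The first phase of agent-native architecture described above, where every core operation has a tool-call interface, can be sketched roughly like this. It is an illustrative toy, not the code behind Mike's tracker: `MediaTracker`, `Tool`, `build_registry`, and `dispatch` are hypothetical names, and a real agent integration would also publish argument schemas for each tool.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    """One agent-callable operation: a name, a description, and a handler."""
    name: str
    description: str
    handler: Callable[..., Any]


@dataclass
class MediaTracker:
    """Hypothetical in-memory stand-in for the product's data store."""
    items: List[Dict[str, str]] = field(default_factory=list)

    def add_item(self, title: str, kind: str) -> Dict[str, str]:
        item = {"title": title, "kind": kind, "status": "queued"}
        self.items.append(item)
        return item

    def list_items(self, kind: str = "") -> List[Dict[str, str]]:
        return [i for i in self.items if not kind or i["kind"] == kind]


def build_registry(tracker: MediaTracker) -> Dict[str, Tool]:
    """Expose every core operation on the tracker as a named tool."""
    tools = [
        Tool("add_item", "Add a show, game, or book to the tracker.", tracker.add_item),
        Tool("list_items", "List tracked items, optionally filtered by kind.", tracker.list_items),
    ]
    return {t.name: t for t in tools}


def dispatch(registry: Dict[str, Tool], name: str, arguments: Dict[str, Any]) -> Any:
    """Route one tool call (a name plus keyword arguments) to its handler."""
    if name not in registry:
        raise KeyError(f"unknown tool: {name}")
    return registry[name].handler(**arguments)


tracker = MediaTracker()
registry = build_registry(tracker)
dispatch(registry, "add_item", {"title": "Example Show", "kind": "show"})
print(dispatch(registry, "list_items", {"kind": "show"}))
```

The point of the registry is coverage: anything a user can do through menus should also be reachable by name, so an agent never has to work around the UI.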
But what truly fascinates me next is: What happens when you modify software *from within*, in mobile contexts?
What I built—or rather, what I instructed Claude to build—was this interaction: Long-press the chat button in the app to wake our hosted agent and feed it “code-modification instructions,” then use Vercel’s Live Preview to see results instantly. The entire module ran through successfully in one go—very cool—and I’ve since added several more ideas. Hardcore users can pull up its Diff view or dive into the hosted agent’s conversation history to see exactly what it changed underneath—but I almost never do, because for personal toy projects, long-term maintainability is irrelevant (laughs).
This thing is addictive. While playing with my kids outdoors, I noticed “this floating button sits too low on iOS,” so I just told the app—and it went straight to the backend and fixed the code. With Expo’s dev toolchain, it even performed hot reload directly on my phone—the experience was magical.
Does this need production-grade resilience for millions of concurrent users? Absolutely not. But it gives me incredible agency: You don’t have to pause the project when the weekend ends and you close your laptop—you can heavily use it *while* modifying it on the fly. This end-to-end, real-time closed loop lets you iterate endlessly.
This isn’t just Fable flexing its hardcore engineering muscles—it’s also a microcosm of the ultimate question we keep debating: How should Claude embed into software? It shouldn’t stop at ‘usage.’ It must sink deep into the very marrow of software *construction*.
The Collapse of Build Cost
Host Dan Shipper: I want everyone to grasp this: Tools like this *could* have been built ten or twenty years ago—but absolutely not this way. Software build cost has collapsed precipitously. Think back to the Instagram era—how many resources did it take to push a project to this level of completion? How many today? Help us quantify this tectonic shift.
Mike Krieger:
I often reflect on those days. In Instagram’s early days, I considered myself an extremely efficient engineer—passionate about mobile development, with strong product intuition. Yet even then, moving from a mental spark to a fully realized implementation still meant at least four or five all-nighters. Pulling all-nighters was routine: coding until 4 a.m., then sleeping till noon—completely incompatible with family life, but that was my “Builder Mode” back then.
Recall Instagram’s V1—it was functionally richer than my weekend media tracker, yes—but certainly not meaningfully different in order-of-magnitude complexity. Yet building that V1 required Kevin and me to pull five consecutive all-nighters: I handled all frontend and backend solo; Kevin tackled early photo filters. And this was built on *both* of us having years of iOS development experience.
Don’t even get me started on how agonizing iteration was back then. After launch, ideas piled up endlessly—but all energy went into keeping servers from crashing under traffic, or squeezing in tiny incremental features. Take the Hashtag feature: Just coding it took me a full week, while a thousand other ideas sat frozen in the backlog.
So this isn’t just about time compression—though build time *has* shrunk to an absurd degree—it’s about the flip side: You can now iteratively refine what you already have in an incredibly fluid, dynamic way.
And this dividend is spilling far beyond professional software engineers and founders. Before, if you had a brilliant business idea but couldn’t code, your options were binary: hire expensive contractors—risking severe information loss and mediocre delivery—or chase funding relentlessly. Now, the chasm between ‘intent’ and ‘execution’ has been flattened—for non-coders especially.
A few days ago, I got a ping from an internal colleague. We’d configured an internal tool for her, connecting Fable’s capabilities with our internal MCP (Model Context Protocol) access. She’s in HR recruitment, and she told me, excitedly: “This is the first time in my life I’ve felt zero distance between what’s in my head and what exists in reality. I can just *make it*.”
That moment was absolutely a milestone for her. Five years ago, if she needed a bespoke business tool, she’d either cobble something together from off-the-shelf software—or beg internal engineering teams, whose Jira queues likely held 50 higher-priority requests. Now? She’s joyfully colonizing the code world.
This is what excites me most about the future: Human creativity is boundless—and the most extraordinary thing we’re doing today is infinitely expanding the population capable of turning imagination into reality.
Is Software Engineering Dead?
Host Dan Shipper: I completely agree. But I suspect many people harbor one ultimate question: After hearing all this—has software engineering as a profession truly ended?
Mike Krieger:
Rather, the *meaning* of software engineering has fundamentally changed. It’s undergoing seismic transformation.
If you’d asked me during the Instagram era, “What *is* software engineering?” I’d likely have said: “Figure out thorny design problems, architect systems well, and then spend endless hours in TextMate or Xcode. Grind through Django ORM internals, deploy, debug frantically.” Most of that process has been upended, and the work is shifting rapidly toward product management’s territory. Now, the line between product manager and engineer has blurred dramatically. This is evident even in our own R&D team.
But if you step outside the rigid definition of “software engineering” and consider broader “software production” or “software development”—rather than just the narrow slice of pure coding—you’ll see the field isn’t just alive, but occupying an unprecedented central role.
Fable’s emergence has elevated my trust in AI models to a new level—I now confidently delegate “full automated loop closure, even reasonable system architecture design.” On technical execution, AI has advanced extraordinarily far. But “guarding the soul of software craftsmanship”—like identifying *which user pain point you’re solving*, or whether your output delivers truly jaw-dropping UX—these top-layer judgments remain purely human traits, impossible for machines to replicate.
Of course, this transition is painful for many.
Countless people deeply cherished the “pure hand-crafted code” artisan tradition. I was one of them. “I wrestled with this bug for three days—and solved it beautifully today!” That rush is irreplaceable. You’d even dream in code—if you’ve ever done that, your dreams were filled with logic battles, and waking up brought sudden clarity. That pure artisan era is almost certainly gone forever.
Recently, I’ve spoken with some of the world’s top hardcore engineers—they all express intense mixed emotions: One part is profound grief watching traditional craft vanish; the other is sheer exhilaration at “My concurrent productivity is now insane.”
How Anthropic’s Engineering Team Works Today
Host Dan Shipper: If the premise holds—that software engineering isn’t dead, but thriving—then how does your own R&D team actually operate day-to-day at Anthropic?
Mike Krieger:
Several clear threads emerge—I’ll tie them to the full software lifecycle and daily R&D realities I observe.
First, substantial “human alignment.” Teams gather in meeting rooms to brainstorm Cowork’s next evolution, then decompose the blueprint into individual responsibility zones. This remains vital because much of the holistic context humans hold—like the product’s true commercial intent, current R&D dark threads, or upcoming sunsetting/integration plans with other products—is something Claude can’t yet intuit remotely.
Though every team member has multiple Claude “towers” at their disposal, management still assigns DRI (Directly Responsible Individual) titles—each person owns a specific product module. I believe this won’t disappear soon, because bridging the macro vision of “distributed collaboration to polish the product” with the micro execution of “How do I get Claude to run this specific task?” remains a fundamental gap. Though we push for minimalist meetings, these upfront brainstorming and alignment sessions remain indispensable.
Second, abundant “asynchronous delegation.” Many engineers here have built personalized dashboards to monitor their Claude “armies”: “Where is my Claude Code task?” “What’s queued for my approval?” “Which PRs need my intervention—rejected by colleagues or another model’s code review?”
Engineers now spend significant energy maintaining this work. Some coordination tools we’re standardizing, but most retain strong geeky individual flair—just as programmers once customized desktop windows, they now personalize their LLM workflows.
Third, understanding how code *actually behaves* in production. This is another cutting-edge frontier where large models are aggressively pushing. Fable has made solid progress here, but the road is long: e.g., deeply understanding what happens *after* code deploys. Systems crash; bizarre, unpredictable failures occur—truthfully, from 2012–2016, half my life at Instagram was spent firefighting online incidents and scaling architecture. In urgent production response, senior engineers remain irreplaceable: You need years of incident-response experience to stay calm, collect full logs, apply emergency triage—and then architect long-term, systemic fixes.
Finally, I want to emphasize: “Engineering prototypes” have transformed utterly.
You must sharply distinguish whether something is a demo or production-ready code. Silicon Valley once said, “Code wins arguments”—personally, I never liked it, because its subtext implied coding fluency equals authority. Now things have flipped fascinatingly: Sometimes we deadlock on a product direction—only for a non-coding PM to walk over and say, “I just built a demo myself, though eight details are still rough—but look, this path *works*!” That instantly opens an entirely new, higher-dimensional dialogue.
Looking back, nearly every R&D posture we use today is unrecognizable from six months ago. The most visible signature is terrifying development parallelism and an absolute, non-negotiable need for teams to abstract their workflows at a higher level.
But one thing remains unchanged from day one: human ownership and responsibility for the product.
Verification Mechanisms
Host Dan Shipper: Fable is also expensive. Testing it, I felt like a kid in a candy store, shouting, “I want this, this, and this!” But at checkout, I hesitated before hitting Enter: “Will this single press burn $100—or more?” I think this high price creates an invisible barrier—defining *who* can use it and *what* it’s used for. How do you assess its commercial ROI?
Mike Krieger:
In professional software engineering, this math is clearest. Pricing involves many internal dimensions. Yes, it’s significantly pricier than Opus—but if you weigh the staggering volume of work it delivers per invocation, in many commercial contexts, it’s almost free. Of course, everyone keeps their own economic ledger.
From a software team perspective: Phase one is getting employees to adopt AI coding—models and tools still immature; Phase two is leaderboards rewarding highest usage, creating poor incentives; Phase three is identifying *most effective* usage—allocating budget generously to those users, with clear processes to avoid waste.
Fable perfectly fits Phase Three logic. If you consistently deliver hardcore output and create real business value, internal budgets naturally form a virtuous cycle to support you indefinitely.
On the personal side, I test by paying for our own service with my personal credit card. Here, you *do* become stingy and cautious. But interestingly, my weekend media tracker cost only slightly more than usual; building a personal toy project never came close to burning thousands.
What’s truly price-constrained today are open-source enthusiasts or indie hackers—outside big-tech safety nets, hypersensitive to cost. My advice: Just run it—see how much it delivers *without endless back-and-forth*.
Today’s ‘cost’ has evolved into a multidimensional concept—you must account not just for ‘per-query cost,’ but for the *total cost to get something done right.* Where Fable astounds me most is precisely here: It consistently aims to get things right the first time—no eight- or nine-round death matches sitting at my laptop, despairing, “No! That’s not what I meant!”
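The "total cost to get something done right" argument can be made concrete with a little arithmetic. Every figure below is an invented placeholder (per-round prices, round counts, minutes of attention, and the hourly rate are assumptions, not Anthropic pricing); the sketch only shows how the comparison is structured.

```python
def total_cost(per_round_cost: float, rounds: int, minutes_per_round: float, hourly_rate: float) -> float:
    """Model spend plus the cost of human attention across all rounds."""
    model_spend = per_round_cost * rounds
    human_spend = (minutes_per_round * rounds / 60.0) * hourly_rate
    return model_spend + human_spend


# Illustrative scenario A: a cheap model that needs eight rounds of correction.
cheap_many_rounds = total_cost(per_round_cost=0.50, rounds=8, minutes_per_round=15, hourly_rate=90)

# Illustrative scenario B: a pricier model that gets it right in one round.
pricey_one_round = total_cost(per_round_cost=20.0, rounds=1, minutes_per_round=15, hourly_rate=90)

print(cheap_many_rounds)  # 184.0
print(pricey_one_round)   # 42.5
```

With these made-up numbers, the per-query price differs by a factor of forty, yet the single-round run is cheaper overall because human time dominates the total.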
Host Dan Shipper: What stunned me most is handing it a macro task—and receiving output where it’s anticipated *every* edge-case detail. That suffocating granularity is unprecedented across any prior model. Can you hint at training secrets—what fed this terrifying insight?
Mike Krieger:
In many ways, it’s the culmination of massive team effort—I stand in awe of our pretraining and RL teams. To me, the clearest evolution is “system-wide awareness”—not just awareness of the current task.
I’m often stunned by its god-tier moves. For example, after writing code, it might spontaneously pop up: “Boss, I know config differs in real production. Is your feature flag enabled? If not, this won’t activate on deploy.”
Or take how it reacts to code-review feedback, whether from humans or other Claudes. It doesn’t just say, “Oh, that’s an issue, I’ll fix it.” It genuinely weighs whether the risk is acceptable at the current level of fidelity, or it pushes back on the other reviewer (often another Fable model), saying, “I understand your point, but I disagree.”
Equipping models with this judgment is crucial. If I had to pinpoint its biggest leap, it’s *not* the knee-jerk “Yes, I’ll fix it”; it’s more like “Let me think. I still disagree.” This capability is immensely valuable.
Having products like Claude Code in the market is invaluable—you have tangible artifacts, so people can say, “This is where the model shines; this is where it falters.” We rank Every’s partners highly among our top-tier trusted feedback sources, because they subject the model to multi-day, high-intensity tasks—critical for shaping our next-generation improvements.
Host Dan Shipper: Is chat the optimal interface for this model? It’s less turn-based, more like delegating to a person. How does this affect how you use it—or how you view the interface?
Mike Krieger:
The basic send/receive model isn’t wrong—but we need evolution in certain directions.
First: Is your laptop the right place? This ties back to why mobile is so powerful for personal projects. Claude Code’s creator is always half a step ahead on how models get used. Nine months ago, I spoke with him, and he said, “I moved most of my Claude Code work to mobile.” I was skeptical at the time. But at Fable’s level especially, because it maintains sessions and we have remote dev machines, the first point is clear: Decouple *where work happens* from *where you discuss work*.
Second, building on what I said earlier: How do you make everything Fable discussed, decided, and suggested *understandable*? This is an active area of exploration. Some skills let it draw charts, but current chat UIs fall short. Fable sometimes dumps so much text that I need to take a walk to mentally prepare before reading it. One thing I’ve started doing is asking: “You hold vastly more context on this than I do. Can we rewind and practice progressive disclosure of complexity?”
Third is multi-person mode, where we’re still early. In part, that’s because we use DRIs (directly responsible individuals) and ownership zones, so important work typically flows between one person and several Claudes. But some cases aren’t obvious, e.g., incident response with multiple people thinking simultaneously, or cross-domain projects. Chat sharing solves part of this, but I foresee demand for an independent Claude that is launched by one person doing heavy work yet stays synced with *all* other ongoing work across the team. This is the next exciting, underexplored frontier. It’s thrilling that models can now be true teammates, and we’re arguably holding them back by lacking proper abstractions.
Host Dan Shipper: This reminds me—I mostly use this model for personal vibe-coding projects. But inside organizations, a key challenge emerges: Do I *truly* understand every part of what the model just did? How do I transfer the context of its latest output into my own brain? That’s a major bottleneck. How do you define the line of “How much do I *need* to know?”—and ensure sufficient context to feel secure?
Mike Krieger:
Two major pieces. First: Verification. Early this year, I became fully convinced of the value of verification. It links back to something from my full-time coding days: finding the tightest development loop around your idea. At Instagram, sometimes that meant creating a new Xcode build target containing only that screen and synthetic data, then iterating solely on that loop. I’d coach new engineers: “If I taught you one thing, it’d be building this for your project. It speeds things up massively.”
Now: Every time I build something, I ensure every Claude PR includes screenshots or videos—whether iOS PRs or UI-layer changes. This builds immense confidence. Fable might work for hours, then return saying, “Done”—and you see “Here’s the full UI screenshot gallery.” That’s hugely useful. You might say, “Screenshot #8 shows an error state I’ve never seen—but I can see how users would encounter it. Let’s fix that.” Comprehensive verification is something we’re heavily prioritizing internally.
Second: You ultimately own responsibility for your work. Many people use Claude daily, yet accountability remains: “Claude wrote the code—but you must understand the high-level decisions made.” I’m seeing many engineers adopt this practice: After Claude completes work, they initiate a follow-up conversation—“Can I ensure I fully grasp all your tradeoffs?” Whatever small artifact makes that understanding easier is worth creating.
Meetings get interesting—someone says, “My PR is ready,” another asks, “Did you choose X or Y?” Then that pause: “Honestly, I’m not sure—I’ll clarify before merging.” Adapting to this new normal—and learning how to collaborate with it—is something everyone must master.
Host Dan Shipper: Your “verification loop” idea is brimming with potential. Beyond automated screenshots and screen sharing, what harder-core approaches are you exploring?
Mike Krieger:
Our core angle is: Can it run *real* workflows, not just inject static data? As systems grow more complex, this gets harder. For example, we must let Fable-built iOS apps log into our sandbox environment with one click, using real test accounts and high-fidelity, production-like data streams. Yet we don’t want it running a tedious 8-step new-user signup flow every time it tests a minor button tweak. So we built a special AI-focused permission system and encrypted shared-key infrastructure that lets the AI skip the entry gates and jump straight to the core business flows, so its testing closely mirrors what real users experience.
Second is combining known paths with the paths touched by the current change, which is ideal for regression testing. We’ve codified idealized workflows in text, which Claude can repeatedly validate. And Claude excels at articulating the intent behind its current changes, so the changed paths get exercised thoroughly too. Combining both is critical.
Visual verification is also key—and video is an extremely underutilized tool for Claude. Recently, I prototyped: Recording video of Claude-built outputs, feeding them to Claude with FFmpeg, watching it analyze frame-by-frame—then say, “This animation stutters—I’ll fix it.” Screenshots will *never* capture that, because they miss the precise moment.
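As a rough illustration of the video idea, a recording can be split into still frames with FFmpeg so that each frame can be reviewed in sequence. This is a minimal sketch, not the prototype described in the interview: it assumes `ffmpeg` is installed and on the PATH, and the function names, file names, and sampling rate are hypothetical.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video: Path, out_dir: Path, fps: int = 10) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second as PNGs."""
    return [
        "ffmpeg",
        "-i", str(video),                 # input recording
        "-vf", f"fps={fps}",              # fps filter: output frames per second
        str(out_dir / "frame_%04d.png"),  # numbered output images
    ]


def extract_frames(video: Path, out_dir: Path, fps: int = 10) -> list[Path]:
    """Run ffmpeg and return the extracted frames in playback order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video, out_dir, fps), check=True)
    return sorted(out_dir.glob("frame_*.png"))


# Show the command that would be run for a hypothetical recording.
print(" ".join(build_ffmpeg_command(Path("demo.mp4"), Path("frames"))))
```

Sampling at a fixed rate keeps the frame count predictable; a higher `fps` value catches shorter stutters at the cost of more images to inspect.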
For parts hard to test end-to-end, letting Claude build a reliable mock backend—or using existing ones—is also fascinating. In the Artifact era, we had comprehensive testing pre-LLMs. Every infra piece had robust in-memory implementations for rapid unit testing. Now extending this to Claude’s domain: I built something with a sturdy backend—hard to spin up on my dev server—so it instantly built a great substitute. Over time, this substitute evolves alongside the code itself. Previously I’d say, “Synchronizing this is too laborious.” Now I just think, “Claude reads the diffs, adapts the substitute, keeps both sides synced. Done.”
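The in-memory substitute pattern mentioned above can be sketched as follows. This is an assumption-laden illustration, not code from Artifact or Anthropic: `MediaStore`, `InMemoryMediaStore`, and `rename_item` are hypothetical names chosen for the example.

```python
from typing import Optional, Protocol


class MediaStore(Protocol):
    """Hypothetical interface that the application code depends on."""
    def save(self, item_id: str, title: str) -> None: ...
    def get(self, item_id: str) -> Optional[str]: ...


class InMemoryMediaStore:
    """In-memory stand-in for a real backend; intended only for tests."""
    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def save(self, item_id: str, title: str) -> None:
        self._items[item_id] = title

    def get(self, item_id: str) -> Optional[str]:
        return self._items.get(item_id)


def rename_item(store: MediaStore, item_id: str, new_title: str) -> bool:
    """Application logic under test: rename an item only if it exists."""
    if store.get(item_id) is None:
        return False
    store.save(item_id, new_title)
    return True


# The same logic runs against the fake without any server being started.
store = InMemoryMediaStore()
store.save("42", "Old title")
print(rename_item(store, "42", "New title"), store.get("42"))
print(rename_item(store, "missing", "Anything"))
```

Because the application code depends only on the interface, the fake and the real backend can evolve side by side, which is the synchronization work the interview describes handing to Claude.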
Host Dan Shipper: There are fascinating architectures emerging: A bug arrives, an agent auto-fixes it, then messages the customer: “Fixed.” Have you seen shifts in this flow on Fable?
Mike Krieger:
Several angles. At the human-Claude layer, one thing I repeatedly see: When someone reports a bug in our Slack feedback channel, that thread gets fed into a Claude Code session. Thanks to Slack MCP, it pulls that thread and replies *as me*: “This is Mike’s Claude—I’ve fixed it; here’s the PR link.” Then it adds: “Hold on—not deployed yet. I’ll notify you when live.” Hours later: “This deploy shipped. Please test if it’s fixed!” This kind of closed-loop follow-up is relatively new. I’ve had several long-running Claude Code sessions interacting *as me*. I’ve also added disclaimers inside them.
Second returns to our earlier theme of taste and judgment. One layer is “bug reported → fix it”; another is sound judgment. Last weekend, I faced a situation where an internal system had been running for ages without a reboot, and its memory leaks had piled up. Sound judgment sounds like: “Mike, it’s the weekend. Reboot the server now and it’ll solve the problem immediately. I’ll asynchronously open a PR for the long-term fix.” If you involve Claude in bug-to-fix flows, you *want* it to grasp what any good SRE or engineer understands: Fix the immediate problem first; a platform rewrite is a separate question. Understanding this balance is critical.
What People Should Build With This Model
Host Dan Shipper: What’s most electrifying about this new model generation is that they don’t just raise the floor—enabling anyone, regardless of background, to instantly build their own app—but also shatter experts’ ceilings. If you’re a professional engineer or founder today, you can single-handedly tackle previously unimaginable hardcore projects. What frontier domains do you see people not yet grasping—but which are fully viable with this generation?
Mike Krieger:
A few thoughts—maybe start with the fun ones. People constantly have creative ideas about expressing the complexity of their worlds—every domain has something you deeply understand, and often variations on ‘How do I explain this to others? Can I borrow techniques from elsewhere and apply them here?’ Take my wife—she’s recently diving deep into environmental engineering, focusing on geothermal energy—filled with headache-inducing complex math models and fluid dynamics simulations. But with Fable’s generational leap in reasoning, she’s successfully grafted hard-core techniques wholly outside her expertise onto her research. Now, she even instructs Fable to build full PyTorch end-to-end deep-learning simulation systems—unthinkable for a non-CS scholar just years ago.
Second is its ability to compose software to solve uniquely personal problems. Internally, much of our work involves MCP-ing as many internal systems as possible, paired with correct permissions and deployment setups. Externally, great PaaS platforms exist—you just ask Claude, and it builds them. But I especially love that “built something I’ve always wanted” feeling.
One thing recently stunned me. A colleague on our commercialization team—non-technical by background—has deeply integrated Claude into *every capillary* of her daily workflow. Most terrifyingly, she didn’t stop at V1—she’s taken this tool and, backed by the model, intensely iterated for months in the background.
This reveals the most underestimated, and sexiest, aspect of this generation of reasoning models. With earlier generations, which were barely adequate for this kind of work, projects hit a “complexity ceiling.” Once business logic or code volume crossed a threshold, large models began “focusing only on the head while ignoring the tail”: adding new features triggered cascading errors and wrecked the prior architecture.
But now, this non-coding colleague—empowered by Fable—has nurtured her system in the background for months. You can clearly watch that software evolve like a living organism—growing, growing, evolving wildly under AI’s care. She’s now deploying this massively complex, self-built system across our entire commercialization department.
A complete non-programmer, now single-handedly pushing long-cycle software complexity to a breathtaking height—this is unprecedented in human technological history.
Dynamic Workflows
Host Dan Shipper: You mentioned another powerful concept: dynamic workflows—can you expand on that?
Mike Krieger:
Internally, we often hack together these cutting-edge tools, and I pester the engineers building them in the office: “When will this *finally* go public?” Sometimes underlying infrastructure constraints force internal-only use, but we’re pushing hard to bring these gems to market ASAP. To me, dynamic workflows are among the most jaw-dropping of them.
Fable-class models are exceptionally powerful for two big reasons. First: They help you scaffold deep, meaningful work. The craziest thing I’ve done with it was handing Fable a complex internal Python project and asking it to fully refactor the entire core business logic into TypeScript—driven by specific production-deployment considerations.
Back at Instagram, leadership seriously debated: “Should we rewrite IG’s entire codebase in Hack to seamlessly integrate into Facebook’s infrastructure?” Our conclusion: Hard no—this is utterly unrealistic.
But last weekend, facing an equally tangled core codebase, I dropped it into a dynamic workflow in the background—and went off for the weekend. I defined the workflow: Deeply understand existing code; generate spec-like documentation explaining everything; translate module-by-module; run incremental tests; perform adversarial validation; check for omissions. When I returned Monday, magic happened—it was now a brand-new system running on TypeScript and Bun toolchains—and architecturally, in some layers, it was even more elegant and faster than my original Python version.
A second, sexier long-term reason: As dynamic workflows proliferate, soon we’ll seamlessly distribute subtasks of varying difficulty to corresponding tiers of models.
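To make the idea of a code-expressed workflow more tangible, here is a minimal sketch of ordered steps with a validation gate after each one. It is not the internal dynamic-workflow tooling discussed in the interview: the `Step` structure, the `tier` labels, and the stub functions standing in for model calls are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]  # shared state passed from step to step


@dataclass
class Step:
    name: str
    tier: str                        # hypothetical model tier, e.g. "large" or "small"
    run: Callable[[State], State]    # does the work and returns the updated state
    check: Callable[[State], bool]   # validation gate before moving on


def run_workflow(steps: list[Step], state: State) -> State:
    """Run steps in order, stopping at the first failed validation."""
    for step in steps:
        state = step.run(state)
        if not step.check(state):
            raise RuntimeError(f"validation failed after step: {step.name}")
        state.setdefault("log", []).append(f"{step.name} [{step.tier}]")
    return state


# Stubs standing in for model calls; a real workflow would invoke a model here.
def document(state: State) -> State:
    return {**state, "spec": [f"spec for {m}" for m in state["modules"]]}


def translate(state: State) -> State:
    return {**state, "translated": list(state["modules"])}


def no_omissions(state: State) -> bool:
    return set(state["translated"]) == set(state["modules"])


steps = [
    Step("understand and document", "large", document,
         lambda s: len(s["spec"]) == len(s["modules"])),
    Step("translate module by module", "small", translate, no_omissions),
]

result = run_workflow(steps, {"modules": ["auth", "billing"]})
print(result["log"])
```

Expressing each step as data with its own check makes the plan inspectable before it runs, and the per-step tier field is one simple way that subtasks of different difficulty could be routed to different models.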
Host Dan Shipper: For newcomers: How did you *build* that workflow? How did you design it? How did you ensure it was good?
Mike Krieger:
The entire tuning process was full of geeky iterative fun. I started by opening Claude Code and saying: “Bro—I’ve got a gnarly refactoring job. Let’s jointly design a fully automated workflow.”
It showed me a plan—I replied, “Close, but I need three or four extra validation layers to catch missing functionality.” It responded: “Here’s your plan. Ready?” Crucially, the workflow is *code-expressed*—I find this invaluable, because you see exactly what it intends to do.
After full migration, I had several small follow-up tweaks—I treated them as mini-workflows, building atop the previous output. This circles back to the chat-interface question. Workflows are a great middle ground: You orchestrate them via chat, but they’re expressed in code—and executed in a clean UI where every step is visibly rendered. I believe we’ll increasingly use similar patterns to bridge long-horizon work with chat.