Microsoft Build 2026 Developer Conference: The “Agent-First” Era Has Arrived—Seven In-House Models Launched at Once

2026.06.03


Microsoft Build Unveils Seven New Models; First Flagship Reasoning Model Challenges Anthropic.

2026.06.03 - 01:45:42


By Li Hailun

Edited by Xu Qingyang

On June 2, 2026, local time in the United States, Microsoft’s Build 2026 Developer Conference opened at Fort Mason in San Francisco. Centered on practical applications of cutting-edge AI technologies, the conference unveiled a suite of new products and updates spanning proprietary AI models, agent-based applications, OS-level security, developer tools, cloud services, and novel hardware platforms.

At last year’s Build 2025, Microsoft established the vision of an “AI Agent Era,” launching Copilot Studio for multi-agent orchestration, Windows AI Foundry, and announcing full support for the Model Context Protocol (MCP). GitHub Copilot introduced its programming agent—Coding Agent.

In Microsoft’s narrative, Build 2025 addressed the question of “what standards and frameworks should define the Agent Era,” while Build 2026 focuses on “how to truly operationalize agents using our own models and products”—bolstering the model layer with production-ready proprietary models, and advancing agents from demos to full-stack deployment across systems, hardware, and cloud.

This year’s keynote announcements can be grouped into six core areas: the MAI family of proprietary models; the agent ecosystem represented by Scout and GitHub Copilot applications; Windows’ system-level AI security sandbox, MXC; Surface RTX Spark Dev Box and system optimizations for developers; Project Solara—a new agent-native device platform; and developer tools and governance frameworks including Microsoft IQ, Rayfin, ASSERT, and ACS.

01 Seven Models Trained from Scratch—No Distillation

The keynote speech unfolded around CEO Satya Nadella’s strategic vision. After introducing the “Agent-First” framework, executives from various business units took the stage to deliver concrete products that bring this vision to life.

At the event, Mira Murati announced seven new models developed entirely in-house by Microsoft AI, collectively branded as the MAI family.

She described MAI’s mission as building a “mountain-climbing machine”—continuously improving itself through ever-greater compute investment, higher-quality data, and more precise evaluation—so users remain perpetually at the forefront of technology.

On training scale, Murati noted that compute used for training state-of-the-art models has grown one trillion-fold—and is projected to increase another thousand-fold over the next three years. All MAI models are trained “from scratch, zero distillation,” meaning they rely exclusively on original training data without leveraging outputs from third-party models.
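To put the projected thousand-fold increase in perspective, the implied yearly growth can be worked out in a few lines. This is only a sketch: it assumes smooth exponential growth across the three years, which the keynote did not state, and the function name is illustrative.

```python
import math


def growth_profile(total_factor: float, years: float) -> tuple[float, float]:
    """Return (annual growth factor, doubling time in months) for a quantity
    that grows by `total_factor` over `years`.

    Assumes smooth exponential growth (an assumption, not a keynote claim).
    """
    annual_factor = total_factor ** (1.0 / years)
    doubling_months = 12.0 * years * math.log(2) / math.log(total_factor)
    return annual_factor, doubling_months


annual, doubling = growth_profile(1000.0, 3.0)
print(f"annual factor: {annual:.1f}x, doubling time: {doubling:.1f} months")
```

Under that assumption, a thousand-fold rise in three years means roughly ten times more compute each year, or a doubling about every three and a half months.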

Mira Murati, Head of Microsoft AI, introduces the seven proprietary models

The specific models are as follows:

MAI-Thinking-1—the flagship reasoning model—is a mid-size model. Microsoft claims it matches top-tier models on key software engineering benchmarks. In blind human evaluations, its preference score was on par with Sonnet 4.6. Trained from scratch on clean data, it incorporates no distilled knowledge from external models.

MAI-Code-1-Flash—a highly efficient, agentic coding model optimized for inference—features 5 billion parameters and is deeply integrated with GitHub Copilot, VS Code, and Microsoft’s broader tech stack. Microsoft says it rivals Haiku in performance but at lower cost.

MAI-Image-2.5 and its ultra-efficient Flash variant—text-to-image and image-editing models—reportedly outperform Google’s Nano Banana Pro on Arena benchmarks.

MAI-Transcribe-1.5—a transcription model—achieves state-of-the-art accuracy. It runs five times faster than competing models and natively supports domain-specific terminology recognition across 43 languages.

MAI-Voice-2—a high-fidelity, natural-sounding voice generation model—supports 15 languages, adapts voices from short audio samples, and includes anti-abuse safeguards. Its Flash variant is forthcoming, delivering equivalent functionality at reduced cost.

All models share identical data specifications, infrastructure, and evaluation frameworks. Beyond distribution via Azure Foundry and optimization for Microsoft first-party products, the models will also be available to developers on OpenRouter, Fireworks, and Baseten. For the first time, developers can fine-tune the model weights themselves.

Nadella introduced Microsoft Frontier Tuning—a method enabling enterprises to customize models using their own operational data. The underlying principle is that the most valuable data isn’t generic corpora, but real-world task traces, steps, and decisions generated by agents inside an enterprise.

Microsoft CEO Nadella introduces Frontier Tuning

This mechanism integrates MAI models directly into real business workflows, enabling them to learn-by-doing in live environments. Murati said: “You’re building your own model—trained in your environment, on your data, under your control. Your institutional knowledge becomes part of the model—and belongs solely to you.”

In practice, the MAI model fine-tuned for Excel performs on par with GPT-5.4 while operating ten times more efficiently. McKinsey reported that after adopting Frontier Tuning, MAI achieved the highest win rate among all tested models and cut costs roughly tenfold.

In healthcare, Microsoft announced a collaboration with Mayo Clinic to co-develop a frontier AI model for healthcare. This model combines Mayo Clinic’s clinical expertise, de-identified clinical data, and longitudinal insights with Microsoft’s foundational AI capabilities.

Microsoft also revealed that MAI models are co-designed with its proprietary Maia 200 chip, achieving a 1.4x efficiency boost through software-hardware co-optimization.

02 Full-Scale Deployment of the Agent Ecosystem

Microsoft declared a sweeping “Agent-First” transformation aimed at automating how knowledge workers interact with software—embedding AI assistants directly into everyday workplace interactions.

Scout is the centerpiece agent product launched at the event. Dubbed “always-on,” this AI Agent is built atop the OpenClaw framework and interacts in Microsoft Teams just like a human colleague.

Scout browses users’ work messages, calendars, and email inboxes to automatically complete tasks, reschedule conflicting meetings, and draft professional-sounding replies. Users can issue commands directly in Teams or even assign Scout a custom name.

Omar Shahin, newly appointed Corporate Vice President at Microsoft, explained Scout’s design philosophy: “Your company, essentially, has hired your assistant. The whole point of having a personal assistant is that they keep working—even when you’re not.”

Scout is delivered via Microsoft’s Frontier Program and requires a GitHub Copilot subscription. Microsoft is currently testing a dedicated Scout desktop app, which will roll out to subscribers who opt into “Frontier” feature access. Shahin noted that within Microsoft, sales teams are the largest and fastest-growing user group.

The GitHub Copilot desktop application is another major announcement. Mario Rodriguez, GitHub’s Chief Product Officer, described it as “an agent-native desktop experience, built on top of GitHub.”

Through a unified “My Work” view, developers see dynamic activity across connected repositories—including active sessions, issues, pull requests, and background automation. Each session runs in its own Git worktree, ensuring parallel agents operate independently. The app features Agent Merge, guiding pull requests through review, checks, and merge. Its Canvas interface enables bidirectional human-agent interaction, allowing developers to inspect, guide, and validate agent-performed work.
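The one-worktree-per-session idea can be sketched with Git's own `git worktree add` command. The branch prefix and directory layout below are hypothetical and do not describe how the Copilot app names things; the function only builds the command so it can be inspected before anything runs.

```python
from pathlib import Path


def worktree_command(repo: Path, session_id: str) -> list[str]:
    """Build the `git worktree add` invocation that gives one agent session
    its own checkout and branch. The naming scheme is hypothetical."""
    branch = f"agent/{session_id}"
    checkout = repo.parent / f"{repo.name}-worktrees" / session_id
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(checkout)]


# Two sessions against the same repository get separate directories and
# branches, so their edits cannot collide on disk.
for session in ("fix-login", "update-docs"):
    print(" ".join(worktree_command(Path("/src/app"), session)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` inside a real repository would create the worktree; each session then commits on its own branch until its pull request is merged.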

The GitHub Copilot desktop app is available in technical preview for Windows 11, Windows 11 on Arm, macOS, and Linux. A GitHub Copilot subscription is required, with plans to extend access to Copilot Free users later. The app supports cloud and local sandboxes as well as code review, with policy-enforced controls for both kinds of sandbox.

For agent safety and governance, Microsoft released the Agent Control Specification (ACS)—a new open-source standard designed to give developers more consistent, granular control over AI agent behavior. ACS allows development, compliance, and security teams to define policy files specifying what agents may do, what they must never do, when human approval is required, and what evidence must be logged for auditing.

ACS is released as an SDK with plugins for LangChain, OpenAI Agents SDK, Anthropic Agents SDK, AutoGen, CrewAI, Semantic Kernel, Microsoft.Extensions.AI, and MCP tools. Because policies are defined in single files, they can be bundled with agents and travel seamlessly across frameworks and environments.
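The four things a policy file is said to specify (what is permitted, what is forbidden, what needs human approval, and what gets logged) can be illustrated with a toy evaluator. Everything here is hypothetical: the class, field names, and action names are invented for illustration and are not the real ACS schema or SDK.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical policy shape; NOT the real ACS schema."""
    allowed: set[str] = field(default_factory=set)
    forbidden: set[str] = field(default_factory=set)
    needs_approval: set[str] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def decide(self, agent_id: str, action: str) -> str:
        # Prohibitions win over everything else; unknown actions are denied.
        if action in self.forbidden:
            decision = "deny"
        elif action in self.needs_approval:
            decision = "ask_human"
        elif action in self.allowed:
            decision = "allow"
        else:
            decision = "deny"
        # Record evidence for every decision so it can be audited later.
        self.audit_log.append(
            {"agent": agent_id, "action": action, "decision": decision}
        )
        return decision


policy = AgentPolicy(
    allowed={"read_calendar", "draft_email"},
    forbidden={"delete_mailbox"},
    needs_approval={"send_email"},
)
for action in ("read_calendar", "send_email", "delete_mailbox", "open_shell"):
    print(action, "->", policy.decide("scout-01", action))
```

Denying anything not explicitly listed is a design choice made for this sketch; the article does not say how ACS treats unlisted actions.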

ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) is a new open-source testing framework that uses AI to translate high-level natural-language descriptions of goals, policies, or expected behaviors into structured, scored tests.

ASSERT accepts concise natural-language descriptions of expected AI model behavior, then generates sets of acceptable and unacceptable behaviors, scenario definitions, and test cases. It executes tests against target systems and scores outcomes. It also logs the AI system’s decision paths—including intermediate actions and tool calls—so developers can pinpoint failure points.
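The run-and-score half of that workflow can be sketched as follows. This is not ASSERT's API: the `Case` type, the `score` function, and the toy agent are invented for illustration, and the test cases are written by hand where the real tool would reportedly generate them from a natural-language spec.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Case:
    scenario: str                 # input given to the system under test
    must_call: Optional[str]      # tool the trace must contain, if any
    must_not_call: Optional[str]  # tool the trace must never contain, if any


# A system under test returns its trace: the ordered tool calls it made.
System = Callable[[str], list[str]]


def score(system: System, cases: list[Case]) -> tuple[float, list[dict]]:
    """Run every case, keep the trace, and return (pass rate, per-case report)."""
    report = []
    for case in cases:
        trace = system(case.scenario)
        ok = (case.must_call is None or case.must_call in trace) and (
            case.must_not_call is None or case.must_not_call not in trace
        )
        report.append({"scenario": case.scenario, "passed": ok, "trace": trace})
    passed = sum(1 for row in report if row["passed"])
    return passed / len(report), report


# Hand-written stand-ins for what an AI step would derive from the spec
# "refunds need manager approval; never email card numbers".
cases = [
    Case("refund $250", must_call="request_manager_approval", must_not_call=None),
    Case("send receipt", must_call="send_email", must_not_call="attach_card_number"),
]


def toy_agent(scenario: str) -> list[str]:
    # Deliberately flawed: it refunds without asking for approval.
    if scenario.startswith("refund"):
        return ["lookup_order", "issue_refund"]
    return ["lookup_order", "send_email"]


rate, report = score(toy_agent, cases)
print(f"pass rate: {rate:.0%}")
for row in report:
    print(row)
```

Keeping the full trace in each report row is what lets a developer see that the failing case skipped the approval step, rather than only learning that it failed.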

03 As Agents Grow More Autonomous, MXC Draws the Line at the System Level

As AI agents grow increasingly powerful and autonomous, Microsoft identified a critical challenge: the more autonomous agents become, the more useful they are—and the more dangerous it is to let them run unbounded across enterprise networks. Microsoft’s official blog describes this as a “multi-layered systems problem,” where every interaction between agents and humans, tools, applications, models, or other agents “exposes new attack surfaces and introduces distinct failure modes.”

To address this, Microsoft unveiled Microsoft Execution Containers (MXC)—a policy-driven execution layer built directly into the Windows OS. Pavan Davuluri, Microsoft’s EVP of Windows & Devices, emphasized that MXC is essential to making AI agents commercially viable: “It centers on security, containment, isolation, and user control”—ensuring agents are safe for both consumers and enterprise deployments.

Microsoft CEO Nadella introduces the system-level security sandbox MXC

MXC is fundamentally an SDK and policy model embedded in Windows and the Windows Subsystem for Linux (WSL), delivering what Microsoft calls a “composable spectrum of sandboxes.” This spectrum ranges from lightweight process isolation—already adopted by GitHub Copilot’s CLI—to micro VMs, Linux containers, and full cloud instances running on Windows 365.

The system isolates agent execution from the user’s desktop, clipboard, UI, and input devices. Each agent is bound to an identity—either a local ID or a cloud-provisioned identity backed by Microsoft Entra—ensuring every agent action is attributable, auditable, and governable.

MXC is now available in early preview. Agent 365—a version tightly integrated with Microsoft’s enterprise security stack—will enter preview in July 2026, layering Entra identity services, Intune device management, Defender threat protection, and Purview data governance onto MXC, enabling centralized IT management of agent isolation.

Partners already building on MXC include OpenAI, NVIDIA, Manus, Nous Research (creator of Hermes Agent), and the OpenClaw open-source project.

Notably, the OpenClaw partnership originated when its creator, Peter Steinberger, proactively reached out to Microsoft expressing interest—eventually evolving into a comprehensive, platform-level alliance.

04 Three Edge Updates Enable “Offline-First” Local AI

The Microsoft Edge browser also received upgrades to its local AI capabilities. Since introducing Phi-4-mini at Build 2025, the team has expanded its on-device AI features based on feedback from web developers.

First is Aion-1.0-Instruct—a smaller, faster, and more efficient local LLM than Phi-4-mini. It runs on PCs with modest GPU and CPU resources and is now available in developer preview, with public release on Hugging Face scheduled for July.

Second are the Language Detection and Translation APIs, shipping with Edge 148. Powered by Edge’s on-device AI models and accessible via JavaScript, these APIs let websites and browser extensions detect the language of a text and translate between language pairs. Microsoft claims they deliver “fast, high-quality translations across 145+ languages, optimized for web translation workloads,” and they are offered free of charge.

Third is speech recognition via the Web Speech API, available experimentally in Edge Canary and Dev channels. This API helps developers integrate speech/audio input into websites and browser extensions—running locally on-device, while optionally falling back to cloud-based speech-to-text and text-to-speech services.

05 Iterations in Developer Tools and Cloud Services

At the data intelligence layer, Microsoft launched Microsoft IQ—unifying four previously independent context sources into a shared foundation for agents.

Amir Netz, CTO of Microsoft Fabric, drew an analogy: “The green code waterfall in The Matrix isn’t just decoration—it’s the bedrock of that world. What we’re doing in the data world is building a data-grounded reality for agents.”

Microsoft IQ comprises four context sources:
• Work IQ: captures how organizations operate daily, drawing from emails, documents, meetings, and calendars;
• Foundry IQ: manages institutional knowledge by curating and indexing knowledge bases;
• Fabric IQ: models real-time business operations using data, defining entities, relationships, and business rules anchored to real-time signals powered by Fabric’s real-time intelligence (expected to launch publicly in the coming months);
• Web IQ: adds real-time global context sourced from the web.

With this contextual architecture, agents evolve beyond simple command executors into virtual employees who understand how the company operates.

But a shared “foundation” alone isn’t enough. When agents begin generating applications, each needs a backend—and left unchecked, those apps risk forming new data silos outside the context layer. To prevent this, Microsoft launched Rayfin—an open-source SDK and CLI that deploys agent-built applications directly onto the Fabric platform as governed production backends. Application data defaults into the unified OneLake data lake and feeds back into Microsoft IQ—rather than accumulating externally.

Positioned as a competitor to Supabase and Neon, Rayfin’s core differentiator is governance: all applications flow through the same data and compliance pipeline. Netz described it as a two-way loop—agents draw information from corporate data policies when building apps, and the resulting app data updates those policies in turn, so the next agent benefits from the latest rules.

Microsoft also introduced WSL container support, enabling developers to create and manage Linux containers directly on Windows. It ships with a CLI and API, letting Linux containers run inside native Windows applications. Public preview launches in the coming months.

To eliminate setup friction, Microsoft released Windows Developer Configurations—a quick way to configure new machines with developer-optimized settings—automatically installing WSL, PowerShell 7, and Visual Studio Code, while enabling Git version control and showing hidden files in File Explorer.

06 Two New Hardware Devices Bring Heavy AI Workloads Back On-Device

This Build wasn’t just about models, agents, and dev tools—hardware had a strong presence too. As AI computing demands escalate and agentic workflows require sustained compute, Microsoft turned its attention to developers’ local devices: rather than constantly renting expensive cloud GPUs, why not run these workloads locally?

Andrew Hill, VP of Surface Products, announced two new devices:

Surface RTX Spark Dev Box is a compact developer PC powered by NVIDIA’s RTX Spark superchip—combining NVIDIA Blackwell RTX GPU and NVIDIA Grace CPU—to deliver up to 1 petaflop of AI compute and 128 GB unified memory.

Its aluminum chassis doubles as a heatsink, engineered for extended training jobs, large-model inference, and complex agentic workflows. Preinstalled with Windows 11 Pro, the device ships with developer-optimized configurations baked into the OS image: dark theme enabled, simplified taskbar, widgets removed, Do Not Disturb activated, Developer Mode enabled, and PowerShell 7 set as default shell. WSL 2 is preconfigured with GPU passthrough and CUDA support; VS Code, GitHub Copilot, Git, Python, and Node.js come preinstalled.

Security-wise, Surface RTX Spark Dev Box is built on Microsoft’s end-to-end zero-trust architecture—from silicon to cloud—including Secured-core PC design, BitLocker encryption, and Microsoft Defender protection, with seamless integration into Entra ID and Intune for large-scale management and governance.

Hill explained: “How developers build software is undergoing a fundamental shift. AI models are growing more capable and complex; agentic workflows demand continuous compute power; and even tasks that don’t require state-of-the-art models can incur cloud costs with every iteration.”

The other device, Surface Laptop Ultra—a high-performance laptop tailored for developers, creators, and technical professionals—launched earlier this year. Together, these devices represent the next evolution of Surface: purpose-built hardware for those building the future. Surface RTX Spark Dev Box will launch later this year in the U.S., exclusively on Microsoft.com.

07 A New Platform for Running AI Agents—Not Apps—on Devices

Stevie Bathiche, Head of Microsoft Applied Sciences, introduced Project Solara—an internal initiative.

It’s a new chip-to-cloud platform built on Android—not Windows—designed to run AI agents instead of traditional applications. Bathiche explained its origin: “Boundaries are collapsing. You don’t necessarily need conventional app paradigms—or traditional ways of building experiences.”

Two concept devices debuted at Build:

A desktop hub—placed beside a PC—responds to voice commands, logs users in via facial recognition, and surfaces today’s most urgent items. When connected to a monitor, it transforms into a full Windows machine running in the cloud.

A wearable badge—reimagining the standard employee ID card—activates its agent with a fingerprint tap. A light touch records and transcribes conversations, while its built-in camera lets the agent act on what the user sees.

In a healthcare demo, the badge ran an agent designed for clinicians—scanning patient QR codes, recording and transcribing visits, logging vital signs, and issuing prescriptions. In another use case, its camera scanned a brainstorming board listing office renovation ideas—and suggested adding greenery.

Bathiche emphasized Microsoft won’t manufacture these devices itself. Instead, it envisions hardware makers and industry partners turning these reference designs into products—each customized for specific industries, companies, or scenarios.

08 Quantum Chip Upgrade Boosts Reliability 1,000x

Microsoft also unveiled its next-generation topological quantum chip: Majorana 2.

Compared to Majorana 1, the key upgrade is replacing aluminum with lead as the superconducting material—boosting qubit reliability by 1,000x. Average qubit lifetime now reaches 20 seconds, with some instances lasting up to one minute.

Other quantum approaches typically achieve only microsecond-scale coherence times. Leveraging this breakthrough, Microsoft halved its projected timeline for scalable quantum computers—now targeting delivery before 2029.
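The gap between the two figures is easier to grasp as a ratio. The sketch below assumes that “microsecond-scale” means somewhere between 1 and 100 microseconds, and that the reported qubit lifetime is directly comparable to coherence time; neither assumption is confirmed by the article.

```python
def lifetime_ratio(long_lived_s: float, short_lived_s: float) -> float:
    """How many times longer the first lifetime is than the second."""
    return long_lived_s / short_lived_s


majorana_avg_s = 20.0  # average qubit lifetime reported for Majorana 2

for micro in (1, 10, 100):  # assumed range for "microsecond-scale"
    ratio = lifetime_ratio(majorana_avg_s, micro * 1e-6)
    print(f"vs {micro} microseconds: about {ratio:.0e} times longer")
```

On those assumptions, the reported 20-second average is between roughly two hundred thousand and twenty million times longer.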

The entire Majorana 2 development cycle drew on agentic AI capabilities from the Microsoft Discovery platform. AI agents handled manufacturing management, automated quantum-state measurements, and cross-disciplinary data analysis, shortening measurement cycles that once took weeks by several orders of magnitude and uncovering subtle correlations hidden in nearly two decades of accumulated data.

Chetan Nayak, Microsoft Technical Fellow, stated: “Agentic AI permeates nearly everything we do.” Yet he stressed AI serves only as guidance: “Scientists remain firmly in the loop.”

The Microsoft Discovery platform was formally launched at this conference—a company-wide platform for frontier R&D that empowers researchers to deploy autonomous agent teams guided by human experts—enabling hypothesis generation, experimental optimization, and theoretical validation. Microsoft also released an early preview of the Microsoft Discovery application—freely downloadable for individuals to run locally using a GitHub Copilot account.

Special translation contribution by Jin Lu.


Author: 腾讯科技 (Tencent Technology)
