Microsoft Revolutionizes AI Productivity Overnight; Altman Teases New Model! Copilot Now Fully Customizable, Developers Thrilled

Agents & Coding · Published: May 22, 2024 · Elena Volkov · ~8 min read

Author

Elena Volkov · Machine Learning Research Editor

Papers, benchmarks, and training economics — with the caveats spelled out.

The core technical claim is that Microsoft has successfully embedded AI into every layer of its productivity stack, moving from static tools to dynamic, customizable agents via GitHub Copilot Extensions and Phi-3. This would be falsified if the “natural language” interactions fail to resolve complex, multi-step developer workflows without significant human intervention or hallucination.

I read through the Build announcements with a critical eye, looking for substance beneath the hype. The announcements show a clear pivot toward making AI agents first-class citizens in the development lifecycle, but the reproducibility of these “custom” experiences depends entirely on the quality of the third-party connectors developers can build.

For over thirty years, Microsoft has had two dreams: First, can computers understand us, rather than requiring us to understand computers? Second, in this world of ever-increasing information, can computers help us reason, plan, and act more effectively based on all that data?

This wave of AI is the answer to those dreams.

The diverse productivity scenarios showcased at Build represent the stage where Microsoft realizes these visions.

Towards the end of the opening ceremony, Sam Altman appeared on stage to respond to questions and tease details about new models.

The market responded positively: Microsoft’s stock price surged to $431.84 at one point, extending significant gains over the past two days.

Let’s start with the continuous upgrades to Copilot.

GitHub Copilot Extensions: Natural Language Interaction Across the Board

For developers and teams, Microsoft has introduced GitHub Copilot Extensions, which let users customize their GitHub Copilot experience by bringing third-party services into natural-language interactions.

These extensions can be deployed immediately to Azure. Users can manage Azure resources through natural-language interactions, for example asking where a web application is deployed and troubleshooting the related code with a single click.

Any developer can create extensions for GitHub Copilot, incorporating various tools within the stack as well as internal proprietary tools.

By opening Copilot Workspace, developers can view the entire codebase and request suggested changes, which Copilot then applies automatically.

Microsoft also introduced Copilot Connectors, enabling developers to customize Copilot using third-party business data, applications, and workflows.

I think the promise of “natural language” control over Azure resources assumes the LLM can reliably map intent to API calls without dangerous side effects. Custom Copilots are also only as good as the proprietary data they ingest, which introduces significant privacy and leakage risks for enterprise teams.
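
To make the side-effect concern concrete, here is a minimal sketch of one possible guardrail: a gate that runs read-only actions, holds destructive ones for human approval, and rejects anything it does not recognize. The action names, the `ProposedAction` type, and the `gate` function are all hypothetical and are not part of any Azure or Copilot API.

```python
from dataclasses import dataclass

# Hypothetical action names for illustration; not real Azure or Copilot calls.
READ_ONLY_ACTIONS = {"list_deployments", "get_logs"}
DESTRUCTIVE_ACTIONS = {"delete_resource", "restart_app"}


@dataclass(frozen=True)
class ProposedAction:
    name: str    # action the model wants to run
    target: str  # resource the action applies to


def gate(action: ProposedAction, human_approved: bool = False) -> str:
    """Decide whether a model-proposed action may run.

    Returns "run", "needs_approval", or "reject".
    """
    if action.name in READ_ONLY_ACTIONS:
        return "run"
    if action.name in DESTRUCTIVE_ACTIONS:
        return "run" if human_approved else "needs_approval"
    # Anything outside both lists is treated as a possible hallucination.
    return "reject"


print(gate(ProposedAction("get_logs", "web-app")))
print(gate(ProposedAction("delete_resource", "web-app")))
```

The design choice here is to default to rejection: an action the gate has never heard of is more likely a misparsed intent than a legitimate request.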

Team Copilot: A Key Member of the Team

Microsoft has expanded its Copilot ecosystem with Team Copilot, shifting the paradigm from a personal assistant to an integrated team member. According to the announcement, this feature integrates directly into team chat groups, where it acts as a meeting moderator by recording content in real time.

The system captures the entire meeting content live, then organizes topics and takes notes based on discussion progress with a single click. Other group members retain the ability to modify Copilot’s recorded content.

Team members can query Copilot directly if issues arise during discussions. When consensus is reached on a discussion point, the system automatically updates the earlier notes in real time.

One caveat: real-time transcription accuracy in noisy office environments remains a significant technical hurdle not addressed here. I think the assumption that users will trust an AI to auto-update consensus notes ignores potential liability for misinterpretation. Without explicit version control, the “real-time update” feature risks overwriting human corrections with hallucinated summaries.
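
As an illustration of what explicit version control could look like, the sketch below keeps an append-only revision history and refuses an AI update that would land on top of a human correction. The `Note` and `Revision` types and the "ai" author label are assumptions made for this example; nothing here describes how Team Copilot actually stores notes.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    author: str  # "ai" or a human user name (hypothetical convention)
    text: str


@dataclass
class Note:
    history: list[Revision] = field(default_factory=list)

    @property
    def current(self) -> str:
        return self.history[-1].text if self.history else ""

    def update(self, author: str, text: str) -> bool:
        """Append a revision; refuse AI edits on top of a human correction."""
        last_is_human = bool(self.history) and self.history[-1].author != "ai"
        if author == "ai" and last_is_human:
            # Keep the human text; a real system could show the AI text as a suggestion.
            return False
        self.history.append(Revision(author, text))
        return True


note = Note()
note.update("ai", "Ship Friday")
note.update("dana", "Ship Monday")         # human correction
accepted = note.update("ai", "Ship Friday")  # AI tries to revert it
print(accepted, note.current)
```

Because every revision is retained, a disputed summary can be traced back to who wrote which version, which speaks to the liability concern as well.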

Agents Can Be Customized

Simultaneously, Microsoft Copilot Studio has introduced features allowing developers to customize Agents. Developers can now define an Agent’s role or select from existing templates.

The update allows developers to delegate permissions to Copilots assigned to specific roles, aiming to automate business processes. If an Agent encounters a problem it does not understand or cannot handle, it is designed to proactively present the issue and seek assistance. Furthermore, Agents possess the ability to learn from user feedback.

Nadella stated on stage:

I believe this is one of the key factors that will truly drive transformation in the coming year.

One caveat: “Learning from user feedback” requires strict guardrails to prevent agents from reinforcing biased or incorrect behaviors. I think the claim of driving transformation relies on the unproven assumption that enterprise workflows are ready for autonomous agent delegation.
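
The escalation behavior described in this section can be sketched as a simple dispatch rule: run a task only when it is known and the agent's confidence clears a threshold, and otherwise hand it to a human. The handler names, the confidence score, and the 0.8 threshold are all hypothetical; Copilot Studio's actual mechanism is not documented in the announcements covered here.

```python
from typing import Callable

# Hypothetical task handlers; a real agent would call business systems here.
HANDLERS: dict[str, Callable[[str], str]] = {
    "reset_password": lambda user: f"password reset link sent to {user}",
    "order_status": lambda order_id: f"order {order_id} is in transit",
}


def handle(task: str, argument: str, confidence: float, threshold: float = 0.8) -> dict:
    """Run a known task when confidence is high enough; otherwise escalate."""
    handler = HANDLERS.get(task)
    if handler is None:
        return {"status": "escalated", "reason": f"unknown task: {task}"}
    if confidence < threshold:
        return {"status": "escalated", "reason": f"low confidence: {confidence:.2f}"}
    return {"status": "done", "result": handler(argument)}


print(handle("order_status", "A17", confidence=0.95))
print(handle("refund", "A17", confidence=0.99))
```

Any feedback-driven learning would presumably sit behind a similar gate, so that a correction changes behavior only after review.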

The Push for Small Language Models

I followed Microsoft’s latest update to its Phi-3 family, a move that reinforces their commitment to Small Language Models (SLMs). This isn’t just about size; it’s about efficiency and accessibility. The lineup now includes five distinct models: Phi-3-mini (3.8B parameters), Phi-3-small (7B), Phi-3-medium (14B), Phi-3-vision (4.2B), and the new Phi-3-Silica (3.3B). Context lengths vary, with most supporting 128k tokens, though some variants offer shorter 4k or 8k options depending on the specific configuration.
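
For a rough sense of what these parameter counts mean on a device, the sketch below estimates weight storage alone at two precisions. The 16-bit and 4-bit settings are illustrative assumptions, not a statement of what Microsoft ships, and the estimate excludes the KV cache, activations, and runtime overhead.

```python
# Parameter counts as listed above, in billions.
PHI3_PARAMS_B = {
    "Phi-3-mini": 3.8,
    "Phi-3-small": 7.0,
    "Phi-3-medium": 14.0,
    "Phi-3-vision": 4.2,
    "Phi-3-Silica": 3.3,
}


def weight_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


for name, params in PHI3_PARAMS_B.items():
    fp16 = weight_gb(params, 16)
    int4 = weight_gb(params, 4)
    print(f"{name}: 16-bit ~ {fp16:.1f} GB, 4-bit ~ {int4:.1f} GB")
```

Under these assumptions, the mini model needs about 7.6 GB of weights at 16 bits and about 1.9 GB at 4 bits, which is why quantization matters so much for the on-device story.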

Phi-3-mini, originally unveiled in April, already drew attention for matching Llama 2’s performance. With the addition of small and medium variants, these models are now accessible via Azure Machine Learning’s catalog. Phi-3-Silica, the smallest at 3.3B parameters, is slated to embed into Copilot+ PCs starting in June.

One caveat: claims of efficiency often ignore the overhead of running multiple background services on consumer hardware.

Microsoft asserts that Phi-3-Silica processes the prompt at 650 tokens per second (the rate that determines first-token latency) while consuming only about 1.5 watts. This low power draw supposedly ensures it doesn’t interfere with normal workloads or memory usage. Token generation, however, reuses the KV cache from the NPU and runs on the CPU at roughly 27 tokens per second, so sustained output is far slower than the headline figure suggests.
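
A quick back-of-the-envelope calculation shows how the two figures combine. This sketch reads 650 tokens per second as the rate for processing the prompt before the first output token and 27 tokens per second as the generation rate afterward; that reading, the function name, and the example token counts are assumptions for illustration.

```python
PREFILL_TOKENS_PER_S = 650.0  # prompt processing rate cited above
DECODE_TOKENS_PER_S = 27.0    # token generation rate cited above


def response_seconds(prompt_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (time to first token, total time) in seconds."""
    first_token = prompt_tokens / PREFILL_TOKENS_PER_S
    total = first_token + output_tokens / DECODE_TOKENS_PER_S
    return first_token, total


ttft, total = response_seconds(prompt_tokens=1300, output_tokens=270)
print(f"first token after {ttft:.1f} s, full reply after {total:.1f} s")
```

With a 1,300-token prompt and a 270-token reply, generation accounts for 10 of the 12 seconds, so the slower figure dominates the user-visible latency.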

Phi-3-vision serves as the multimodal entry point for this family, designed specifically for mobile devices. Building on Phi-3-mini, it handles everyday visual reasoning but is currently limited to reading images; it cannot generate them. Microsoft optimized it heavily for charts, allowing it to analyze graphs and answer questions based on visual data.

One caveat: chart analysis is a narrow task; general visual reasoning remains significantly harder than these demos suggest.

During the presentation, Satya Nadella demonstrated this by feeding Phi-3-vision a chart showing AI tool usage across different workplace age groups. The model accurately extracted data, compared demographics, and generated a detailed report.

In terms of evaluation, the text-only Small and Medium models outperformed competitors of similar size. Notably, Phi-3-mini, with under 4 billion parameters, surpassed Llama 3-8B, which has roughly twice its parameter count.

Phi-3-small beat GPT-3.5-Turbo in tests of language understanding, reasoning, and mathematics. However, it lagged behind in coding and showed a more noticeable gap in knowledge retention.

The Medium version competes directly with Claude 3 Sonnet and Gemini 1.0 Pro. Its strengths mirror the Small model’s: strong language understanding, reasoning, and math skills. Yet, knowledge retention remains a consistent weakness across the family.

Altman’s Brief Return to the Stage

Nadella reaffirmed that OpenAI remains Microsoft’s primary strategic partner. Two hours into the keynote, Sam Altman appeared to close the session, though not alongside Nadella; he stood with Microsoft CTO Kevin Scott instead.

In a nine-minute address, Altman covered OpenAI’s roadmap, GPT-4o, and developer guidance. He framed the GPT-4o launch as part of a “crazy week,” noting unprecedented adoption velocity for such meaningful technology. While he avoided explicit mention of the recent voice feature controversy (the “Black Widow” incident), he emphasized the capabilities of their new speech modality:

As AI speeds up and costs decrease, OpenAI has been able to introduce new modalities like speech.

The speech modality was actually a real surprise for me.

He urged developers to build now rather than wait, comparing this moment to the advent of smartphones or the internet. However, he cautioned that AI will not automatically overturn the existing rules of business. Altman also teased an imminent release of OpenAI’s most powerful model, saying that new modalities and greater general intelligence may sound mundane but are crucial.

I think the comparison to the smartphone era ignores the current lack of standardized evaluation metrics for multimodal reasoning. Claiming unprecedented adoption velocity also requires transparent, auditable usage data, which remains opaque.