Microsoft 365 Copilot’s new agent uses Claude to fact-check GPT’s work

Summary

Microsoft 365 Copilot mixes GPT drafting with Claude fact-checking for stronger research outputs.
Researcher’s Critique scores 13.8% higher on DRACO by combining drafting and citation checks.
Council shows multiple model answers and disagreements, letting you assemble the best workflow.

Things are getting interesting in the world of AI. First, companies used each other’s AI models. Then, we had a moment where everyone battened down the hatches and began focusing purely on making their model the best one. Now, we’re entering an era where each AI model does something the others can’t, so the only way to provide a truly stellar service is to mix different LLMs together.

Such is the case with Microsoft 365 Copilot’s newest agent, which combines GPT and Claude when performing research. GPT handles all the drafting, while Claude acts as the strict editor, fact-checking the result and ensuring everything is up to par. And the best part is, it works.

Microsoft 365 Copilot Researcher’s new feature delivers the best of two worlds

Two experts are better than one

In a press release, Microsoft revealed that Copilot Cowork is moving into the Frontier preview program. Copilot Cowork combines Claude’s capabilities with Microsoft’s own to create an agent you can delegate work to. It goes beyond the simple ‘chatbot style’ of LLMs and becomes a digital assistant of sorts.

One of the more exciting new features involves a new tool for Researcher, which combines two LLMs so that each one works in harmony with what it does best. As Microsoft describes it:

Researcher’s new Critique feature takes this even further, putting GPT and Claude to work together on every response: GPT drafts, Claude reviews for accuracy, completeness, and citation integrity before it’s delivered. […] The results are measurable—Researcher now scores 13.8% higher on the Deep Research Accuracy, Completeness, and Objectivity, or DRACO benchmark, the industry standard for deep research quality.
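To make the draft-then-review idea concrete, here is a minimal sketch of how such a pipeline could be wired together. Microsoft has not published how Critique is implemented, so everything here is an assumption for illustration: the `critique_pipeline` function, the `Review` type, the revision loop, and the two stand-in model functions are all hypothetical and call no real GPT or Claude API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    """Hypothetical reviewer verdict: approve, or list what needs fixing."""
    approved: bool
    issues: List[str]


def critique_pipeline(
    prompt: str,
    draft_fn: Callable[[str, List[str]], str],
    review_fn: Callable[[str, str], Review],
    max_rounds: int = 2,
) -> str:
    """One model drafts, a second reviews; redraft until approved or out of rounds.

    The revision loop is an assumption; the source only says the review
    happens before the response is delivered.
    """
    issues: List[str] = []
    draft = draft_fn(prompt, issues)
    for _ in range(max_rounds):
        review = review_fn(prompt, draft)
        if review.approved:
            break
        # Feed the reviewer's objections back to the drafting model.
        issues = review.issues
        draft = draft_fn(prompt, issues)
    return draft


# Stand-in "models" so the sketch runs without any external service.
def fake_drafter(prompt: str, issues: List[str]) -> str:
    text = f"Report on {prompt}."
    if issues:
        text += " [citations added]"
    return text


def fake_reviewer(prompt: str, draft: str) -> Review:
    if "[citations added]" in draft:
        return Review(approved=True, issues=[])
    return Review(approved=False, issues=["missing citations"])


if __name__ == "__main__":
    print(critique_pipeline("DRACO", fake_drafter, fake_reviewer))
```

In this toy run the reviewer rejects the first draft for missing citations, the drafter revises, and the second draft is approved. A real system would replace the two stand-ins with calls to different model providers, which is the part of the design that lets each model do what it is best at.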

Researcher will also come with a tool called ‘Council’ that hands your prompt to several models and lets you see what each one says and where they agree or disagree. As such, Microsoft’s plan seems to be less about relying on Copilot to do all the heavy lifting and more about tapping into the strengths of different AI companies to create a service that can handle every step of the workflow.