Imagine telling your computer to look up vacation rentals, compare five sites, fill out the booking form, and confirm the one closest to the beach. You go make coffee. It’s done when you get back. That is the promise of "computer use agents"—AI that reads your browser screen and clicks, scrolls, and types exactly as a human would, with no special plugins required.
Microsoft's Fara1.5 family comes in three sizes: 4 billion, 9 billion, and 27 billion parameters. All three are built on Qwen3.5, an Alibaba base model that Microsoft fine-tuned for browser work, and all weights are publicly released. (Parameters are the learned values inside an AI model; more of them generally means higher capacity.)
The benchmarks
Online-Mind2Web is the benchmark that matters most for the task Microsoft wanted to excel at. It measures how often an AI agent correctly completes 300 diverse, real-world tasks across 136 popular live websites (things like comparing products, filling out forms, and booking services), scored as the percentage of tasks finished correctly on the actual, changing internet.
Fara1.5-27B scored 72%. OpenAI Operator scored 58.3%. Google's Gemini 2.5 Computer Use scored 57.3%. Yutori's Navigator n1, the top proprietary alternative, reached 64.7%. Even Fara1.5-9B, the mid-sized model, hit 63.4%—ahead of both OpenAI and Google.

Open-source rivals also fell short. Alibaba's GUI-Owl-1.5 at 8 billion parameters scored 48.6%. AI2's MolmoWeb scored 35.3%. Microsoft's own previous model, Fara-7B, scored 34.1%, which means the similarly sized Fara1.5-9B nearly doubles its predecessor's score.
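To make those gaps concrete, here is a minimal Python sketch that recomputes the comparisons from the Online-Mind2Web scores quoted above. The dictionary and function names are illustrative only; the numbers are the ones reported in this article, not an independent measurement.

```python
# Online-Mind2Web scores (percent of tasks completed), as quoted in this article.
scores = {
    "Fara1.5-27B": 72.0,
    "Navigator n1": 64.7,
    "Fara1.5-9B": 63.4,
    "OpenAI Operator": 58.3,
    "Gemini 2.5 Computer Use": 57.3,
    "GUI-Owl-1.5 (8B)": 48.6,
    "MolmoWeb": 35.3,
    "Fara-7B": 34.1,
}


def ratio(model: str, baseline: str) -> float:
    """How many times higher `model` scores than `baseline`."""
    return scores[model] / scores[baseline]


def lead(model: str, rival: str) -> float:
    """Gap between two models in percentage points, rounded to one decimal."""
    return round(scores[model] - scores[rival], 1)


# Models ordered from highest to lowest score.
ranking = sorted(scores, key=scores.get, reverse=True)

if __name__ == "__main__":
    print(f"Fara1.5-9B vs Fara-7B: {ratio('Fara1.5-9B', 'Fara-7B'):.2f}x")
    print(f"Fara1.5-27B vs Fara-7B: {ratio('Fara1.5-27B', 'Fara-7B'):.2f}x")
    print(f"Fara1.5-27B lead over Navigator n1: {lead('Fara1.5-27B', 'Navigator n1')} points")
```

The ratio for the 9B model is what backs the "nearly double" comparison, while the 27B model comes out at slightly more than double Fara-7B's score.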
On WebVoyager, a second benchmark that measures task success on the live web and is scored the same way, Fara1.5-27B hit 88.6%, edging out OpenAI Operator's 87.0% and beating H Company's 30-billion-parameter Holo2 at 83.0%.
How it learned
The secret sauce is the training pipeline. Microsoft used a system called FaraGen1.5 to generate the training data. Here's the clever part: they used GPT-5.4, OpenAI's model, as a "teacher agent" to demonstrate how to complete browser tasks. Those demonstrations became the training data for Fara1.5. In effect, Microsoft used OpenAI's most capable model to train a rival open-source one.
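For readers who want a feel for what "teacher demonstrations become training data" means in practice, here is a toy Python sketch. It is not FaraGen1.5's actual design, which this article does not detail. Every name in it (`collect_demo`, `toy_teacher`, `toy_env_step`, the page names) is hypothetical, the "teacher" is a hard-coded script standing in for a large model, and keeping only successful runs is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    observation: str  # what the agent "saw" (stand-in for a screenshot)
    action: str       # what the teacher did in response


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)
    success: bool = False


def collect_demo(task: str,
                 teacher: Callable[[str, str], str],
                 env_step: Callable[[str, str], str],
                 start_obs: str,
                 goal_obs: str,
                 max_steps: int = 10) -> Trajectory:
    """Let the teacher act until the goal page is reached or steps run out."""
    traj = Trajectory(task)
    obs = start_obs
    for _ in range(max_steps):
        action = teacher(task, obs)
        traj.steps.append(Step(obs, action))
        obs = env_step(obs, action)
        if obs == goal_obs:
            traj.success = True
            break
    return traj


def to_training_pairs(trajs: List[Trajectory]) -> List[Tuple[str, str]]:
    """Turn successful demos into (input, target) pairs for fine-tuning.
    Dropping failed runs is an assumption of this sketch."""
    return [(f"{t.task}\n{s.observation}", s.action)
            for t in trajs if t.success for s in t.steps]


# Hypothetical scripted teacher and a three-page toy "website".
SCRIPT = {
    "search page": "type 'beach rental'",
    "results page": "click first result",
    "listing page": "click 'Book'",
}
TRANSITIONS = {
    ("search page", "type 'beach rental'"): "results page",
    ("results page", "click first result"): "listing page",
    ("listing page", "click 'Book'"): "confirmation page",
}


def toy_teacher(task: str, obs: str) -> str:
    return SCRIPT.get(obs, "stop")


def toy_env_step(obs: str, action: str) -> str:
    # Unknown actions leave the page unchanged.
    return TRANSITIONS.get((obs, action), obs)


if __name__ == "__main__":
    demo = collect_demo("Book a beach rental", toy_teacher, toy_env_step,
                        "search page", "confirmation page")
    for prompt, target in to_training_pairs([demo]):
        print(repr(prompt), "->", repr(target))
```

The point of the sketch is the shape of the data: each step the teacher takes yields one example pairing what was on screen with what to do next, which is what a smaller student model can then be trained to imitate.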
They also created six fake, fully functional replicas of real websites—email clients, calendars, marketplaces—so the model could practice tasks that require logins or irreversible actions (like actually sending an email or booking a flight) without touching real accounts. That's called synthetic domain training, and it's a significant part of why Fara1.5 handles "gated" tasks better than its predecessors.
Fara1.5 runs everything through MagenticLite, a sandboxed browser environment that logs every action and lets users halt the agent at any point.
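The "log every action, stop at any time" idea is simple to illustrate. The Python sketch below is not MagenticLite's API, which this article does not describe. `HaltableRunner` and its methods are hypothetical names, and the example covers only logging and halting, not the sandboxing itself.

```python
import threading
from typing import Callable, Iterable, List


class HaltableRunner:
    """Runs agent actions one at a time, logging each and honoring a halt flag."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self._halt = threading.Event()  # safe to set from another thread

    def halt(self) -> None:
        """Ask the runner to stop before the next action."""
        self._halt.set()

    def run(self, actions: Iterable[str], execute: Callable[[str], None]) -> int:
        """Execute actions in order; return how many actually ran."""
        done = 0
        for action in actions:
            if self._halt.is_set():
                self.log.append("HALTED by user")
                break
            self.log.append(f"EXEC {action}")  # log before acting
            execute(action)
            done += 1
        return done


if __name__ == "__main__":
    runner = HaltableRunner()

    def execute(action: str) -> None:
        # Simulate the user pressing "stop" right after the Book click.
        if action == "click 'Book'":
            runner.halt()

    ran = runner.run(["open site", "click 'Book'", "confirm payment"], execute)
    print(ran, runner.log)
```

Checking the halt flag before each action, rather than after, is the design choice that matters here: once the user says stop, no further step (such as confirming a payment) gets executed.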