Artificial Analysis has released version 2.0 of its AA-WER speech-to-text benchmark, which measures the accuracy of speech recognition models. In the overall ranking, ElevenLabs’ Scribe v2 takes first place with a word error rate of just 2.3%.
Second and third place go to Google’s Gemini 3 Pro at 2.9% and Mistral’s Voxtral Small at 3.0%, respectively. Other strong performers include Google Gemini 3 Flash at 3.1% and ElevenLabs Scribe v1 at 3.2%. In the middle of the pack are models such as OpenAI’s GPT-4o Transcribe at 4.0% and Whisper Large v3 at 4.2%. Toward the lower end of the ranking are Alibaba’s Qwen3 ASR Flash at 5.9%, Amazon Nova 2 Omni at 6.0%, and Rev AI at 6.1%.
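For context, word error rate is the number of word-level substitutions, insertions, and deletions needed to turn a model's transcript into the reference transcript, divided by the number of words in the reference. Artificial Analysis does not spell out its exact text normalization here, so the following Python sketch only illustrates the standard definition; the function name and toy sentences are illustrative, not part of the benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()

    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution or match
            )

    return dp[len(ref)][len(hyp)] / len(ref)


# One wrong word out of five reference words gives a WER of 0.2 (20%).
print(word_error_rate("the quick brown fox jumps", "the quick brown box jumps"))
```

By this definition, Scribe v2's 2.3% result corresponds to roughly two to three word errors per hundred reference words.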
ElevenLabs Scribe v2 leads the overall AA-WER v2.0 benchmark ranking with the lowest word error rate, followed by Google Gemini 3 Pro and Mistral Voxtral Small. | Image: Artificial Analysis
In AA-AgentTalk, a separate benchmark focused specifically on speech directed at voice assistants, the overall picture remains largely the same. Scribe v2 again leads with a word error rate of 1.6%, followed closely by Gemini 3 Pro at 1.7%. AssemblyAI’s Universal-3 Pro ranks third with 2.3%.
In the AA-AgentTalk test for speech on voice assistants, Scribe v2 from ElevenLabs and Gemini 3 Pro from Google also dominate with the lowest error rates. | Image: Artificial Analysis