Gemini in Google Sheets scores 70.48% on public spreadsheet benchmark

What Happened

Google announced on March 10, 2026 that Gemini inside Google Sheets has achieved a 70.48% success rate on SpreadsheetBench, a publicly available benchmark that tests an AI model's ability to autonomously perform complex, real-world spreadsheet tasks — including building formulas, restructuring data, and executing multi-step sheet designs. According to Google, no competing model has matched this score on the same benchmark.

The feature lets users type a plain-language description of what they want, for example "create a budget tracker that flags overspending by category", and Gemini handles the execution: constructing formulas, pulling in data, and setting up the sheet structure. According to Google's announcement, advanced optimization problems that previously required deep spreadsheet expertise, manual formula construction, or third-party add-ons can now be addressed through a conversational prompt.
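
To make the budget-tracker example concrete, here is a minimal Python sketch of the logic such a sheet would need to encode: sum spending per category and compare it with a budget, roughly what a per-category SUMIF plus a comparison would compute in a spreadsheet. It is not Gemini's output or Google's implementation; the function name `flag_overspending`, the categories, and all amounts are hypothetical.

```python
def flag_overspending(budgets, expenses):
    """Total the expenses per category and flag categories over budget.

    budgets:  dict mapping category name -> budgeted amount
    expenses: iterable of (category, amount) pairs
    """
    # Start every budgeted category at zero so unspent ones still appear.
    totals = {category: 0.0 for category in budgets}
    for category, amount in expenses:
        totals[category] = totals.get(category, 0.0) + amount

    # A category with no budget entry is treated as having a budget of 0.
    return {
        category: {
            "budget": budgets.get(category, 0.0),
            "spent": spent,
            "over": spent > budgets.get(category, 0.0),
        }
        for category, spent in totals.items()
    }


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    budgets = {"Groceries": 400.0, "Dining": 150.0, "Transport": 100.0}
    expenses = [
        ("Groceries", 220.5),
        ("Dining", 95.0),
        ("Dining", 80.0),
        ("Transport", 60.0),
        ("Groceries", 150.0),
    ]
    for category, row in flag_overspending(budgets, expenses).items():
        status = "OVER" if row["over"] else "ok"
        print(f"{category}: spent {row['spent']:.2f} of {row['budget']:.2f} ({status})")
```

With this sample data only Dining is flagged, since its two entries total 175.00 against a budget of 150.00.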

The capability is powered by a combination of Google DeepMind research and Google Research OR-Tools, the latter being a well-established open-source library for combinatorial optimization. The feature is currently in beta and is available to subscribers on Google's AI Ultra and AI Pro plans.

Why It Matters

Spreadsheets are among the most widely deployed data tools in the world. They sit at the center of financial planning, inventory management, project tracking, and data analysis in organizations of every size, yet the gap between what spreadsheets can do and what most users know how to build has always been substantial. Complex formulas, nested logic, and optimization models have historically required dedicated training or a specialist's help.

A 70.48% score on SpreadsheetBench is a concrete, testable number tied to a public benchmark rather than an internal evaluation. That distinction matters: public benchmarks can be independently reproduced and challenged, which makes the claim verifiable rather than purely promotional. "State-of-the-art" is Google's framing; the percentage is the fact to watch.

The underlying use of OR-Tools is worth noting. OR-Tools is a proven solver for constraint satisfaction and linear optimization problems — the kind of problems that arise when you ask a spreadsheet to allocate resources, minimize costs, or optimize a schedule. Combining a large language model with a dedicated optimization engine addresses a known weakness of general-purpose AI models on structured, mathematically precise tasks.
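
The kind of resource-allocation problem described here can be illustrated with a toy example. The sketch below uses plain-Python exhaustive search as a stand-in for a real solver; it does not use the OR-Tools API and says nothing about how Gemini calls it. The products, profits, and capacities are all hypothetical.

```python
from itertools import product

# Hypothetical toy problem: decide how many units of two products to make.
PROFIT = {"A": 30, "B": 50}  # profit per unit
USAGE = {  # resource consumed per unit of each product
    "machine_hours": {"A": 1, "B": 2},
    "labor_hours": {"A": 3, "B": 1},
}
CAPACITY = {"machine_hours": 14, "labor_hours": 18}


def is_feasible(plan):
    """Return True if the plan stays within every resource capacity."""
    return all(
        sum(USAGE[resource][item] * qty for item, qty in plan.items()) <= cap
        for resource, cap in CAPACITY.items()
    )


def best_plan(max_units=20):
    """Try every integer plan up to max_units per product; keep the most profitable feasible one."""
    best, best_profit = None, -1
    for a, b in product(range(max_units + 1), repeat=2):
        plan = {"A": a, "B": b}
        if not is_feasible(plan):
            continue
        profit = sum(PROFIT[item] * qty for item, qty in plan.items())
        if profit > best_profit:
            best, best_profit = plan, profit
    return best, best_profit


if __name__ == "__main__":
    plan, profit = best_plan()
    print(plan, profit)  # {'A': 4, 'B': 5} 370
```

Exhaustive search is only workable at this size: the number of candidate plans grows exponentially with the number of decision variables. A dedicated solver is built to handle that growth, while a language model's role is to translate the user's request into the variables, constraints, and objective.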

For businesses and individuals, the near-term implication is practical: tasks that once required a spreadsheet-proficient colleague or hours of manual work may now take a single prompt. Whether Gemini's performance holds up across the full diversity of real-world spreadsheet problems, especially proprietary or domain-specific ones, remains to be tested beyond the benchmark. Adoption will also depend on whether users can trust Gemini's output without auditing every formula it generates.

If benchmark performance translates consistently to everyday use, AI-assisted spreadsheets could meaningfully lower the floor for data analysis work across non-technical teams.
