
Gemini in Google Sheets scores 70.48% on public spreadsheet benchmark

Google says Gemini in Google Sheets scored 70.48% on the public SpreadsheetBench benchmark, a result the company says no competing model has matched. Users describe what they need in plain language, and Gemini builds the formulas, gathers the data, and designs multi-step sheets end to end. For the hundreds of millions of people who use spreadsheets daily, this could mark a meaningful shift in what non-technical users can accomplish without manual formula work.

VERIFIED. Confidence: 80%

What Happened

Google announced on March 10, 2026 that Gemini inside Google Sheets has achieved a 70.48% success rate on SpreadsheetBench, a publicly available benchmark that tests an AI model's ability to autonomously perform complex, real-world spreadsheet tasks — including building formulas, restructuring data, and executing multi-step sheet designs. According to Google, no competing model has matched this score on the same benchmark.

The feature lets users type a plain-language description of what they want, for example "create a budget tracker that flags overspending by category," and Gemini handles the execution: constructing formulas, pulling in data, and setting up the sheet structure. According to Google's announcement, advanced optimization problems that previously required deep spreadsheet expertise, manual formula construction, or third-party add-ons can now be addressed through a conversational prompt.
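To make the budget-tracker example concrete, here is a minimal Python sketch of the logic such a sheet would need to encode: total the spending in each category and flag any category whose total exceeds its budget. The category names, the amounts, and the `flag_overspending` helper are hypothetical illustrations, not Gemini's output or anything from Google's announcement.

```python
from collections import defaultdict


def flag_overspending(transactions, budgets):
    """Return {category: (spent, budget, over_budget)} for each budgeted category."""
    spent = defaultdict(float)
    for category, amount in transactions:
        spent[category] += amount
    return {
        category: (spent[category], limit, spent[category] > limit)
        for category, limit in budgets.items()
    }


# Hypothetical sample data, invented for illustration only.
budgets = {"Groceries": 400.0, "Dining": 150.0, "Transport": 100.0}
transactions = [
    ("Groceries", 220.0),
    ("Dining", 90.0),
    ("Groceries", 150.0),
    ("Dining", 75.0),
    ("Transport", 60.0),
]

report = flag_overspending(transactions, budgets)
for category, (total, limit, over) in report.items():
    status = "OVER" if over else "ok"
    print(f"{category:<10} spent {total:>7.2f} of {limit:>7.2f}  {status}")
```

In a spreadsheet, the same logic would typically be a per-category conditional sum compared against a budget cell, for instance with functions such as SUMIF and IF. The formulas Gemini actually generates may differ.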

The capability is powered by a combination of Google DeepMind research and Google Research OR-Tools, the latter being a well-established open-source library for combinatorial optimization. The feature is currently in beta and is available to subscribers on Google's AI Ultra and AI Pro plans.

Why It Matters

Spreadsheets are among the most widely deployed data tools in the world. They sit at the center of financial planning, inventory management, project tracking, and data analysis in organizations of every size, yet the gap between what spreadsheets can do and what most users know how to build has always been substantial. Complex formulas, nested logic, and optimization models have historically required either dedicated training or a specialist to write them.


A 70.48% score on SpreadsheetBench is a concrete, testable number tied to a public benchmark rather than an internal evaluation. That distinction matters: public benchmarks can be independently reproduced and challenged, which makes the claim verifiable rather than purely promotional. "State-of-the-art" is Google's framing; the percentage is the fact to watch.

The underlying use of OR-Tools is worth noting. OR-Tools is a proven solver for constraint satisfaction and linear optimization problems — the kind of problems that arise when you ask a spreadsheet to allocate resources, minimize costs, or optimize a schedule. Combining a large language model with a dedicated optimization engine addresses a known weakness of general-purpose AI models on structured, mathematically precise tasks.
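As a rough illustration of the kind of allocation problem described above, the sketch below solves a tiny assignment problem (one task per worker, minimum total cost) by exhaustive search in plain Python. This is deliberately not OR-Tools and says nothing about how Gemini works internally. The cost matrix and the `cheapest_assignment` helper are hypothetical, chosen only to show why an exact answer needs search rather than text prediction.

```python
from itertools import permutations

# Hypothetical cost matrix: cost[w][t] is the cost of worker w doing task t.
cost = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]


def cheapest_assignment(cost):
    """Try every one-to-one worker/task assignment and keep the cheapest."""
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        # perm[w] is the task given to worker w.
        total = sum(cost[w][perm[w]] for w in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm


total, assignment = cheapest_assignment(cost)
print(f"minimum cost: {total}")
for worker, task in enumerate(assignment):
    print(f"worker {worker} -> task {task}")
```

Exhaustive search checks n! assignments, so it stops being practical quickly: 10 workers already means 3,628,800 candidates. That scaling problem is the gap dedicated solvers such as OR-Tools are built to fill.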

For businesses and individuals, the near-term implication is practical: tasks that required a spreadsheet-proficient colleague or hours of manual work may now take a single prompt. Whether Gemini's performance holds up across the full diversity of real-world spreadsheet problems, especially proprietary or domain-specific ones, remains to be tested beyond the benchmark. Adoption will also depend on whether users can trust Gemini's output without auditing every formula it generates.

If benchmark performance translates consistently to everyday use, AI-assisted spreadsheets could meaningfully lower the floor for data analysis work across non-technical teams.
