2026-04-02 · Google ADK · Gemini AI · AI automation · AI agents · Google Gemini · developer tools · vibe coding · Claude Code

Google ADK: Gemini Hits 96% Accuracy, 90% Cheaper

Google's ADK Skills toolkit boosted Gemini coding accuracy from 28% to 96% while cutting AI automation token costs by 90%. Now live in Python, Go & Java.


Just weeks before Google I/O 2026 (May 19–20), Google dropped a toolkit update that fundamentally changed what its AI agents can do. The Agent Development Kit (ADK) — Google's open framework for building multi-step AI automation workflows — now ships with a feature called Skills, and the results are striking: Gemini 3.1 Pro's success rate on developer coding tasks leapt from 28.2% to 96.6%, while token costs (the fee you pay per chunk of text processed by an AI model) dropped by up to 90%.

If you've ever watched an AI assistant confidently give the wrong answer — or winced at a cloud bill after running an automation pipeline — this update directly addresses both problems at once.

[Image: Google ADK Skills architecture for AI automation — progressive disclosure design pattern]

The AI Token Cost Problem in Automation

Traditional AI agents use "monolithic prompts" (one massive instruction block that tries to pre-load everything the model might need). Think of it as handing a new employee a 600-page manual before asking them a single yes-or-no question. Most of that context is useless for the specific task — and in AI, you pay for every token (word-like unit of text) whether it helps or not.

ADK Skills solve this with a progressive disclosure architecture (a design pattern where relevant information is loaded only when the agent actually needs it, rather than front-loaded all at once). The agent starts lean, then pulls in domain expertise on demand. The result: up to 90% fewer tokens consumed per task run.

There are four distinct skill patterns available to developers:

  • Inline checklists — step-by-step guidance embedded directly in the prompt flow at runtime
  • Structured skill references — reusable knowledge modules pulled from a library only when relevant
  • Skill queries — the agent requests specific information the moment it encounters a knowledge gap
  • Skill factories — advanced agents that write their own code to handle novel, unexpected problems
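The core idea behind these patterns can be shown with a small, illustrative sketch in Go. This is not the ADK API; the `skill` type, trigger-matching scheme, and function names are invented for illustration. The point is simply that the prompt stays lean and a skill's reference text is appended only when the task actually matches it:

```go
// Illustrative sketch of progressive disclosure (not the ADK API):
// the agent starts from a small base prompt and pulls in a skill's
// reference text only when the task matches that skill's trigger.
package main

import (
	"fmt"
	"strings"
)

// skill pairs a trigger keyword with reference text loaded on demand.
type skill struct {
	trigger string
	body    string
}

// loadedSkills returns only the skills whose trigger appears in the task.
func loadedSkills(task string, skills []skill) []skill {
	var out []skill
	for _, s := range skills {
		if strings.Contains(task, s.trigger) {
			out = append(out, s)
		}
	}
	return out
}

// buildPrompt assembles the lean base prompt plus only the relevant skills.
func buildPrompt(base, task string, skills []skill) string {
	prompt := base
	for _, s := range loadedSkills(task, skills) {
		prompt += "\n" + s.body
	}
	return prompt + "\nTask: " + task
}

func main() {
	skills := []skill{
		{"JWT", "JWT skill: sign with RS256, verify the exp claim."},
		{"SQL", "SQL skill: parameterize queries, never concatenate input."},
	}
	// Only the JWT skill is loaded; the SQL skill costs zero tokens here.
	fmt.Println(buildPrompt("You are a coding agent.", "Refactor auth handler to use JWT", skills))
}
```

Every skill that stays unloaded is context the model never sees and tokens you never pay for, which is where the up-to-90% reduction comes from.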

Gemini AI Accuracy: From 28% to 96% with ADK Skills

Before Skills, Gemini 3.1 Pro (Google's latest flagship reasoning model) completed developer coding tasks correctly just 28.2% of the time in controlled benchmarks. That's worse than a coin flip. After equipping the same model with a developer Skill — live SDK documentation and code guidance served on demand — the success rate hit 96.6%.

That's roughly a 3.4x improvement from the identical underlying model, with zero changes to the model weights (the internal numerical values that determine how an AI reasons and responds). The model didn't get smarter overnight. It got dramatically better context delivery.

[Image: Gemini AI accuracy with Google ADK Skills: 28.2% baseline vs 96.6% — coding task benchmark]

Google's engineering team put it directly: "Strong reasoning capabilities and access to a source of truth can effectively eliminate outdated coding patterns." Skills provide that source of truth — ensuring agents act on current, accurate documentation rather than hallucinated (made-up) or stale information baked into training data months ago.

What 90% Token Reduction Means for Your AI Automation Budget

Token costs are the per-use pricing unit for AI inference (the computational process of running a prompt through a model to generate a response). If a standard agent run costs $1.00 in tokens today, the same task with ADK Skill optimization could cost $0.10. At enterprise scale — 10,000 workflow executions per day — that's $9,000 saved daily, or roughly $3.3 million per year from a single automation pipeline.
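The arithmetic is easy to check with a back-of-the-envelope calculation (runs per day × baseline cost per run × fractional reduction); the figures below use the $1.00 baseline and 90% reduction from this article:

```go
// Back-of-the-envelope savings check: a $1.00 task reduced by 90%
// costs $0.10, so each run saves $0.90.
package main

import "fmt"

// dailySavings returns dollars saved per day given runs per day,
// the baseline cost per run, and the fractional token reduction.
func dailySavings(runsPerDay int, baselineUSD, reduction float64) float64 {
	return float64(runsPerDay) * baselineUSD * reduction
}

func main() {
	perDay := dailySavings(10000, 1.00, 0.90)
	fmt.Printf("daily: $%.0f, yearly: $%.0f\n", perDay, perDay*365)
	// daily: $9000, yearly: $3285000
}
```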

Google ADK Go 1.0 and Java 1.0.0: Production-Ready AI Frameworks

Alongside the Skills announcement, Google shipped two major framework releases that bring ADK to the two languages powering most enterprise backends.

ADK Go 1.0 targets Go (a programming language used at companies like Uber, Cloudflare, and Google itself for high-throughput backend services). Key features include:

  • Native OpenTelemetry integration — OpenTelemetry (an industry standard for tracking software performance in production) lets teams pinpoint exactly where agents succeed, fail, or slow down
  • YAML-based configurations — YAML (a human-readable settings file format) enables rapid iteration without recompiling code
  • Self-healing plugin system — agents detect errors and automatically recover using configurable fallback logic
  • Human-in-the-Loop confirmations — mandatory checkpoints where a human must approve before the agent performs any sensitive or irreversible action
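The Human-in-the-Loop idea in the last bullet reduces to a simple gating pattern. Here is a minimal sketch in plain Go, not ADK's actual API; the `action` type, `runAction` helper, and approval callback are invented for illustration:

```go
// Sketch of a human-in-the-loop checkpoint (not the ADK API):
// sensitive actions are gated behind an approval callback and
// never execute unless a human says yes.
package main

import "fmt"

type action struct {
	name      string
	sensitive bool
}

// runAction executes the action only if it is non-sensitive or the
// approval callback returns true; otherwise it is blocked.
func runAction(a action, approve func(action) bool) (string, bool) {
	if a.sensitive && !approve(a) {
		return "", false // blocked at the checkpoint
	}
	return "executed " + a.name, true
}

func main() {
	deny := func(a action) bool {
		fmt.Printf("approve %q? -> no\n", a.name)
		return false
	}
	out, ok := runAction(action{"drop production table", true}, deny)
	fmt.Println(out, ok) // ok == false: the sensitive action was blocked
}
```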

ADK for Java 1.0.0 targets enterprise teams on the JVM (Java Virtual Machine — the software runtime powering most large-scale enterprise applications, from banking systems to logistics platforms). It ships with:

  • Google Maps grounding — agents verify answers against real-world location data
  • Built-in URL fetching — agents browse live web pages during task execution, no custom scraper required
  • Firestore and Vertex AI integrations — native connections to Google's database and ML platform (machine learning infrastructure) for persistent memory and model serving
  • Agent2Agent (A2A) protocol — standardized messaging so Java agents collaborate seamlessly with agents written in Python, Go, or any supported language

Both frameworks support the 6 agent communication protocols now standardized in ADK: MCP (Model Context Protocol, for connecting agents to external tools), A2A (Agent-to-Agent messaging), UCP, AP2, A2UI, and AG-UI. Here's what a minimal Go 1.0 agent looks like:

// ADK Go 1.0 — agent with skill loading and human-in-the-loop approval
package main

import (
	"context"
	"log"
	adk "github.com/google/adk-go"
)

func main() {
	agent := adk.NewAgent(
		adk.WithModel("gemini-3.1-pro"),
		adk.WithSkill("developer-docs-v2"),
		adk.WithHumanApproval(adk.SensitiveActions),
	)
	result, err := agent.Run(context.Background(), "Refactor auth handler to use JWT")
	if err != nil { log.Fatal(err) }
	log.Println(result)
}

Pocket AI: FunctionGemma Now Runs Offline on Your Phone

Running in parallel to the server-side releases, Google shipped FunctionGemma — a 270-million-parameter model (a compact AI with 270 million internal numerical settings, roughly 100x smaller than frontier models like Gemini Ultra) purpose-built for on-device function calling on Android and iOS.

FunctionGemma runs through LiteRT — Google's successor to TFLite (TensorFlow Lite, the previous standard for running AI on mobile hardware). Where TFLite required separate acceleration backends for GPU (graphics processing unit — the chip handling parallel math operations) and NPU (neural processing unit — a chip specialized for AI calculations), LiteRT unifies both and adds PyTorch and JAX (Google's high-performance ML research framework) support.
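Function calling itself is simple at the application layer: the model emits a structured call (typically JSON), and the app routes it to a local handler with no network round trip. The sketch below is illustrative only, not the LiteRT or FunctionGemma API; the `functionCall` shape and `dispatch` helper are invented for this example:

```go
// Illustrative on-device function-calling dispatch (not the LiteRT API):
// the model emits a JSON call, and the app routes it to a local handler.
package main

import (
	"encoding/json"
	"fmt"
)

// functionCall is a hypothetical shape for a model-emitted tool call.
type functionCall struct {
	Name string            `json:"name"`
	Args map[string]string `json:"args"`
}

// dispatch parses a raw call and routes it to the matching handler.
func dispatch(raw string, handlers map[string]func(map[string]string) string) (string, error) {
	var call functionCall
	if err := json.Unmarshal([]byte(raw), &call); err != nil {
		return "", err
	}
	h, ok := handlers[call.Name]
	if !ok {
		return "", fmt.Errorf("unknown function %q", call.Name)
	}
	return h(call.Args), nil
}

func main() {
	handlers := map[string]func(map[string]string) string{
		"set_alarm": func(args map[string]string) string {
			return "alarm set for " + args["time"]
		},
	}
	// In practice this JSON would come from the on-device model's output.
	out, _ := dispatch(`{"name":"set_alarm","args":{"time":"07:30"}}`, handlers)
	fmt.Println(out) // alarm set for 07:30
}
```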

For developers, this means shipping AI agents that work fully offline — no cloud subscription, no network latency, no token bills per call. For end users, it means AI-powered features that work on a plane, in a tunnel, or anywhere with poor connectivity. FunctionGemma is available now in the Google AI Edge Gallery app on Android and iOS.

Start with Google ADK for AI Automation: Practical Starting Points

The ADK ecosystem now integrates with GitHub, Notion, and Hugging Face (the open-source AI model hub where thousands of public models are hosted), making it practical to connect existing workflows without rewriting your infrastructure. Here's where each type of user can start:

  • Java/Go enterprise developers: Pull ADK Java 1.0.0 via Maven or Gradle, or ADK Go 1.0 via go get github.com/google/adk-go
  • Python developers: The Python ADK has been available since early 2025 — Skills are now backported to the existing Python runtime
  • CLI users: A new Plan Mode in Gemini CLI lets you analyze any codebase in read-only mode before committing to changes — a planning-first workflow familiar to users of Claude Code and other vibe coding tools
  • VS Code / IntelliJ users: Finish Changes (an AI co-pilot that completes multi-file edits end-to-end) and Outlines (interactive English summaries layered into source code for navigation) are live in Gemini Code Assist
  • ML engineers training large models: Continuous checkpointing in Orbax/MaxText (Google's framework for training billion-parameter models) now saves training state asynchronously — protecting months of compute against hardware failure without throttling throughput
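The asynchronous checkpointing mentioned in the last bullet boils down to one pattern: hand the snapshot to a background worker and keep training instead of blocking on storage. A minimal sketch in Go, not the Orbax/MaxText API; the `asyncSaver` type and its methods are invented for illustration:

```go
// Sketch of asynchronous checkpointing (not Orbax): the training loop
// enqueues a snapshot to a background goroutine and keeps stepping
// rather than blocking on a slow save.
package main

import (
	"fmt"
	"sync"
)

// asyncSaver persists snapshots in the background.
type asyncSaver struct {
	ch    chan []byte
	wg    sync.WaitGroup
	saved int
}

func newAsyncSaver() *asyncSaver {
	s := &asyncSaver{ch: make(chan []byte, 4)}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		for snap := range s.ch {
			_ = snap // real code would write snap to durable storage here
			s.saved++
		}
	}()
	return s
}

// Save enqueues a snapshot and returns immediately.
func (s *asyncSaver) Save(snapshot []byte) { s.ch <- snapshot }

// Close flushes pending saves and reports how many were persisted.
func (s *asyncSaver) Close() int {
	close(s.ch)
	s.wg.Wait()
	return s.saved
}

func main() {
	saver := newAsyncSaver()
	for step := 0; step < 3; step++ {
		// ... run a training step, then checkpoint without stalling:
		saver.Save([]byte(fmt.Sprintf("state-at-step-%d", step)))
	}
	fmt.Println("checkpoints persisted:", saver.Close()) // checkpoints persisted: 3
}
```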

If you're evaluating whether Google's ADK can replace your current agent stack, the jump from 28.2% to 96.6% success rate on coding tasks gives you a concrete benchmark to test against. You can explore beginner-friendly AI automation guides before diving into the full framework documentation — and watch Google I/O 2026 on May 19–20 for the next wave of ADK capabilities.

