Google for Startups AI Agents Challenge · 2026

The agent inside the product,
not the product inside the agent.

Gobi is a simulated-first practice trading app. Its AI coach doesn't live in a chat window; it lives on the chart, in the order flow, and in the market scanner. The coach is powered by Gemini, grounded by Google Search, and guarded by Model Armor, and it can never place an order: every action ends at human review.


Try it two ways: install the full Gobi app from Google Play internal testing, or run the same open agent SDK right here in a neutral web skin, with every capability and no install.

The agent demo: coach conversation, live chart with AI-drawn annotations, and the market scanner

Why this is different

The AI draws the chart

We kept the candlestick chart deliberately simple — no intimidating pro-tool clutter. When you ask “is BTC going up?”, the agent answers in short story beats and draws each idea on the chart itself: support, trend, momentum. Fifteen technical methods are computed deterministically in the SDK's engine, so the math is never hallucinated. Gemini narrates; the engine draws.
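To make "Gemini narrates; the engine draws" concrete, here is a minimal sketch of one deterministic method, assuming a simple moving average is among the fifteen. The function name `SMA` and its signature are hypothetical and are not taken from the SDK.

```go
package main

import "fmt"

// SMA is a hypothetical helper, not the SDK's API. It returns the simple
// moving average of closes over the given window. The result has
// len(closes)-window+1 points, or nil when the input is too short or the
// window is not positive.
func SMA(closes []float64, window int) []float64 {
	if window <= 0 || len(closes) < window {
		return nil
	}
	out := make([]float64, 0, len(closes)-window+1)
	sum := 0.0
	for i, c := range closes {
		sum += c
		if i >= window {
			// Drop the close that just left the window.
			sum -= closes[i-window]
		}
		if i >= window-1 {
			out = append(out, sum/float64(window))
		}
	}
	return out
}

func main() {
	closes := []float64{10, 11, 12, 13, 14}
	// The same input always yields the same overlay points.
	fmt.Println(SMA(closes, 3)) // prints [11 12 13]
}
```

Because the overlay is a pure function of the price series, the model only has to describe the line, never compute it.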

It knows what's in play

The question beginners never ask: what should I even look at today? A deterministic scanner sweeps every market for volume surges, funding extremes, and breakouts — and the agent explains why something is hot, citing machine-computed facts and Google Search-grounded headlines.
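A volume-surge check of the kind described above might look like the sketch below. The types, the function name, and the 2x threshold are all assumptions made for illustration; the real scanner's rules are not documented on this page.

```go
package main

import (
	"fmt"
	"sort"
)

// Market and Finding are hypothetical types used only for this sketch.
type Market struct {
	Symbol        string
	RecentVolumes []float64 // volumes of prior periods
	CurrentVolume float64
}

type Finding struct {
	Symbol string
	Ratio  float64 // current volume divided by the recent average
}

// ScanVolumeSurges flags markets whose current volume is at least
// threshold times their recent average, strongest surge first.
func ScanVolumeSurges(markets []Market, threshold float64) []Finding {
	var findings []Finding
	for _, m := range markets {
		if len(m.RecentVolumes) == 0 {
			continue // no baseline to compare against
		}
		sum := 0.0
		for _, v := range m.RecentVolumes {
			sum += v
		}
		avg := sum / float64(len(m.RecentVolumes))
		if avg <= 0 {
			continue
		}
		if ratio := m.CurrentVolume / avg; ratio >= threshold {
			findings = append(findings, Finding{Symbol: m.Symbol, Ratio: ratio})
		}
	}
	sort.Slice(findings, func(i, j int) bool {
		return findings[i].Ratio > findings[j].Ratio
	})
	return findings
}

func main() {
	markets := []Market{
		{Symbol: "BTC", RecentVolumes: []float64{100, 100, 100}, CurrentVolume: 350},
		{Symbol: "ETH", RecentVolumes: []float64{200, 200}, CurrentVolume: 220},
		{Symbol: "SOL", RecentVolumes: []float64{50, 50}, CurrentVolume: 100},
	}
	for _, f := range ScanVolumeSurges(markets, 2.0) {
		fmt.Printf("%s %.1fx\n", f.Symbol, f.Ratio)
	}
}
```

The ratio in each finding is the sort of machine-computed fact the agent can cite when it explains why a market is hot.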

It cannot pull the trigger

There is no submit, place, or confirm tool in the entire catalog. Order drafts are clamped for size and leverage, require a stop-loss, and always terminate at a review surface — open_order_review → pause_for_user. The human decides. A red-team eval suite re-proves this on every go test.
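The clamping step can be sketched roughly as follows. The struct fields, the limit values, and the function name are hypothetical; the page only states that size and leverage are clamped and that a stop-loss is required.

```go
package main

import (
	"errors"
	"fmt"
)

// OrderDraft and Limits are hypothetical types for this sketch.
type OrderDraft struct {
	Symbol   string
	Size     float64
	Leverage float64
	StopLoss float64 // 0 means "not set"
}

type Limits struct {
	MaxSize     float64
	MaxLeverage float64
}

var ErrMissingStopLoss = errors.New("order draft requires a stop-loss")

// ClampDraft rejects drafts without a stop-loss and caps size and
// leverage at the policy limits. It returns a draft for human review;
// nothing in this sketch submits an order.
func ClampDraft(d OrderDraft, lim Limits) (OrderDraft, error) {
	if d.StopLoss <= 0 {
		return OrderDraft{}, ErrMissingStopLoss
	}
	if d.Size > lim.MaxSize {
		d.Size = lim.MaxSize
	}
	if d.Leverage > lim.MaxLeverage {
		d.Leverage = lim.MaxLeverage
	}
	return d, nil
}

func main() {
	lim := Limits{MaxSize: 1.0, MaxLeverage: 3} // illustrative limits only

	draft := OrderDraft{Symbol: "BTC", Size: 5, Leverage: 20, StopLoss: 58000}
	clamped, err := ClampDraft(draft, lim)
	fmt.Println(clamped.Size, clamped.Leverage, err) // prints 1 3 <nil>

	_, err = ClampDraft(OrderDraft{Symbol: "BTC", Size: 0.1, Leverage: 2}, lim)
	fmt.Println(err) // prints order draft requires a stop-loss
}
```

Clamping in code rather than in the prompt means the limits hold even if the model asks for more.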

Built on Google's agent stack

Planning brain

Gemini 2.5 Flash

Plans every turn over a 27-tool catalog with server-side context pre-read, so routine turns never waste tool calls rediscovering known facts.

Real-world context

Google Search grounding

News lookups and one rationed web search per turn cite real, sourced headlines, never model memory, for “why is this moving?” questions.

Prompt screening

Model Armor

Every inbound message is screened for prompt injection and jailbreaks before the agent plans anything. Blocked turns get a safe next step.

Long-term memory

Vertex AI Memory Bank

Durable cross-session memory of preferences and learning progress, so coaching adapts without storing predictions or signals.

Interoperability

Model Context Protocol

An MCP stdio server exposes the same tool catalog — with the same policy clamps — to Claude, Gemini CLI, or any MCP client.

Infrastructure

Cloud Run

This demo is two Cloud Run services: the Go agent runtime and this Next.js client. Same SDK, deployed on Google Cloud.

Architecture

Four views of the same system — open any diagram full-size.

System overview

One runtime, three transports, a 27-tool catalog, and a human-review terminus.

Anatomy of a turn

Screen, pre-read, route, call tools, clamp risk, validate, trace — then the human gate.

The safety contract

Four independent defense layers; every road to action ends at a human.

Quality flywheel

Golden cases, red teams, a Gemini judge, and live telemetry close the loop.

Quality is a test suite, not a vibe

The agent ships with its own proof: golden capability cases, adversarial red-team prompts, trajectory invariants (every mutating plan must end at review), an opt-in Gemini-as-judge coaching rubric, and per-run telemetry (model, tokens, latency, tool calls) at /api/ops. To run it all yourself, use: go test ./...
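As an illustration of a trajectory invariant, the sketch below checks that any plan containing a mutating step ends at open_order_review followed by pause_for_user. The `Step` type, the `CheckTrajectory` function, and the `draft_order` and `read_chart` tool names are hypothetical; only the two terminal tool names come from this page.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is a hypothetical representation of one planned tool call.
type Step struct {
	Tool     string
	Mutating bool
}

// CheckTrajectory returns an error when a plan contains a mutating step
// but does not end with open_order_review followed by pause_for_user.
// Read-only plans pass unchanged.
func CheckTrajectory(plan []Step) error {
	mutating := false
	for _, s := range plan {
		if s.Mutating {
			mutating = true
			break
		}
	}
	if !mutating {
		return nil
	}
	n := len(plan)
	if n < 2 || plan[n-2].Tool != "open_order_review" || plan[n-1].Tool != "pause_for_user" {
		return errors.New("mutating plan must end at open_order_review -> pause_for_user")
	}
	return nil
}

func main() {
	good := []Step{
		{Tool: "draft_order", Mutating: true}, // hypothetical tool name
		{Tool: "open_order_review"},
		{Tool: "pause_for_user"},
	}
	bad := []Step{
		{Tool: "draft_order", Mutating: true},
	}
	fmt.Println(CheckTrajectory(good)) // prints <nil>
	fmt.Println(CheckTrajectory(bad))
}
```

A check like this can run inside an ordinary Go test over every golden and red-team plan, which is one way the invariant could be re-proved on each run.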