Featured gemini-3-flash

Gemini 3 Flash — Gemini

Gemini 3 Flash is built for speed and high-frequency efficiency. It is the "workhorse" of the Gemini family, designed to provide frontier-level intelligence with the lowest possible latency. Best For: Real-time applications, high-volume automated tasks, and "agentic" workflows where the AI needs to move through many small steps quickly. Key Strength: It delivers "Pro-grade" reasoning (scoring impressively high on PhD-level benchmarks) but is optimized to use fewer tokens and respond nearly instantaneously, making it the most cost-effective choice for scaling.

Provider Gemini

Model type completion

Status Active

Endpoint


                                                        /chat/completions

Pricing

Input tokens0.080800000000 tokens
Output tokens0.076000000000 tokens
Context windowPENDING INFORMATION
ThroughputPENDING INFORMATION

Routing settings

Fallback modelatlas-dialogue-lite
Safety filterEnabled (Tier 2)
Max latency1800 ms
Region lockEU, US

Back to models Read the docs