Featured gemini-2.5-flash

Gemini 2.5 Flash — Gemini

Gemini 2.5 Flash is a high-performance, lightweight multimodal model developed by Google, engineered for speed and cost-efficiency without compromising on intelligence. It features a massive 1 million token context window, allowing it to process and reason across vast datasets, long documents, and extensive codebases. Distinguished as a 'thinking' model, it utilizes adaptive reasoning to modulate its processing power based on task complexity, delivering superior prompt adherence and high-fidelity outputs. With native support for text, image, audio, and video inputs, Gemini 2.5 Flash is optimized for low-latency, real-time applications and complex agentic workflows.

Provider Gemini

Model type completion

Status Active

Endpoint


                                                        /chat/completions

Pricing

Input tokens0.072000000000 tokens
Output tokens0.062000000000 tokens
Context windowPENDING INFORMATION
ThroughputPENDING INFORMATION

Routing settings

Fallback modelatlas-dialogue-lite
Safety filterEnabled (Tier 2)
Max latency1800 ms
Region lockEU, US

Back to models Read the docs