AI language model for text generation and analysis.
Price per 1 million tokens
per 1M tokens you send
per 1M tokens you receive
Input tokens represent the text you send to the model for processing. Output tokens represent the model's generated response. Pricing is set by Google DeepMind and reflects current API rates as of October 2025.
Gemini 2.5 Pro is Google DeepMind’s state-of-the-art “thinking” model for complex reasoning and coding. It supports long-context analysis across text, images, video, audio, and PDFs as input, and produces text output.
Developer: Google DeepMind
Modality: Text (text-only processing)
Context window: 1,048,576 tokens (maximum tokens per request)
Input price: $1.25 per 1M input tokens (prompts ≤200k tokens), $2.50 (>200k)
Output price: $10.00 per 1M output tokens (prompts ≤200k tokens), $15.00 (>200k)
Gemini 2.5 Pro is the flagship multipurpose model from Google DeepMind, tuned for hard reasoning, code, math, and multi-document analysis. Pricing is tiered by prompt size: $1.25 per 1M input tokens and $10.00 per 1M output tokens for prompts ≤200k tokens; $2.50 in and $15.00 out for prompts >200k. Batch API pricing is roughly half of standard ($0.625/$1.25 input and $5.00/$7.50 output at the same thresholds). The model accepts text, images, video, audio, and PDFs as input, returns text, and handles long contexts of up to 1,048,576 input tokens and 65,536 output tokens, enough for codebases, research dossiers, or agent chains without constant truncation.
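To make the tiered pricing concrete, here is a minimal Python sketch of the cost arithmetic described above. The rates are the ones quoted in this section; the function name and the example numbers are illustrative and not part of any official SDK, so check current rates before relying on the output.

```python
def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Rough USD estimate for one Gemini 2.5 Pro request, using the
    tiered rates quoted on this page (illustrative only)."""
    # The tier is chosen by prompt size: <=200k input tokens vs. >200k.
    small_prompt = input_tokens <= 200_000
    in_rate = 1.25 if small_prompt else 2.50     # $ per 1M input tokens
    out_rate = 10.00 if small_prompt else 15.00  # $ per 1M output tokens
    if batch:
        # Batch API pricing is roughly half of standard.
        in_rate, out_rate = in_rate / 2, out_rate / 2
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# Example: a 150k-token prompt with an 8k-token response
# -> 0.15 * $1.25 + 0.008 * $10.00 = $0.1875 + $0.08 ≈ $0.27
print(f"${estimate_cost(150_000, 8_000):.2f}")
```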
In production, use caching to cut costs on repeated system prompts, stream outputs to improve perceived latency, and prefer structured outputs/tool-calling where possible. For cost control, route routine traffic to Flash/Flash-Lite and reserve 2.5 Pro for complex or customer-visible paths.
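The sketch below illustrates two of those practices, streaming and model routing, using the google-genai Python SDK (`pip install google-genai`). The routing heuristic, the `is_complex` flag, and the prompt are assumptions made for illustration, not a prescribed setup; explicit context caching is also available in the SDK but is omitted here to keep the example short.

```python
# Minimal sketch: route routine traffic to Flash and stream responses.
# Assumes a GEMINI_API_KEY environment variable; the routing heuristic
# and model choice per request are illustrative, not official guidance.
import os
from google import genai

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def answer(prompt: str, is_complex: bool = False) -> str:
    # Reserve 2.5 Pro for complex or customer-visible paths;
    # send routine traffic to the cheaper Flash tier.
    model = "gemini-2.5-pro" if is_complex else "gemini-2.5-flash"
    parts = []
    # Streaming improves perceived latency: emit text as it arrives.
    for chunk in client.models.generate_content_stream(model=model, contents=prompt):
        if chunk.text:
            print(chunk.text, end="", flush=True)
            parts.append(chunk.text)
    print()
    return "".join(parts)

if __name__ == "__main__":
    answer("Summarize the tradeoffs between batch and standard API pricing.",
           is_complex=True)
```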
Gemini 2.5 Pro is Google DeepMind’s most advanced reasoning LLM. It supports multimodal inputs (text, images, video, audio, PDFs) with a ~1M-token input limit and 65k-token output limit, excels at coding/STEM tasks, and is priced at $1.25–$2.50 per 1M input tokens and $10.00–$15.00 per 1M output tokens depending on prompt size; batch rates are ~50% of standard. Common uses: code assistants, deep research, long-context planning, data/document analysis, and agent workflows.
Gemini 1.5 Flash - 8B by Google DeepMind is a multimodal AI model for chat, w...
Gemini 1.5 Flash by Google DeepMind is a multimodal AI model for chat, writin...
The bargain ‘lite’ Google Gemini—fast, cheap, and good enough for chatbots or...
Join those who have switched to API.chat and are saving on AI expenses while enjoying a better experience.