Provider: Google DeepMind

Gemini 2.5 Pro Pricing

Multimodal AI model for advanced reasoning, coding, and long-context analysis.

Current Pricing

Price per 1 million tokens

Input Tokens

$1.25 (≤200k tokens), $2.50 (>200k)

per 1M tokens you send

Output Tokens

$10.00 (≤200k), $15.00 (>200k)

per 1M tokens you receive

Input tokens represent the text you send to the model for processing. Output tokens represent the model's generated response. Pricing is set by Google DeepMind and reflects current API rates as of October 2025.
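
To make the tiered rates concrete, the sketch below estimates the cost of a single request in Python. The rates and the 200,000-token prompt threshold come from the table above; the function name and example token counts are illustrative, and actual billing may differ (for example, with caching or batch discounts).

    # Rough per-request cost estimate using the tiered rates listed above.
    # Rates are hard-coded from this page; verify against current Google pricing.
    def gemini_25_pro_request_cost(input_tokens: int, output_tokens: int) -> float:
        long_prompt = input_tokens > 200_000           # the >200k tier applies per prompt
        input_rate = 2.50 if long_prompt else 1.25     # USD per 1M input tokens
        output_rate = 15.00 if long_prompt else 10.00  # USD per 1M output tokens
        return (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * output_rate

    # Example: a 300k-token prompt with a 5k-token answer lands in the >200k tier.
    print(f"${gemini_25_pro_request_cost(300_000, 5_000):.4f}")  # roughly $0.8250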

Quick Cost Calculator

Light Usage

~$1.13

per month

  • 100K input tokens
  • 100K output tokens
  • ~75 pages of text processed

Medium Usage

~$11.25

per month

  • 1M input tokens
  • 1M output tokens
  • ~750 pages of text processed

Heavy Usage

~$112.50

per month

  • 10M input tokens
  • 10M output tokens
  • ~7,500 pages of text processed
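
The figures above can be reproduced with a short script. The sketch below assumes every request stays in the ≤200k-token prompt tier (so the lower rates apply) and uses Decimal for predictable currency rounding; the usage profiles mirror the three tiers listed above.

    # Reproduces the monthly estimates above at the ≤200k-token prompt tier.
    from decimal import Decimal, ROUND_HALF_UP

    INPUT_RATE = Decimal("1.25")    # USD per 1M input tokens
    OUTPUT_RATE = Decimal("10.00")  # USD per 1M output tokens
    MILLION = Decimal(1_000_000)

    profiles = {
        "Light":  (100_000, 100_000),
        "Medium": (1_000_000, 1_000_000),
        "Heavy":  (10_000_000, 10_000_000),
    }

    for name, (tokens_in, tokens_out) in profiles.items():
        cost = Decimal(tokens_in) / MILLION * INPUT_RATE + Decimal(tokens_out) / MILLION * OUTPUT_RATE
        print(f"{name}: ${cost.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)}/month")
    # Light: $1.13/month, Medium: $11.25/month, Heavy: $112.50/month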

Model Specifications

Gemini 2.5 Pro is Google DeepMind’s state-of-the-art “thinking” model for complex reasoning and coding. It supports long-context analysis across text, images, video, audio, and PDFs as input, and produces text output.

Provider

Google DeepMind

Modality

Text, image, video, audio, PDF

Multimodal input with text output

Context Window

1,048,576 tokens

Maximum tokens per request

Input Price

$1.25 (≤200k tokens), $2.50 (>200k)

per 1M input tokens

Output Price

$10.00 (≤200k), $15.00 (>200k)

per 1M output tokens

Advanced Features

Streaming
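
As one example of the streaming feature, the sketch below uses the google-genai Python SDK to print the response as it is generated. The calls shown reflect the SDK's documented streaming interface at the time of writing; the prompt and environment-variable name are assumptions, so check the current SDK documentation before relying on it.

    # Minimal streaming sketch with the google-genai SDK (`pip install google-genai`).
    # Assumes a Gemini API key is available as the GEMINI_API_KEY environment variable.
    import os
    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    # Stream the response so users see text as soon as it is produced instead of
    # waiting for the full completion.
    for chunk in client.models.generate_content_stream(
        model="gemini-2.5-pro",
        contents="Summarize the trade-offs of tiered token pricing in two sentences.",
    ):
        print(chunk.text or "", end="", flush=True)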

Introduction

Gemini 2.5 Pro is the flagship multipurpose model from Google DeepMind, tuned for hard reasoning, code, math, and multi-document analysis. Pricing is tiered by prompt size: $1.25 per 1M input tokens and $10.00 per 1M output tokens for prompts ≤200k tokens; $2.50 in and $15.00 out for prompts >200k. Batch API pricing is roughly half of standard ($0.625/$1.25 input and $5.00/$7.50 output at the same thresholds). The model accepts text, images, video, audio, and PDFs as input, returns text, and works with long contexts up to about 1,048,576 tokens in and 65,536 tokens out—enough for codebases, research dossiers, or agent chains without constant truncation.
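
To illustrate how the batch discount plays out, here is a small comparison sketch. The rates are the approximate figures quoted above, and the example workload (a 500k-token prompt with a 20k-token response) is invented for illustration.

    # Compares standard vs. Batch API cost for the same workload, using the
    # tiered rates quoted above (USD per 1M tokens). Figures are approximate.
    RATES = {
        #            (≤200k tier, >200k tier)
        "standard": {"in": (1.25, 2.50), "out": (10.00, 15.00)},
        "batch":    {"in": (0.625, 1.25), "out": (5.00, 7.50)},
    }

    def cost(mode: str, input_tokens: int, output_tokens: int) -> float:
        tier = 1 if input_tokens > 200_000 else 0  # tier is set by prompt size
        rates = RATES[mode]
        return input_tokens / 1e6 * rates["in"][tier] + output_tokens / 1e6 * rates["out"][tier]

    # A long-context job: 500k-token prompt, 20k-token response.
    for mode in ("standard", "batch"):
        print(f"{mode}: ${cost(mode, 500_000, 20_000):.2f}")
    # Standard comes to roughly $1.55; batch is about half of that.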

In production, use caching to cut costs on repeated system prompts, stream outputs to improve perceived latency, and prefer structured outputs/tool-calling where possible. For cost control, route routine traffic to Flash/Flash-Lite and reserve 2.5 Pro for complex or customer-visible paths.
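
One way to implement that routing is a small dispatcher that picks a model per request. The heuristics, thresholds, and task names below are assumptions meant to show the pattern, not a recommended policy; the model identifiers match Google's published Gemini 2.5 family names.

    # Illustrative model router: send routine traffic to cheaper Gemini variants
    # and reserve 2.5 Pro for complex, long-context, or customer-visible requests.
    def pick_model(task: str, prompt_tokens: int, customer_facing: bool) -> str:
        complex_tasks = {"code_review", "multi_doc_analysis", "planning"}
        if customer_facing or task in complex_tasks or prompt_tokens > 200_000:
            return "gemini-2.5-pro"       # highest quality, highest cost
        if prompt_tokens > 20_000:
            return "gemini-2.5-flash"     # mid-tier for bulk or background work
        return "gemini-2.5-flash-lite"    # cheapest option for short, routine calls

    print(pick_model("summarize_ticket", prompt_tokens=3_000, customer_facing=False))
    # -> gemini-2.5-flash-lite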

What is Gemini 2.5 Pro?

Gemini 2.5 Pro is Google DeepMind’s most advanced reasoning LLM. It supports multimodal inputs (text, images, video, audio, PDFs) with a ~1M-token input limit and 65k-token output limit, excels at coding/STEM tasks, and is priced at $1.25–$2.50 per 1M input tokens and $10.00–$15.00 per 1M output tokens depending on prompt size; batch rates are ~50% of standard. Common uses: code assistants, deep research, long-context planning, data/document analysis, and agent workflows.

Compare Similar AI Models

Ready to reduce your AI costs?

Switch to API.chat to save on AI expenses while enjoying a better experience.