## Documentation Index
Fetch the complete documentation index at: https://docs.laozhang.ai/llms.txt
Use this file to discover all available pages before exploring further.
## Model Overview
Gemini is Google's latest generation of multimodal large language models, featuring long context windows and powerful multimodal understanding. From the latest Gemini 3.1 Pro Preview to the stable Gemini 2.5 Pro and the fast Gemini 2.5 Flash, Gemini models excel at long-document analysis, complex reasoning, code generation, and more.

OpenAI Format Compatible: fully compatible with the OpenAI API format, so it integrates seamlessly with your existing code.
## Model Classification

### Gemini 3.1 / 2.5 Series
#### Gemini 3.1 Pro Preview
Latest Pro preview model for tools, multimodal input, and long-context tasks.

Core Features:
- 1M-token context window
- Strong tool and agent workflow support
- Powerful multimodal understanding
- Excellent reasoning capabilities
- Recommended for testing the latest Gemini capabilities

Pricing:
- Check the console for real-time pricing

Suitable Scenarios:
- Agent workflows
- Complex reasoning
- Multimodal analysis
- Long-context tasks
#### Gemini 2.5 Pro
Stable high-performance model with an ultra-long context window.

Core Features:
- 2M-token context window (industry-leading)
- Powerful multimodal understanding
- Excellent reasoning capabilities
- Supports text, images, audio, and video
- Precise long-document analysis

Pricing:
- Check the console for real-time pricing

Suitable Scenarios:
- Ultra-long document analysis
- Complex code repository understanding
- Multi-video content analysis
- Large-scale data processing
- Academic research
#### Gemini 2.5 Flash
Ultra-fast model with the best cost-performance ratio.

Core Features:
- 1M-token context window
- Among the fastest response times available
- Very low price
- Multimodal support
- Well suited to high-frequency calls

Pricing:
- Check the console for real-time pricing

Suitable Scenarios:
- Daily conversations
- Quick queries
- Batch processing
- Real-time applications
- Cost-sensitive projects
### Gemini 1.0 Series

#### Gemini 1.0 Pro
Classic model, stable and reliable.

Core Features:
- 32K context window
- Stable performance
- Good multilingual support
- Balanced cost and performance

Pricing:
- Check the console for real-time pricing

Suitable Scenarios:
- Standard dialogue
- Text generation
- Translation tasks
- General queries
#### Gemini Vision (Pro Vision)
Image understanding model.

Core Features:
- Strong image understanding
- Supports multi-image analysis
- Precise OCR capabilities
- Scene description

Pricing:
- Input: $0.25/1M tokens
- Image: $0.0025/image

Suitable Scenarios:
- Image content analysis
- Document OCR
- Multi-image comparison
- Visual Q&A
### Experimental Models

#### Gemini 2.0 Flash Exp
Latest experimental model, free to use.

Core Features:
- 1M-token context window
- Latest model architecture
- Completely free (limited time)
- May be unstable

Pricing:
- Completely free (experimental phase)

Suitable Scenarios:
- Testing and validation
- Development prototypes
- Feature exploration
- Non-critical applications
## Usage Methods

### Basic Text Dialogue
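Because the service is OpenAI-format compatible, a plain HTTP POST to the chat completions endpoint is enough. A minimal standard-library sketch; the base URL, endpoint path, and model identifier are assumptions to verify in the console:

```python
import json
import urllib.request

API_BASE = "https://api.laozhang.ai/v1"  # assumed endpoint; check the console

def build_chat_request(model, messages, **params):
    """Build an OpenAI-format chat.completions payload."""
    return {"model": model, "messages": messages, **params}

def chat(api_key, payload):
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(
    "gemini-2.5-flash",  # assumed model id; see the model list in the console
    [{"role": "user", "content": "Summarize relativity in one sentence."}],
    temperature=0.7,
)
# answer = chat("sk-...", payload)["choices"][0]["message"]["content"]
```

Any OpenAI SDK pointed at the same base URL works equally well; the raw-HTTP version just makes the wire format explicit.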
## Application Scenarios

1. Ultra-Long Document Analysis: leverage Gemini 2.5 Pro's 2M-token context to process entire books.
2. Code Repository Understanding: analyze entire code repositories.
3. Multi-Image Analysis: analyze multiple images simultaneously.
4. Document OCR and Information Extraction
5. Data Analysis and Visualization
6. Code Generation
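For repository understanding, the simplest approach with a 1M-2M token window is to concatenate the relevant source files into one long prompt. A sketch (the file filter and prompt layout are illustrative choices, not a prescribed format):

```python
import tempfile
from pathlib import Path

def repo_to_prompt(root, suffixes=(".py", ".md"), question="Explain the architecture."):
    """Concatenate matching source files into a single long-context prompt."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            parts.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")
    return "\n\n".join(parts) + f"\n\nQuestion: {question}"

# Tiny demo repository so the function can be exercised offline:
demo = tempfile.mkdtemp()
Path(demo, "main.py").write_text("print('hi')")
prompt = repo_to_prompt(demo)
```

The resulting string goes into a single user message; for very large repositories, check the prompt size against the model's context limit first.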
## Gemini's Unique Advantages

### 1. Ultra-Long Context Window
Industry-leading: Gemini 2.5 Pro supports a 2M-token context window, roughly 10x that of Claude and GPT-4. This enables:
- Entire-book analysis
- Large code repository review
- Mass document processing
- Long conversation history maintenance
### 2. Powerful Multimodal Capabilities
Supports multiple modalities, including text, images, audio, and video.

### 3. Precise Multimodal Understanding
Gemini understands images, video, and audio with high accuracy:
- Accurate scene description
- Multi-object recognition
- Temporal sequence understanding
- Audio content analysis
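In the OpenAI format, images are passed as `image_url` content parts alongside text; an inline base64 data URL avoids hosting the file separately. A sketch of building such a message:

```python
import base64

def image_message(text, image_bytes, mime="image/png"):
    """Build an OpenAI-format user message mixing text and an inline image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }

# Placeholder bytes stand in for a real image file read from disk:
msg = image_message("What is in this picture?", b"\x89PNG-fake-bytes")
```

Multiple `image_url` parts in one message enable the multi-image comparison scenario above; whether audio and video are accepted through this same field depends on the gateway, so verify before relying on it.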
## Usage Tips

### 1. Model Selection Guide
| Scenario | Recommended Model | Reason |
|---|---|---|
| Daily conversations | Gemini 2.5 Flash | Fastest speed, lowest price |
| Long documents | Gemini 2.5 Pro | 2M context |
| Image understanding | Gemini Pro Vision | Image-specific optimization |
| Code generation | Gemini 2.5 Flash | Cost-effective, good quality |
| Complex reasoning | Gemini 2.5 Pro | Powerful reasoning |
| Testing | Gemini 2.0 Flash Exp | Free |
### 2. Context Window Management

Short context (< 128K tokens) has a price advantage:
- Gemini 2.5 Flash: $0.075/1M tokens
- Gemini 2.5 Pro: $1.25/1M tokens

Typical uses: daily conversations, short document processing, quick queries.

Long context (> 128K tokens) is billed at double the rate:
- Gemini 2.5 Flash: $0.15/1M tokens (2x)
- Gemini 2.5 Pro: $2.50/1M tokens (2x)

Typical uses: entire-book analysis, large code repositories, mass documents.
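The tiered prices above translate into a quick input-cost estimator. This sketch assumes the long-context rate applies to the entire request once the threshold is crossed; confirm the actual billing rule in the console:

```python
# $/1M input tokens as (short-tier, long-tier), from the table above
PRICES = {
    "gemini-2.5-flash": (0.075, 0.15),
    "gemini-2.5-pro": (1.25, 2.50),
}
THRESHOLD = 128_000  # tokens; above this the long-context tier applies

def input_cost(model, prompt_tokens):
    """Estimated input cost in USD (assumes the whole prompt bills at one tier)."""
    short, long_ = PRICES[model]
    rate = long_ if prompt_tokens > THRESHOLD else short
    return prompt_tokens * rate / 1_000_000

# e.g. a 100K-token prompt on Flash vs a 1M-token prompt on Pro
flash_cost = input_cost("gemini-2.5-flash", 100_000)
pro_cost = input_cost("gemini-2.5-pro", 1_000_000)
```

Keeping prompts just under the 128K threshold halves the per-token rate, which is why the context-trimming techniques below pay off directly.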
### 3. Prompt Optimization

### 4. Parameter Tuning
Temperature (controls creativity):
- 0: most deterministic (translation, facts)
- 0.7: balanced (general dialogue)
- 1.0-1.5: more creative (creative writing)

Top-P (nucleus sampling):
- 0.9: conservative
- 0.95: balanced
- 1.0: most diverse

Top-K sampling:
- Gemini-specific parameter
- Recommended range: 1-40
- Lower values = more deterministic
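In OpenAI-format requests these knobs map to the `temperature` and `top_p` fields, plus a `top_k` extension for Gemini; whether the gateway forwards `top_k` is an assumption to verify. A sketch bundling the presets above:

```python
# Presets mirroring the recommended values listed above
PRESETS = {
    "factual":  {"temperature": 0.0, "top_p": 0.9},
    "balanced": {"temperature": 0.7, "top_p": 0.95},
    "creative": {"temperature": 1.2, "top_p": 1.0},
}

def with_sampling(payload, preset, top_k=None):
    """Return a copy of an OpenAI-format payload with sampling params applied.
    top_k is a Gemini-specific extension; not every gateway forwards it."""
    out = {**payload, **PRESETS[preset]}
    if top_k is not None:
        out["top_k"] = top_k
    return out

p = with_sampling({"model": "gemini-2.5-flash", "messages": []}, "factual", top_k=20)
```

Tuning one knob at a time (temperature first, then top-p/top-k) makes the effect of each easier to observe.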
## Cost Optimization Strategies

### 1. Choose Appropriate Models

Cost first: Gemini 2.5 Flash
- Input: $0.075/1M tokens
- About 95% cheaper than GPT-4o
- Suitable for most scenarios

Performance first: Gemini 2.5 Pro
- Input: $1.25/1M tokens (≤128K)
- 2M-token context window
- Suitable for complex tasks
### 2. Control Context Length
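One way to stay under the cheaper 128K tier is a character-based token budget. The 4-characters-per-token ratio below is a rough heuristic for English text, not an exact tokenizer:

```python
def approx_tokens(text):
    """Rough heuristic: about 4 characters per token for English text."""
    return max(1, len(text) // 4)

def truncate_to_budget(text, max_tokens):
    """Keep the head of the text so the prompt stays within a token budget."""
    budget_chars = max_tokens * 4
    return text if len(text) <= budget_chars else text[:budget_chars]

doc = "word " * 100_000                 # ~500,000 characters of input
clipped = truncate_to_budget(doc, 120_000)  # aim just under the 128K tier
```

For real billing decisions, count tokens with the provider's tokenizer (e.g. via the `usage` field returned by the API) rather than this approximation.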
### 3. Batch Processing
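Independent requests can be fanned out with a thread pool. In this sketch `classify` is a placeholder for a real API call (e.g. a chat completion per item); keep `max_workers` within your plan's rate limits:

```python
from concurrent.futures import ThreadPoolExecutor

def classify(text):
    """Placeholder for a real per-item API call."""
    return "positive" if "good" in text else "negative"

def batch_process(items, worker, max_workers=8):
    """Run many independent requests concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))

results = batch_process(["good product", "bad service"], classify)
```

Pairing this with the cheap Flash model is where the batch-processing savings come from; the retry wrapper below slots in naturally as the `worker`.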
## Error Handling

### Common Error Codes
| Error Code | Description | Solution |
|---|---|---|
| 400 | Invalid request parameters | Check parameter format |
| 401 | Invalid API Key | Verify API Key |
| 429 | Rate limit exceeded | Implement retry mechanism |
| 500 | Server error | Retry later |
### Retry Mechanism Implementation
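The error table above maps to code roughly as follows: retry on 429 and 5xx with exponential backoff plus jitter, and fail fast on everything else. `APIStatusError` is a hypothetical exception standing in for whatever your HTTP client raises:

```python
import random
import time

class APIStatusError(Exception):
    """Hypothetical error type carrying the HTTP status code."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

RETRYABLE = {429, 500, 502, 503}

def with_retries(call, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry `call` on transient errors with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except APIStatusError as err:
            if err.status not in RETRYABLE or attempt == max_attempts - 1:
                raise  # non-retryable (400/401) or out of attempts
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Demo: fail twice with 429, then succeed.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise APIStatusError(429)
    return "ok"

result = with_retries(flaky, sleep=lambda s: None)  # skip real sleeping in the demo
```

The injectable `sleep` keeps the function testable; in production the default `time.sleep` applies the actual backoff.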
## Streaming Response
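Setting `"stream": true` in the request body makes the gateway return OpenAI-style server-sent events, one `data:` line per chunk. A minimal parser sketch; the chunk layout follows the OpenAI streaming convention and is an assumption if the gateway deviates:

```python
import json

def iter_stream_chunks(lines):
    """Parse OpenAI-style SSE lines ('data: {...}') into content deltas."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank lines
        data = line[len("data: "):]
        if data == "[DONE]":
            return  # end-of-stream sentinel
        delta = json.loads(data)["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Simulated stream; in practice, iterate over the HTTP response line by line:
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = "".join(iter_stream_chunks(sample))
```

Rendering each yielded delta as it arrives is what gives the responsive, token-by-token user experience.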
For long responses, use streaming for a better user experience.

## Best Practices
### 1. Long Document Processing
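When even a 2M-token window is not enough, a common pattern is map-reduce summarization: split the document into overlapping chunks, summarize each, then summarize the summaries. A sketch with a placeholder `summarize` callable (in practice, one API call per chunk):

```python
def chunk_document(text, chunk_chars=400_000, overlap=2_000):
    """Split a very long document into overlapping chunks that each fit
    comfortably inside the context window."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_chars])
        start += chunk_chars - overlap
    return chunks

def summarize_long(text, summarize):
    """Map-reduce: summarize each chunk, then summarize the joined summaries."""
    parts = [summarize(c) for c in chunk_document(text)]
    return parts[0] if len(parts) == 1 else summarize("\n".join(parts))

# Placeholder summarizer (first 50 chars) keeps the demo offline:
summary = summarize_long("lorem " * 200_000, lambda t: t[:50])
```

The overlap preserves context across chunk boundaries; size it to a few paragraphs of your source material.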
### 2. Multi-turn Conversation Management
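To keep a long conversation inside the context window (and under the cheap pricing tier), keep the system prompt and drop the oldest turns first. A character-budget sketch; swap in a real token counter for production use:

```python
def trim_history(messages, max_chars=40_000):
    """Keep the system prompt plus the most recent turns that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept = []
    used = sum(len(m["content"]) for m in system)
    for msg in reversed(rest):          # newest turns first
        used += len(msg["content"])
        if used > max_chars:
            break
        kept.append(msg)
    return system + list(reversed(kept))

history = [{"role": "system", "content": "Be concise."}] + [
    {"role": "user", "content": "x" * 15_000},
    {"role": "assistant", "content": "y" * 15_000},
    {"role": "user", "content": "z" * 15_000},
]
trimmed = trim_history(history)
```

A refinement is to replace the dropped turns with a one-message summary (using the map-reduce pattern above) instead of discarding them outright.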
### 3. Structured Output
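Ask for JSON explicitly in the prompt (or via `response_format={"type": "json_object"}`, if the gateway forwards that OpenAI parameter), then parse defensively, since models sometimes wrap JSON in markdown code fences. A sketch:

```python
import json
import re

def parse_json_reply(text):
    """Extract the first JSON object from a model reply, tolerating
    surrounding prose and markdown code fences."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))

# Typical model reply with fencing and chatter around the payload:
reply = 'Here you go:\n```json\n{"name": "Gemini", "score": 9}\n```'
data = parse_json_reply(reply)
```

Pair this with a low temperature (0-0.2) so the model sticks to the requested schema.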
## Comparison with Other Models
| Dimension | Gemini 2.5 Flash | GPT-4o Mini | Claude Haiku 4.5 |
|---|---|---|---|
| Price | $0.075/1M | $0.15/1M | $1/1M |
| Context | 1M tokens | 128K tokens | 200K tokens |
| Speed | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Cost-performance | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Long documents | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| Code generation | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
## Related Resources
- Chat Completions API - Complete API documentation
- OpenAI Models - GPT series models guide
- Claude Models - Anthropic Claude models guide
- Pricing - Detailed model pricing information