Model Overview
OpenAI is one of the world’s leading AI research institutions, offering multiple high-performance large language models. From the powerful GPT-4o to the cost-effective GPT-4o Mini, and the reasoning-focused O1 series, OpenAI provides solutions for various scenarios.Full Compatibility: Laozhang API is 100% compatible with OpenAI’s official API format. Simply replace
https://api.openai.com/v1
with https://api.laozhang.ai/v1
to use it.Model Classification
GPT-4o Series
GPT-4o
GPT-4o
Latest flagship model, the most powerful multimodal AI
- Core Features:
- Supports text, images, and audio multimodal understanding
- 128K context window
- Fastest response speed
- Excellent multilingual capabilities
- Pricing:
- Input: $2.5/1M tokens
- Output: $10/1M tokens
- Suitable Scenarios:
- Complex task handling
- Image understanding and analysis
- Long document processing
- Professional content generation
GPT-4o Mini
GPT-4o Mini
Cost-effective model, price reduced by 90%
- Core Features:
- 90% price reduction compared to GPT-4o
- Supports image understanding
- Fast response speed
- Excellent performance
- Pricing:
- Input: $0.15/1M tokens
- Output: $0.60/1M tokens
- Suitable Scenarios:
- Daily conversations
- Batch processing
- Development and testing
- Cost-sensitive applications
GPT-4 Series
GPT-4 Turbo
GPT-4 Turbo
Classic high-performance model with powerful reasoning capabilities
- Core Features:
- 128K context window
- Strong logical reasoning
- Excellent code understanding
- Multi-domain knowledge
- Pricing:
- Input: $10/1M tokens
- Output: $30/1M tokens
- Suitable Scenarios:
- Complex reasoning tasks
- Code generation and review
- Academic research
- Professional consulting
GPT-4
GPT-4
Classic GPT-4 model, stable and reliable
- Core Features:
- 8K context window
- Outstanding text understanding
- Creative writing capabilities
- Accurate information extraction
- Pricing:
- Input: $30/1M tokens
- Output: $60/1M tokens
- Suitable Scenarios:
- High-quality content creation
- Important decision support
- Detailed analysis reports
O1 Series
O1 Preview
O1 Preview
Reasoning-specialized model with PhD-level thinking ability
- Core Features:
- Strongest reasoning capabilities
- Multi-step thinking process
- Excellent math problem solving
- Complex logic analysis
- Pricing:
- Input: $15/1M tokens
- Output: $60/1M tokens
- Special Limitations:
- Does not support streaming output
- Does not support
system
role - Does not support
temperature
parameter
- Suitable Scenarios:
- Mathematical olympiad problems
- Scientific research
- Code algorithm optimization
- Complex decision analysis
O1 Mini
O1 Mini
Lightweight reasoning model, extreme cost-performance
- Core Features:
- Fast reasoning speed
- 80% cheaper than O1 Preview
- Good code and math capabilities
- Suitable for daily reasoning tasks
- Pricing:
- Input: $3/1M tokens
- Output: $12/1M tokens
- Suitable Scenarios:
- Daily math problems
- Code logic optimization
- Reasoning practice
- Education and tutoring
GPT-3.5 Series
GPT-3.5 Turbo
GPT-3.5 Turbo
Classic dialogue model, excellent cost-performance
- Core Features:
- 16K context window
- Fast response speed
- Stable performance
- Lowest price
- Pricing:
- Input: $0.5/1M tokens
- Output: $1.5/1M tokens
- Suitable Scenarios:
- Simple conversations
- Content summarization
- Text translation
- Customer service bots
Code Examples
Basic Text Dialogue
Image Understanding
Long Document Analysis
Creative Writing
Complex Code Review
Mathematical Problem Solving
O1 Series Special Notes:
- Do not support
system
role messages - Do not support streaming output
- Do not support
temperature
,top_p
and other creativity parameters max_tokens
defaults to model’s maximum value
Usage Tips
1. Choose the Right Model
Scenario | Recommended Model | Reason |
---|---|---|
Daily conversations | GPT-4o Mini | Cost-effective, fast speed |
Image understanding | GPT-4o | Powerful multimodal capabilities |
Complex reasoning | O1 Preview | PhD-level thinking |
Code generation | GPT-4o, Claude 3.5 | Strong code understanding |
Long documents | GPT-4o, Gemini 1.5 | Large context window |
Creative writing | GPT-4 | Creative expression |
Math problems | O1 Mini/Preview | Strong reasoning capabilities |
2. Optimize Prompts
Clear Instructions
Clear Instructions
✅ Good Example:❌ Bad Example:
Step-by-Step
Step-by-Step
For complex tasks, break down into multiple steps:
Use Examples
Use Examples
Provide examples to help the model understand your needs better:
3. Parameter Tuning
Control randomness of output:
0
: Most deterministic (translation, summarization)0.7
: Balanced (general dialogue)1.0-1.5
: More creative (creative writing)
Maximum number of tokens to generate:
- Short responses: 500-1000
- Medium responses: 2000-4000
- Long responses: 8000+
Nucleus sampling, alternative to temperature:
0.1
: Conservative0.9
: More diverse- Generally use either
temperature
ortop_p
, not both
Reduce repetition:
0
: No penalty0.5-1.0
: Moderate penalty2.0
: Maximum penalty
Cost Optimization
1. Choose Cost-Effective Models
Daily Tasks
Use GPT-4o Mini instead of GPT-4o
- 90% price reduction
- Similar quality
- Faster speed
Reasoning Tasks
Use O1 Mini instead of O1 Preview
- 80% price reduction
- Good reasoning capability
- Suitable for most scenarios
2. Control Context Length
3. Set Reasonable max_tokens
Error Handling
Common Errors
401: Unauthorized
401: Unauthorized
429: Rate Limit
429: Rate Limit
Cause: Request rate limit exceededSolution:
400: Invalid Request
400: Invalid Request
Cause: Parameter format errorSolution:
- Check if model name is correct
- Verify message format is correct
- Ensure parameters meet requirements
Retry Mechanism
Streaming Response
For long responses, use streaming output for better user experience:Best Practices
-
Choose the Right Model
- Simple tasks → GPT-4o Mini
- Complex tasks → GPT-4o
- Reasoning tasks → O1 series
-
Optimize Prompts
- Clear and specific instructions
- Provide examples
- Break down complex tasks
-
Control Costs
- Only pass necessary context
- Set reasonable
max_tokens
- Use cost-effective models
-
Error Handling
- Implement retry mechanism
- Catch and handle different error types
- Set reasonable timeout
-
User Experience
- Use streaming output
- Show loading status
- Provide feedback
Related Resources
- Chat Completions API - Complete API documentation
- Claude Models - Anthropic Claude models guide
- Gemini Models - Google Gemini models guide
- Pricing - Detailed model pricing information