Documentation Index
Fetch the complete documentation index at: https://docs.laozhang.ai/llms.txt
Use this file to discover all available pages before exploring further.
Model Overview
OpenAI is one of the world’s leading AI research institutions, offering multiple high-performance large language models. From GPT-5.5 to GPT-5, GPT-4.1, and the reasoning-focused o3/o4 series, OpenAI provides solutions for various scenarios.Full Compatibility: Laozhang API is 100% compatible with OpenAI’s official API format. Simply replace
https://api.openai.com/v1 with https://api.laozhang.ai/v1 to use it.Model Classification
GPT-5.5 Series
GPT-5.5
GPT-5.5
Latest flagship model for complex professional work
-
Core Features:
- Supports text and image input
- 1M context window
- Strong coding and agentic reasoning
- Excellent multilingual capabilities
-
Pricing:
- Check the console for real-time pricing
-
Suitable Scenarios:
- Complex task handling
- Image understanding and analysis
- Long document processing
- Professional content generation
GPT-4.1 Mini
GPT-4.1 Mini
Fast economical model for everyday workloads
-
Core Features:
- Fast response speed
- Good general-purpose quality
- Fast response speed
- Suitable for cost-sensitive use
-
Pricing:
- Check the console for real-time pricing
-
Suitable Scenarios:
- Daily conversations
- Batch processing
- Development and testing
- Cost-sensitive applications
GPT-5 / GPT-4.1 Series
GPT-5.5
GPT-5.5
Current high-performance model with powerful reasoning capabilities
-
Core Features:
- 1M context window
- Strong logical reasoning
- Excellent code understanding
- Multi-domain knowledge
-
Pricing:
- Check the console for real-time pricing
-
Suitable Scenarios:
- Complex reasoning tasks
- Code generation and review
- Academic research
- Professional consulting
GPT-4.1
GPT-4.1
Classic stable model for production workloads
-
Core Features:
- 128K context window
- Outstanding text understanding
- Creative writing capabilities
- Accurate information extraction
-
Pricing:
- Check the console for real-time pricing
-
Suitable Scenarios:
- High-quality content creation
- Important decision support
- Detailed analysis reports
o3 / o4 Reasoning Models
o3-pro
o3-pro
Reasoning-specialized model with PhD-level thinking ability
-
Core Features:
- Strongest reasoning capabilities
- Multi-step thinking process
- Excellent math problem solving
- Complex logic analysis
-
Pricing:
- Check the console for real-time pricing
-
Special Limitations:
- Does not support streaming output
- Does not support
systemrole - Does not support
temperatureparameter
-
Suitable Scenarios:
- Mathematical olympiad problems
- Scientific research
- Code algorithm optimization
- Complex decision analysis
o4-mini
o4-mini
Lightweight reasoning model, extreme cost-performance
-
Core Features:
- Fast reasoning speed
- 80% cheaper than o3-pro
- Good code and math capabilities
- Suitable for daily reasoning tasks
-
Pricing:
- Check the console for real-time pricing
-
Suitable Scenarios:
- Daily math problems
- Code logic optimization
- Reasoning practice
- Education and tutoring
GPT-4o Series (Classic Multimodal)
GPT-4o Mini
GPT-4o Mini
Classic lightweight multimodal model for legacy compatibility
-
Core Features:
- Fast response speed
- Fast response speed
- Stable performance
- Suitable for high-frequency calls
-
Pricing:
- Check the console for real-time pricing
-
Suitable Scenarios:
- Simple conversations
- Content summarization
- Text translation
- Customer service bots
Code Examples
Basic Text Dialogue
Image Understanding
Long Document Analysis
Creative Writing
Complex Code Review
Mathematical Problem Solving
Usage Tips
1. Choose the Right Model
| Scenario | Recommended Model | Reason |
|---|---|---|
| Daily conversations | GPT-5.5 or GPT-4o Mini | Quality-first or cost-effective choice |
| Image understanding | GPT-5.5 or GPT-4o | Powerful multimodal capabilities |
| Complex reasoning | o3-pro | PhD-level thinking |
| Code generation | GPT-5.5, Claude Sonnet 4.6 | Strong code understanding |
| Long documents | GPT-5.5, Gemini 3.1 Pro Preview | Large context window |
| Creative writing | GPT-5.5 | Creative expression |
| Math problems | o4-mini or o3-pro | Strong reasoning capabilities |
2. Optimize Prompts
Clear Instructions
Clear Instructions
✅ Good Example:❌ Bad Example:
Step-by-Step
Step-by-Step
For complex tasks, break down into multiple steps:
Use Examples
Use Examples
Provide examples to help the model understand your needs better:
3. Parameter Tuning
Control randomness of output:
0: Most deterministic (translation, summarization)0.7: Balanced (general dialogue)1.0-1.5: More creative (creative writing)
Maximum number of tokens to generate:
- Short responses: 500-1000
- Medium responses: 2000-4000
- Long responses: 8000+
Nucleus sampling, alternative to temperature:
0.1: Conservative0.9: More diverse- Generally use either
temperatureortop_p, not both
Reduce repetition:
0: No penalty0.5-1.0: Moderate penalty2.0: Maximum penalty
Cost Optimization
1. Choose Cost-Effective Models
Daily Tasks
Use GPT-4.1 Mini or GPT-4o Mini for simple daily tasks
- Lower cost than flagship models
- Good quality
- Faster speed
Reasoning Tasks
Use o4-mini instead of o3-pro
- 80% price reduction
- Good reasoning capability
- Suitable for most scenarios
2. Control Context Length
3. Set Reasonable max_tokens
Error Handling
Common Errors
401: Unauthorized
401: Unauthorized
429: Rate Limit
429: Rate Limit
Cause: Request rate limit exceededSolution:
400: Invalid Request
400: Invalid Request
Cause: Parameter format errorSolution:
- Check if model name is correct
- Verify message format is correct
- Ensure parameters meet requirements
Retry Mechanism
Streaming Response
For long responses, use streaming output for better user experience:Best Practices
-
Choose the Right Model
- Simple tasks → GPT-4o Mini
- Complex tasks → GPT-4o
- Reasoning tasks → O1 series
-
Optimize Prompts
- Clear and specific instructions
- Provide examples
- Break down complex tasks
-
Control Costs
- Only pass necessary context
- Set reasonable
max_tokens - Use cost-effective models
-
Error Handling
- Implement retry mechanism
- Catch and handle different error types
- Set reasonable timeout
-
User Experience
- Use streaming output
- Show loading status
- Provide feedback
Related Resources
- Chat Completions API - Complete API documentation
- Claude Models - Anthropic Claude models guide
- Gemini Models - Google Gemini models guide
- Pricing - Detailed model pricing information