API Endpoint
Laozhang API is fully compatible with the official OpenAI interface format: simply replace `https://api.openai.com/v1` with `https://api.laozhang.ai/v1` and existing code works unchanged.

Request Parameters
Required Parameters
`model` — Name of the model to use. Supported models:
- OpenAI series: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-3.5-turbo`, etc.
- Claude series: `claude-3-5-sonnet`, `claude-3-opus`, `claude-3-haiku`, etc.
- Gemini series: `gemini-1.5-pro`, `gemini-1.5-flash`, `gemini-2.0-flash-exp`, etc.
- Chinese models: `deepseek-chat`, `qwen-max`, `glm-4-flash`, `yi-lightning`, etc.
`messages` — Conversation message array; each message contains a `role` and `content`.

Role descriptions:
- `system`: System prompt, defines the AI assistant's behavior
- `user`: User message
- `assistant`: The AI assistant's previous response
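For example, a short history containing all three roles could look like this (contents are illustrative):

```python
# A minimal messages array: system prompt, user turn, and a prior assistant turn.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What does HTTP 429 mean?"},
    {"role": "assistant", "content": "The client sent too many requests."},
]

# Each entry carries exactly a role and its content.
for m in messages:
    assert set(m) == {"role", "content"}
```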
Optional Parameters
`temperature` — Randomness of generated results, range 0-2:
- 0: Deterministic, minimal randomness (recommended for translation, summarization, etc.)
- 0.7: Balanced, suitable for most scenarios
- 1.5-2: High creativity (recommended for creative writing, brainstorming, etc.)
`max_tokens` — Maximum number of tokens to generate. Recommended values:
- Short responses: 500-1000
- Medium responses: 2000-4000
- Long responses: 8000+
`stream` — Whether to use streaming output.
- `false`: Wait for the complete response
- `true`: Receive the response in chunks (better user experience)
`top_p` — Nucleus sampling parameter, range 0-1. Controls the diversity of output. Generally use either `temperature` or `top_p`, not both simultaneously.

`frequency_penalty` — Frequency penalty, range -2.0 to 2.0. Positive values reduce repetition of content that has already appeared.

`presence_penalty` — Presence penalty, range -2.0 to 2.0. Positive values encourage discussion of new topics.

`stop` — Stop sequences; generation stops when any of these strings is encountered. Can be a single string or an array of up to 4 strings.

`user` — Unique end-user identifier used for abuse detection. Recommended for multi-user scenarios.
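Putting these together, a request body combining required and optional parameters might look like this (values are illustrative, not recommendations for every task):

```python
# A request body mixing required parameters with a few optional ones.
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarize this text in one line."}],
    "temperature": 0,     # deterministic: suits summarization
    "max_tokens": 500,    # short response
    "stop": ["\n\n"],     # stop at the first blank line
    "user": "user-42",    # stable per-end-user id for abuse detection
}

# temperature and top_p should not be set simultaneously.
assert not ("temperature" in payload and "top_p" in payload)
```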
Message Format
Basic Text Message
Multimodal Message (Image Understanding)
- Image URL
- Base64 Image
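Both image variants use the same `content`-array shape: a text part plus an `image_url` part, where the URL is either a regular link or a base64 data URI. A minimal sketch (URL and image bytes are placeholders):

```python
import base64

# Variant 1: image by URL (hypothetical URL).
url_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}

# Variant 2: image as a base64 data URI. In practice, read the bytes
# from a local file; placeholder bytes are used here.
png_bytes = b"..."
b64 = base64.b64encode(png_bytes).decode()
b64_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
    ],
}
```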
Multimodal Models

The following models support image understanding:
- OpenAI: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`
- Claude: `claude-3-5-sonnet`, `claude-3-opus`, `claude-3-sonnet`, `claude-3-haiku`
- Gemini: `gemini-1.5-pro`, `gemini-1.5-flash`, `gemini-2.0-flash-exp`
Request Examples
cURL
Node.js
Python
Go
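A minimal Python sketch of the request, using only the standard library (the API key is a placeholder):

```python
import json
import urllib.request

API_KEY = "sk-..."  # placeholder; substitute your real key
BASE_URL = "https://api.laozhang.ai/v1"

def build_request(messages, model="gpt-4o-mini", **options):
    """Assemble the chat-completions HTTP request without sending it."""
    body = json.dumps({"model": model, "messages": messages, **options}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

def chat(messages, **options):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(messages, **options)) as resp:
        return json.load(resp)

# reply = chat([{"role": "user", "content": "Hello!"}])
# print(reply["choices"][0]["message"]["content"])
```

Splitting request construction from sending keeps the payload easy to inspect and test without network access.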
Response Format
Standard Response
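A sketch of the shape of a typical non-streaming response body (all values are illustrative):

```python
# Typical structure of a standard chat.completion response.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16},
}

# The assistant's text lives at choices[0].message.content.
print(response["choices"][0]["message"]["content"])
```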
Stream Response
Each chunk format:

Response Field Descriptions
- `id`: Unique request identifier
- `object`: Object type:
  - `chat.completion`: Standard response
  - `chat.completion.chunk`: Stream response chunk
- `created`: Creation timestamp (Unix timestamp)
- `model`: Model name used
- `choices`: Array of generated results, typically containing one result
- `index`: Result index
- `delta`: Incremental content (stream responses)
- `delta.content`: This chunk's content
- `finish_reason`: Completion reason:
  - `stop`: Natural completion
  - `length`: Reached the `max_tokens` limit
  - `content_filter`: Content filtered by policy
  - `null`: Not yet finished (stream output)
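A streaming response is delivered as server-sent-event lines of the form `data: {...}`, terminated by the sentinel `data: [DONE]`. A minimal sketch of reassembling the text from those chunks:

```python
import json

def iter_stream_content(lines):
    """Yield incremental text from the SSE lines of a streaming response.

    Each data line carries one chat.completion.chunk object; the stream
    ends with the sentinel `data: [DONE]`.
    """
    for line in lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Illustrative chunks, as they might arrive on the wire.
sample = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    'data: [DONE]',
]
print("".join(iter_stream_content(sample)))  # → Hello
```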
Special Usage
GPT-4o Vision
Claude Native Format
Claude models also support the native format.

O1 Series Special Parameters

O1 series models (o1-preview, o1-mini) have parameter limitations: for example, they use `max_completion_tokens` instead of `max_tokens`, and custom `temperature` values are not accepted.

Usage Tips
Multi-turn Dialogue
Implement multi-turn dialogue by passing the conversation context in `messages`.

JSON Output

Get structured JSON output.

JSON Mode Support

Models currently supporting JSON mode:
- GPT-4o series
- GPT-4-turbo series
- GPT-3.5-turbo-1106 and later versions
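Under the OpenAI-compatible format, JSON mode is requested via the `response_format` parameter; a hedged sketch, assuming the gateway forwards it unchanged:

```python
# Request body asking for a guaranteed-JSON reply.
payload = {
    "model": "gpt-4o-mini",
    "response_format": {"type": "json_object"},
    "messages": [
        # JSON mode requires the word "JSON" to appear in the prompt.
        {"role": "system", "content": "Reply with a JSON object."},
        {"role": "user", "content": "List three primary colors."},
    ],
}
```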
Billing
Billing is based on actual token usage:

Total Cost = Input Tokens × Input Price + Output Tokens × Output Price

Model Price Reference
| Model | Input Price | Output Price | Features |
|---|---|---|---|
| gpt-4o-mini | $0.15/1M tokens | $0.60/1M tokens | Cost-effective, supports image understanding |
| gpt-4o | $2.5/1M tokens | $10/1M tokens | Strongest capabilities, supports multimodal |
| claude-3-5-sonnet | $3/1M tokens | $15/1M tokens | Excellent reasoning, supports image understanding |
| gemini-1.5-flash | $0.075/1M tokens | $0.3/1M tokens | Fastest speed, long context |
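For example, the formula applied to the `gpt-4o-mini` prices in the table above, for a request with 1,200 input and 800 output tokens:

```python
# Per-million-token prices for gpt-4o-mini, from the table above.
input_price, output_price = 0.15, 0.60  # USD per 1M tokens
input_tokens, output_tokens = 1_200, 800

cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
print(f"${cost:.6f}")  # → $0.000660
```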
Error Handling
Common error codes:

| Error Code | Meaning | Solution |
|---|---|---|
| 401 | API Key invalid or missing | Check if API Key is correct |
| 429 | Request rate limit exceeded | Slow down request frequency or upgrade plan |
| 500 | Server internal error | Retry request or contact support |
| 400 | Request parameter error | Check if request parameters conform to API documentation |
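The 429 and 500 errors are transient and worth retrying with exponential backoff. A minimal sketch; the `status` attribute on the exception is an assumed convention for this example, not part of the API:

```python
import random
import time

def with_retry(call, max_attempts=5, base_delay=1.0):
    """Retry `call` with exponential backoff on retryable errors (429/5xx).

    `call` is any zero-argument function; on failure it should raise an
    exception carrying a `status` attribute (an assumption of this sketch).
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status", None)
            retryable = status == 429 or (status is not None and status >= 500)
            if not retryable or attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter: ~1s, ~2s, ~4s, ...
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```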
Best Practices
- Use Appropriate Temperature
  - Translation, summarization, Q&A: temperature=0
  - General dialogue: temperature=0.7
  - Creative writing: temperature=1.0-1.5
- Control Context Length
  - Only pass necessary historical messages
  - Regularly clean up irrelevant context
  - Long documents can be processed in segments
- Choose the Right Model
  - Simple tasks: gpt-4o-mini, gpt-3.5-turbo
  - Reasoning tasks: claude-3-5-sonnet, gpt-4o
  - Cost-sensitive: gemini-1.5-flash
- Error Retry
  - Implement an exponential backoff retry mechanism
  - Catch and handle different error types
  - Set reasonable timeouts
- Stream Output
  - Better user experience for long responses
  - Reduces perceived latency
  - Enables a typewriter effect
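One way to keep context length under control is to retain the system prompt while truncating older turns; a crude sketch (production code would count tokens rather than messages):

```python
def trim_history(messages, max_messages=10):
    """Keep the system prompt plus only the most recent turns.

    A message-count heuristic; counting tokens would be more precise.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]
```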
Related Resources
- Models API - Get complete available model list
- Images API - Image generation and editing
- OpenAI Models Guide - Detailed GPT-4o usage
- Claude Models Guide - Detailed Claude usage
- Gemini Models Guide - Detailed Gemini usage