Model Availability Status

The Laozhang API provides access to multiple mainstream AI models; availability status is updated in real time.

Currently Available Models

  • OpenAI Series
  • Anthropic Series
  • Google Series
  • Image Generation
  • Video Generation
| Model | Status | Features | Use Cases |
| --- | --- | --- | --- |
| GPT-4 Turbo | 🟢 Available | Latest GPT-4, strong reasoning | Complex tasks, code generation |
| GPT-4 | 🟢 Available | Standard GPT-4 | Professional analysis, long text |
| GPT-3.5 Turbo | 🟢 Available | Fast and economical | Daily conversation, quick tasks |
| GPT-3.5 Turbo 16K | 🟢 Available | Long context support | Document processing |
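
Models from this list can be called through the chat completions endpoint. A minimal sketch, assuming the service is OpenAI-compatible (as the examples on this page suggest) and using the openai Python SDK pointed at https://api.laozhang.ai/v1:
from openai import OpenAI

# Assumed setup: OpenAI-compatible client for the Laozhang API
client = OpenAI(
    api_key="your_api_key",
    base_url="https://api.laozhang.ai/v1"
)

response = client.chat.completions.create(
    model="gpt-4-turbo",  # any model listed as Available
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

The later examples on this page assume a client configured this way.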

Real-time Status

Check the latest model status:
import requests

# Get model list
response = requests.get(
    "https://api.laozhang.ai/v1/models",
    headers={"Authorization": "Bearer your_api_key"}
)

models = response.json()
for model in models['data']:
    print(f"{model['id']}: {model['status']}")

Model Selection Recommendations

By Task Type

| Task Type | Recommended Model | Alternative | Reason |
| --- | --- | --- | --- |
| Daily Conversation | GPT-3.5 Turbo | Gemini Flash | Fast, economical |
| Code Generation | GPT-4 Turbo | Claude Sonnet 4 | Strong code understanding |
| Long Document | Claude Sonnet 4 | Gemini 2.5 Pro | 200K+ context |
| Content Creation | Claude 3 Opus | GPT-4 Turbo | Strong creativity |
| Quick Translation | GPT-3.5 Turbo | Gemini Flash | Fast response |
| Image Understanding | Gemini 2.5 Pro | GPT-4 Vision | Multimodal capability |
| Image Generation | DALL-E 3 | Flux Pro | High quality |
| Video Creation | Sora 2 | VEO 3 | Cost-effective |
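
The task table above can be encoded as a simple lookup that returns a primary model and its alternative; a minimal sketch (the model IDs are assumptions and should be verified against the /v1/models list):
# Recommended (primary, alternative) model per task type, following the table above
TASK_MODELS = {
    "daily_conversation": ("gpt-3.5-turbo", "gemini-flash"),
    "code_generation": ("gpt-4-turbo", "claude-sonnet-4"),
    "long_document": ("claude-sonnet-4", "gemini-2.5-pro"),
    "content_creation": ("claude-3-opus", "gpt-4-turbo"),
    "quick_translation": ("gpt-3.5-turbo", "gemini-flash"),
}

def recommended_model(task_type):
    """Return (primary, alternative) for a task type, defaulting to GPT-3.5 Turbo."""
    return TASK_MODELS.get(task_type, ("gpt-3.5-turbo", None))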

By Budget

| Budget Level | Recommended Models | Estimated Daily Cost |
| --- | --- | --- |
| Economical | GPT-3.5 Turbo, Gemini Flash | $1-5 |
| Balanced | GPT-4 Turbo, Claude Sonnet 4 | $10-30 |
| Premium | Claude Opus, Gemini 2.5 Pro | $50-100 |
| Custom | Model-specific, on demand | Contact sales |

By Response Speed

| Speed Requirement | Recommended Models | Average Response Time |
| --- | --- | --- |
| Real-time | Gemini Flash, GPT-3.5 Turbo | 0.5-2s |
| Fast | GPT-4 Turbo, Claude Sonnet | 2-5s |
| Standard | Claude Opus, Gemini 2.5 Pro | 5-10s |
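
If the choice is driven by a latency target, the speed tiers above can be expressed as a small helper; a minimal sketch (tier boundaries follow the table, model IDs are assumptions):
# Speed tiers from the table above: (max typical response time in seconds, candidate models)
SPEED_TIERS = [
    (2, ["gemini-flash", "gpt-3.5-turbo"]),    # real-time
    (5, ["gpt-4-turbo", "claude-sonnet-4"]),   # fast
    (10, ["claude-3-opus", "gemini-2.5-pro"]), # standard
]

def models_for_latency(max_seconds):
    """Return candidate models whose typical response time fits the latency budget."""
    for limit, candidates in SPEED_TIERS:
        if max_seconds <= limit:
            return candidates
    return SPEED_TIERS[-1][1]  # no tier fits; fall back to the slowest tier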

Model Unavailability Handling

Temporary Unavailability

If a model is temporarily unavailable, switch to a fallback model:

def call_with_fallback(primary_model, fallback_model, prompt):
    """Call with fallback model"""
    try:
        # Try primary model
        response = client.chat.completions.create(
            model=primary_model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response
    except Exception as e:
        if "model_unavailable" in str(e):
            # Switch to fallback model
            print(f"{primary_model} unavailable, switching to {fallback_model}")
            response = client.chat.completions.create(
                model=fallback_model,
                messages=[{"role": "user", "content": prompt}]
            )
            return response
        else:
            raise e

# Usage example
result = call_with_fallback(
    primary_model="gpt-4-turbo",
    fallback_model="gpt-3.5-turbo",
    prompt="Hello, world!"
)

Auto-retry Mechanism

Implement a smart retry mechanism with the tenacity library:
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

class ModelUnavailableError(Exception):
    pass

@retry(
    retry=retry_if_exception_type(ModelUnavailableError),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10)
)
def call_api_with_retry(model, prompt):
    """API call with retry"""
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response
    except Exception as e:
        if "model_unavailable" in str(e):
            raise ModelUnavailableError(f"{model} unavailable")
        raise e
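
Usage example (the prompt is only illustrative):
# Retries up to 3 times with exponential backoff when the model is unavailable
result = call_api_with_retry("gpt-4-turbo", "Explain exponential backoff in one sentence.")
print(result.choices[0].message.content)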

Model Fallback Chain

Implement a multi-level fallback chain:
MODEL_FALLBACK_CHAIN = {
    "gpt-4-turbo": ["gpt-4", "claude-sonnet-4", "gpt-3.5-turbo"],
    "claude-opus": ["claude-sonnet-4", "gpt-4-turbo", "claude-haiku"],
    "gemini-2.5-pro": ["gemini-1.5-pro", "gpt-4-turbo", "claude-sonnet-4"]
}

def call_with_fallback_chain(model, prompt):
    """Try call with fallback chain"""
    models_to_try = [model] + MODEL_FALLBACK_CHAIN.get(model, [])
    
    for current_model in models_to_try:
        try:
            response = client.chat.completions.create(
                model=current_model,
                messages=[{"role": "user", "content": prompt}]
            )
            if current_model != model:
                print(f"Using fallback model: {current_model}")
            return response
        except Exception as e:
            if "model_unavailable" in str(e):
                continue
            raise e
    
    raise Exception("All models unavailable")
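
Usage example:
# Tries gpt-4-turbo first, then each model in its fallback chain
result = call_with_fallback_chain("gpt-4-turbo", "Hello, world!")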

Common Issues

Model not found

Possible Causes:
  1. Incorrect model name
  2. Model not available in your region
  3. Model has been deprecated

Solutions:
import requests

# Check available models
response = requests.get(
    "https://api.laozhang.ai/v1/models",
    headers={"Authorization": "Bearer your_api_key"}
)
available_models = [m['id'] for m in response.json()['data']]
print("Available models:", available_models)

# Verify model name
model = "gpt-4-turbo"  # Ensure name is correct
if model not in available_models:
    print(f"{model} unavailable, please choose another model")
Model temporarily unavailable

Possible Causes:
  1. Model under maintenance
  2. High load on the model
  3. Network issues

Solutions:
  1. Retry: Wait a few minutes and retry
  2. Switch Model: Use an alternative model
  3. Check Status: Check the Status Page for maintenance announcements
import time

def call_with_retry(model, prompt, max_retries=3):
    for i in range(max_retries):
        try:
            return client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}]
            )
        except Exception as e:
            if i < max_retries - 1:
                wait_time = 2 ** i  # Exponential backoff
                print(f"Retry {i+1}/{max_retries}, waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise e
Why can't I access certain models?

Possible Reasons:
  1. Region Restrictions
    • Some models are restricted by region
    • This complies with local regulations
    • You can apply for special access
  2. Subscription Plan
    • Some models require a higher-tier plan
    • Upgrade your plan to gain access
    • Or purchase dedicated model access
  3. Capacity Limitations
    • New models may have capacity limits
    • Access is gradually opened to more users
    • You can join the waitlist
  4. Maintenance Updates
    • Models are periodically maintained and updated
    • Maintenance is typically scheduled during off-peak hours
    • Advance notice is posted on the Status Page
How can I get early access to new models?

Application Methods:
  1. Join the Beta Program
  2. Enterprise Users
    • Contact the sales team: [email protected]
    • Early access privileges are available
    • Dedicated support and services
  3. Developer Community
What happens when a model is deprecated?

Deprecation Policy:
  • Advance Notice: At least 3 months' notice before deprecation
  • Migration Guide: A detailed migration guide is provided
  • Compatibility Period: A compatibility period is maintained
  • Support: Free migration support is offered
How to Get Notifications:

Model Performance Monitoring

Real-time Monitoring Dashboard

Access the Model Status Dashboard to view:
  • Availability: Model online rate
  • Response Time: Average response time
  • Success Rate: Request success rate
  • Queue Status: Current queue length

Self-built Monitoring

Implement your own monitoring system:
import time

def monitor_model_status(models, interval=60):
    """Monitor model status"""
    while True:
        for model in models:
            try:
                start_time = time.time()
                response = client.chat.completions.create(
                    model=model,
                    messages=[{"role": "user", "content": "test"}],
                    max_tokens=5
                )
                response_time = time.time() - start_time
                
                print(f"{model}: ✅ Available, Response time: {response_time:.2f}s")
            except Exception as e:
                print(f"{model}: ❌ Unavailable, Error: {str(e)}")
        
        time.sleep(interval)

# Monitor key models
models_to_monitor = ["gpt-4-turbo", "claude-sonnet-4", "gemini-2.5-pro"]
monitor_model_status(models_to_monitor)
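
To compute the same metrics the dashboard reports (availability, average response time) from your own probes, the results can be aggregated rather than only printed; a minimal sketch under the same assumptions:
def probe_model(model, attempts=5):
    """Probe a model several times and return its availability and average latency."""
    latencies, failures = [], 0
    for _ in range(attempts):
        start = time.time()
        try:
            client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": "test"}],
                max_tokens=5
            )
            latencies.append(time.time() - start)
        except Exception:
            failures += 1
    availability = (attempts - failures) / attempts
    avg_latency = sum(latencies) / len(latencies) if latencies else None
    return {"model": model, "availability": availability, "avg_latency": avg_latency}

print(probe_model("gpt-4-turbo"))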

Best Practices

1. Smart Model Selection

Dynamically select models based on task:
def select_model(task_type, complexity, budget):
    """Smart model selection"""
    if budget == "low":
        return "gpt-3.5-turbo"
    
    if complexity == "high":
        if task_type == "code":
            return "gpt-4-turbo"
        elif task_type == "long_text":
            return "claude-sonnet-4"
        else:
            return "gpt-4-turbo"
    else:
        return "gpt-3.5-turbo"

2. Load Balancing

Distribute requests across multiple models:
import random

AVAILABLE_MODELS = [
    "gpt-4-turbo",
    "claude-sonnet-4",
    "gemini-2.5-pro"
]

def load_balanced_call(prompt):
    """Load balanced call"""
    model = random.choice(AVAILABLE_MODELS)
    try:
        return client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}]
        )
    except Exception as e:
        # Try other models if one fails
        for backup_model in AVAILABLE_MODELS:
            if backup_model != model:
                try:
                    return client.chat.completions.create(
                        model=backup_model,
                        messages=[{"role": "user", "content": prompt}]
                    )
                except Exception:
                    continue
        raise e

3. Cache Strategy

Cache responses for repeated prompts to reduce duplicate API calls:
from functools import lru_cache

@lru_cache(maxsize=1000)
def get_completion(model, prompt):
    """Cached completion: identical (model, prompt) pairs call the API only once."""
    # Strings are hashable, so the prompt itself can serve as the cache key
    return client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )
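
Repeated identical prompts are then served from the cache:
# The first call hits the API; the second identical call returns the cached response
first = get_completion("gpt-3.5-turbo", "What is an API?")
second = get_completion("gpt-3.5-turbo", "What is an API?")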