Overview
The Moderation API detects harmful or inappropriate content in text, helping you with:
- Content Filtering: Automatically filter inappropriate user submissions
- Safety Review: Detect potential violations before publishing
- Compliance Check: Ensure content meets platform guidelines
- Risk Warning: Identify potentially harmful content types
Moderation API uses OpenAI’s moderation model and is free to use without consuming token quota.
Quick Start
Basic Example
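A minimal sketch of a single-text moderation call, assuming an OpenAI-compatible `/v1/moderations` endpoint; `BASE_URL` and `API_KEY` are placeholders to substitute with your own values:

```python
import json
import urllib.request

BASE_URL = "https://api.openai.com/v1"  # or your provider's base URL
API_KEY = "YOUR_API_KEY"                # placeholder

def moderate(text: str) -> dict:
    """Send one text to the moderation endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        f"{BASE_URL}/moderations",
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def is_flagged(response: dict) -> bool:
    """True if the first (and only) result was flagged by the model."""
    return bool(response["results"][0]["flagged"])
```

The response's `results` array contains one entry per input, each with a boolean `flagged`, per-category booleans in `categories`, and numeric `category_scores`.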
Batch Moderation
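The endpoint also accepts a list of strings; results come back one per input, in the same order. A sketch (the request helper is repeated from the basic example so this snippet stands alone; `BASE_URL` and `API_KEY` remain placeholders):

```python
import json
import urllib.request

BASE_URL = "https://api.openai.com/v1"  # placeholder
API_KEY = "YOUR_API_KEY"                # placeholder

def moderate_batch(texts: list[str]) -> dict:
    """Moderate several texts in one request; results are index-aligned."""
    req = urllib.request.Request(
        f"{BASE_URL}/moderations",
        data=json.dumps({"input": texts}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def flagged_indices(response: dict) -> list[int]:
    """Indices of the inputs that were flagged."""
    return [i for i, r in enumerate(response["results"]) if r["flagged"]]
```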
Detection Categories
| Category | Description |
|---|---|
| hate | Hate speech targeting specific groups |
| hate/threatening | Threatening hate speech |
| harassment | Harassing content |
| harassment/threatening | Threatening harassment |
| self-harm | Self-harm related content |
| self-harm/intent | Intent to self-harm |
| self-harm/instructions | Self-harm instructions |
| sexual | Sexual content |
| sexual/minors | Sexual content involving minors |
| violence | Violent content |
| violence/graphic | Graphic violence |
Practical Examples
1. User Input Filter
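A sketch of a submission filter that turns one moderation result into an allow/reject decision plus the violated categories. The function names here are illustrative, not part of the API:

```python
def check_submission(moderation_result: dict) -> tuple[bool, list[str]]:
    """Return (allowed, violated_categories) for one moderation result.

    `moderation_result` is one entry of the API's `results` array.
    """
    if not moderation_result["flagged"]:
        return True, []
    violations = [
        cat for cat, hit in moderation_result["categories"].items() if hit
    ]
    return False, violations
```

The returned category list can be logged or shown to the user as the rejection reason.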
2. Chatbot Safety Layer
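A safety layer typically moderates both the inbound user message and the outbound model reply. In this sketch the moderation call and the chat call are injected as plain callables (`is_flagged`, `generate`) so the layer itself stays testable; the refusal message is an assumption:

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."

def safe_reply(user_msg: str,
               is_flagged: Callable[[str], bool],
               generate: Callable[[str], str]) -> str:
    """Moderate both the user message and the model's reply.

    `is_flagged` wraps the Moderation API (True = content flagged);
    `generate` wraps your chat completion call.
    """
    if is_flagged(user_msg):   # inbound check before spending chat tokens
        return REFUSAL
    reply = generate(user_msg)
    if is_flagged(reply):      # outbound check before showing the reply
        return REFUSAL
    return reply
```

Checking the inbound message first avoids a chat completion call (and its token cost) for content that would be rejected anyway.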
3. Custom Threshold
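Besides the boolean `flagged`, each result includes `category_scores` (values from 0 to 1), so you can apply stricter or looser thresholds per category instead of relying on the model's default decision. The threshold values below are illustrative only:

```python
# Illustrative thresholds; tune per category for your platform's policy.
THRESHOLDS = {
    "sexual/minors": 0.1,   # near zero tolerance
    "violence": 0.7,        # more permissive
    "harassment": 0.5,
}
DEFAULT_THRESHOLD = 0.5

def custom_violations(category_scores: dict) -> dict:
    """Categories whose score meets or exceeds its threshold."""
    return {
        cat: score
        for cat, score in category_scores.items()
        if score >= THRESHOLDS.get(cat, DEFAULT_THRESHOLD)
    }
```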
Best Practices
Multi-layer Protection
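A common multi-layer pattern runs a cheap local check before calling the API and routes borderline hits to human review. A hedged sketch; the blocklist mechanism and the 0.9 score cutoff are assumptions, not part of the API:

```python
from typing import Callable

def review_content(text: str,
                   blocklist: set[str],
                   moderate: Callable[[str], dict]) -> str:
    """Three layers: local blocklist -> Moderation API -> human review.

    Returns "reject", "human_review", or "allow". `moderate` wraps the
    API and returns one result dict with "flagged" and "category_scores".
    """
    # Layer 1: cheap local blocklist, no API call needed
    lowered = text.lower()
    if any(word in lowered for word in blocklist):
        return "reject"

    # Layer 2: Moderation API
    result = moderate(text)
    if result["flagged"]:
        # Layer 3: high-confidence hits are rejected outright; borderline
        # hits go to a human review queue (the 0.9 cutoff is illustrative)
        top = max(result["category_scores"].values())
        return "reject" if top >= 0.9 else "human_review"
    return "allow"
```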
Pricing
Moderation API is currently free to use and does not count towards token consumption.
FAQ
How accurate is moderation?
It is based on OpenAI's moderation model and is highly accurate, but not 100% reliable. Recommendations:
- Combine with human review for critical scenarios
- Set appropriate thresholds
- Provide appeal channels
Does it support multiple languages?
Yes, it supports multiple languages, including Chinese, though accuracy is highest for English.
Are there rate limits?
Yes, there are rate limits. Control your request frequency when batch processing.
Related Documentation
- Text Generation: Chat API documentation
- Content Safety: Content safety policy
- API Reference: API reference
- Data Security: Data privacy protection