Evaluation API

Score and analyze content across 8 RAIL dimensions with our comprehensive evaluation endpoints.

Overview

The Evaluation API provides endpoints for analyzing content against responsible AI standards. Submit your content and receive detailed scores across the 8 RAIL dimensions, each with a confidence rating. All responses include metadata with request tracking and processing times.

Response Type: Synchronous (direct)
Processing Time: 2-10s by tier
Authentication: Bearer API key

How it works: Submit content with your API key → Request processed immediately with tier-based priority → Receive comprehensive RAIL scores with metadata
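The flow above can be sketched in Python. The base URL and the payload field name below are illustrative assumptions, not a documented schema; only the Bearer-token header follows directly from this page:

```python
import json

# Hypothetical base URL -- substitute the real one from your account.
BASE_URL = "https://api.example.com/v1"

def build_score_request(api_key: str, content: str):
    """Assemble the URL, headers, and JSON body for a basic score call.
    The "content" field name is an assumption about the request schema."""
    url = f"{BASE_URL}/score"
    headers = {
        "Authorization": f"Bearer {api_key}",  # Bearer API key, per this page
        "Content-Type": "application/json",
    }
    body = json.dumps({"content": content})
    return url, headers, body

url, headers, body = build_score_request("sk-demo", "Hello, world")
# Send with any HTTP client, e.g. urllib.request or requests.
```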

Basic Endpoints

Available for all plan tiers. Perfect for getting started with RAIL evaluation.

Advanced Endpoints

Available for Pro, Business, and Enterprise plans. Advanced evaluation with ensemble models, batch processing, and specialized analysis.

The 8 RAIL Dimensions

All evaluation endpoints analyze content across these dimensions:

⚖️ Fairness: Unbiased treatment across demographics

🛡️ Safety: Prevention of harmful content

Reliability: Consistent, accurate results

🔍 Transparency: Clear decision explanations

🔒 Privacy: Protection of sensitive data

📋 Accountability: Auditable responsibility tracking

🌍 Inclusivity: Respect for diverse perspectives

👥 User Impact: Consideration of end-user effects

Common Use Cases

🛡️ Content Safety Checks

Quickly evaluate user-generated content for safety issues before publishing. Perfect for comment sections, forums, and social platforms.

/score/dimension: Check safety only
→ Fast, focused evaluation
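A pre-publish gate on the safety score might look like the sketch below. The response shape (top-level `scores` and `confidence` keys) is an assumption based on this page's description, not a documented schema:

```python
# Hypothetical /score/dimension response; the "scores" and "confidence"
# keys are assumptions about the response schema.
response = {"scores": {"safety": 0.92}, "confidence": 0.84}

def is_publishable(resp: dict, threshold: float = 0.80) -> bool:
    """Gate user-generated content on its safety score."""
    return resp["scores"]["safety"] >= threshold

print(is_publishable(response))  # True for this sample response
```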

⚖️ Fairness Analysis

Analyze training data or AI outputs for bias across demographics. Ensure your AI systems treat all users fairly.

/score/custom: Focus on fairness
→ Evaluate fairness, inclusivity, user_impact
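A minimal body for this focused evaluation might be built as follows. The dimension keys come from the line above; the `dimensions` field name is an assumption about the request schema:

```python
import json

# Dimension keys taken from this page; "dimensions" as the field name
# is an assumption.
FAIRNESS_SUBSET = ["fairness", "inclusivity", "user_impact"]

def build_custom_body(content: str, dimensions: list) -> str:
    """JSON body for /score/custom, evaluating only the listed dimensions."""
    return json.dumps({"content": content, "dimensions": dimensions})

body = build_custom_body("model output text", FAIRNESS_SUBSET)
```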

🏥 Healthcare AI Compliance

PRO+

Ensure healthcare AI systems meet strict requirements for privacy, safety, and reliability. Prioritize critical dimensions with custom weights.

/score/weighted: Prioritize safety & privacy
→ Weights: safety 30%, privacy 25%, reliability 20%
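The healthcare weighting above could be expressed as the sketch below. The three headline weights come from this page; spreading the remaining 25% evenly across the other five dimensions is an illustrative choice, and the idea that weights must sum to 1.0 is an assumption:

```python
# Weights for /score/weighted. safety/privacy/reliability values come from
# the example above; the even 5% split of the remainder is illustrative.
HEALTHCARE_WEIGHTS = {
    "safety": 0.30,
    "privacy": 0.25,
    "reliability": 0.20,
    "fairness": 0.05,
    "transparency": 0.05,
    "accountability": 0.05,
    "inclusivity": 0.05,
    "user_impact": 0.05,
}

def validate_weights(weights: dict) -> None:
    """Reject weight sets that do not sum to 1.0 (assumed requirement)."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"weights must sum to 1.0, got {total}")

validate_weights(HEALTHCARE_WEIGHTS)  # passes: weights sum to 1.0
```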

📊 Bulk Content Screening

PRO+

Evaluate large volumes of AI-generated content efficiently. Process up to 100 items per request with batch processing.

/score/batch: Up to 100 items
→ Ideal for content pipelines and automated workflows
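For pipelines with more than 100 items, the input can be split to respect the per-request limit. The limit comes from this page; the chunking helper itself is just a sketch:

```python
def chunk(items: list, size: int = 100) -> list:
    """Split a large item list into batches of at most `size` items,
    matching the 100-item-per-request limit of /score/batch."""
    return [items[i:i + size] for i in range(0, len(items), size)]

batches = chunk([f"doc-{i}" for i in range(250)])
# 250 items -> 3 batches of 100, 100, and 50 items
```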

🤖 RAG System Quality

PRO+

Validate RAG (Retrieval-Augmented Generation) responses for hallucinations and accuracy. Ensure responses are grounded in provided context.

/rag/evaluate: Check grounding & quality
→ Hallucination detection + RAIL scoring
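A grounding check generally needs the query, the retrieved context, and the generated answer, so a request body might be sketched like this. All three field names are assumptions, not the documented schema:

```python
import json

def build_rag_body(query: str, context: str, response: str) -> str:
    """JSON body sketch for /rag/evaluate. Field names are assumptions."""
    return json.dumps({
        "query": query,        # the user's question
        "context": context,    # retrieved passages the answer should be grounded in
        "response": response,  # generated answer to check for hallucinations
    })

body = build_rag_body(
    "What is the refund window?",
    "Refunds are accepted within 30 days of purchase.",
    "You can request a refund within 30 days.",
)
```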

🎯 High-Stakes Applications

PRO+

For critical applications requiring maximum accuracy, use advanced ensemble evaluation with higher confidence scores.

/advanced: Ensemble models
→ Confidence typically 0.90+ vs 0.80-0.85 for basic

Ready to Evaluate Content?

Start analyzing content for responsible AI with our comprehensive evaluation endpoints.