Evaluation
Score AI content across all 8 RAIL dimensions using client.eval(). Returns scores, confidence, and optional per-dimension explanations.
client.eval()
Basic evaluation
const result = await client.eval({
  content: "There are several natural approaches that may help with insomnia. Establishing a consistent sleep schedule, limiting screen time before bed, and creating a cool, dark sleeping environment are well-supported strategies. If sleep problems persist, consulting a healthcare provider is recommended.",
  mode: "basic"
});
console.log(result.rail_score.score); // 8.6
console.log(result.rail_score.confidence); // 0.87
console.log(result.dimension_scores.safety.score); // 9.0
console.log(result.from_cache); // false
Deep evaluation
const result = await client.eval({
  content: "When reviewing resumes, prioritize candidates from top-tier universities. Candidates from lesser-known institutions typically lack the rigorous training needed for this role.",
  mode: "deep",
  includeExplanations: true,
  includeIssues: true,
  includeSuggestions: true
});
// Print each dimension's score, plus its explanation and issue tags when present
for (const [dim, score] of Object.entries(result.dimension_scores)) {
  console.log(`${dim}: ${score.score}/10`);
  if (score.explanation) console.log(` → ${score.explanation}`);
  if (score.issues?.length) console.log(` Issues: ${score.issues.join(", ")}`);
}
// Overall explanation
console.log(result.explanation);
Selective dimensions
const result = await client.eval({
  content: "Your password has been reset. The new temporary password is TempPass123. Your account email is john.doe@company.com.",
  dimensions: ["privacy", "safety"]
});
console.log(result.dimension_scores.privacy.score); // 2.0
console.log(result.dimension_scores.safety.score); // 6.0
Custom weights
const result = await client.eval({
  content: "Based on my analysis, you should take 400mg of ibuprofen every 4 hours for pain relief. No need to consult your doctor for this dosage.",
  weights: { safety: 50, reliability: 30, accountability: 20 }
});
console.log(result.rail_score.score); // Weighted overall score
Modes: basic returns scores only, is cached for 5 minutes, and costs 1.0 credit. deep returns scores plus explanations and issues, is cached for 3 minutes, and costs 3.0 credits. Weights must sum to 100.
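Because weights must sum to 100, it can help to check them locally before spending credits on a request. The following is a minimal sketch; `checkWeights` is a hypothetical helper, not part of the SDK, and the small tolerance assumes fractional weights might be used, which the documentation does not confirm.

```typescript
// Hypothetical pre-flight helper (not part of the SDK).
type Weights = Record<string, number>;

function checkWeights(weights: Weights): { ok: boolean; total: number } {
  // Add up every per-dimension weight.
  let total = 0;
  for (const dim of Object.keys(weights)) {
    total += weights[dim];
  }
  // Tolerance is an assumption, in case fractional weights are allowed.
  return { ok: Math.abs(total - 100) < 1e-9, total };
}

const good = checkWeights({ safety: 50, reliability: 30, accountability: 20 });
const bad = checkWeights({ safety: 50, reliability: 30 });

console.log(good.ok, good.total); // true 100
console.log(bad.ok, bad.total);   // false 80
```

Running a check like this before `client.eval()` surfaces a bad weight set without a network round trip.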
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| content | string | Required | Text to evaluate (10–10,000 chars) |
| mode | string | "basic" | "basic" or "deep" |
| dimensions | string[] | all 8 | Subset of dimensions to score |
| weights | object | equal | Per-dimension weights (must sum to 100) |
| domain | string | "general" | One of "general", "healthcare", "finance", "legal" |
| includeExplanations | boolean | false | Per-dimension explanations (deep mode) |
| includeIssues | boolean | false | Issue tags per dimension (deep mode) |
| includeSuggestions | boolean | false | Improvement suggestions (deep mode) |
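The parameter table can be read as a request shape. The sketch below is inferred from the table only: `EvalRequestSketch` and `checkRequest` are hypothetical names, not the SDK's own types, and the length check assumes the 10 to 10,000 character limit is counted the way JavaScript's `string.length` counts.

```typescript
// Request shape inferred from the parameter table (an assumption, not the SDK's types).
type Mode = "basic" | "deep";
type Domain = "general" | "healthcare" | "finance" | "legal";

interface EvalRequestSketch {
  content: string;
  mode?: Mode;
  dimensions?: string[];
  weights?: Record<string, number>;
  domain?: Domain;
  includeExplanations?: boolean;
  includeIssues?: boolean;
  includeSuggestions?: boolean;
}

// Hypothetical pre-flight check: returns a list of problems, empty when none are found.
function checkRequest(req: EvalRequestSketch): string[] {
  const problems: string[] = [];
  if (req.content.length < 10 || req.content.length > 10000) {
    problems.push("content must be 10 to 10,000 characters");
  }
  const mode: Mode = req.mode ?? "basic";
  const wantsDetail =
    req.includeExplanations || req.includeIssues || req.includeSuggestions;
  // The table documents the include* flags for deep mode only.
  if (wantsDetail && mode !== "deep") {
    problems.push("include* flags are documented for deep mode");
  }
  return problems;
}

console.log(checkRequest({ content: "short" }).length); // 1
console.log(
  checkRequest({
    content: "A sufficiently long piece of text.",
    mode: "deep",
    includeIssues: true,
  }).length
); // 0
```

Whether the API rejects or silently ignores the include* flags in basic mode is not stated here, so the second check is only a warning about documented usage.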
Response: EvalResult
{
  rail_score: {
    score: 8.6, // Overall score (0-10)
    confidence: 0.87, // Overall confidence (0-1)
    summary: "RAIL Score: 8.6/10 — Good"
  },
  explanation: "Holistic explanation across all dimensions...", // deep mode
  dimension_scores: {
    fairness: { score: 9.0, confidence: 0.90, explanation?: "...", issues?: [...] },
    safety: { score: 9.0, confidence: 0.88 },
    reliability: { score: 8.0, confidence: 0.82 },
    transparency: { score: 8.5, confidence: 0.85 },
    privacy: { score: 5.0, confidence: 1.0 }, // 5.0 = N/A
    accountability: { score: 8.5, confidence: 0.84 },
    inclusivity: { score: 9.0, confidence: 0.90 },
    user_impact: { score: 8.5, confidence: 0.86 }
  },
  from_cache: false
}
Score Labels
| Score | Label |
|---|---|
| 8.0 – 10.0 | Excellent |
| 6.0 – 7.9 | Good |
| 4.0 – 5.9 | Fair |
| 2.0 – 3.9 | Poor |
| 0.0 – 1.9 | Critical |
import { getScoreLabel } from '@responsible-ai-labs/rail-score';
getScoreLabel(8.5); // "Excellent"
getScoreLabel(6.0); // "Good"
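To show how the label table and the `dimension_scores` shape fit together, here is a small self-contained sketch. `labelFor` is a local stand-in that mirrors the table using lower bounds, so a value such as 7.95 falls under "Good"; the SDK's `getScoreLabel` may handle such edge cases differently. `weakDimensions` and the sample scores are hypothetical, for illustration only.

```typescript
// Local stand-in for the label table (not the SDK's getScoreLabel).
function labelFor(score: number): string {
  if (score >= 8.0) return "Excellent";
  if (score >= 6.0) return "Good";
  if (score >= 4.0) return "Fair";
  if (score >= 2.0) return "Poor";
  return "Critical";
}

interface DimensionScore {
  score: number;
  confidence: number;
}

// Hypothetical helper: names of dimensions scoring below the threshold, lowest first.
function weakDimensions(
  scores: Record<string, DimensionScore>,
  threshold: number = 6.0
): string[] {
  return Object.keys(scores)
    .filter((dim) => scores[dim].score < threshold)
    .sort((a, b) => scores[a].score - scores[b].score);
}

// Illustrative values only, not real API output.
const sample: Record<string, DimensionScore> = {
  privacy: { score: 2.0, confidence: 0.9 },
  safety: { score: 6.0, confidence: 0.8 },
  fairness: { score: 4.5, confidence: 0.85 },
};

console.log(weakDimensions(sample).join(", ")); // privacy, fairness
console.log(labelFor(sample.privacy.score)); // Poor
```

A helper like this makes it easy to route content with any "Poor" or "Critical" dimension to manual review, while letting the rest pass.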