JavaScript / TypeScript SDK
@responsible-ai-labs/rail-score v2.2.1 on npm
Official JavaScript/TypeScript client for the RAIL Score API. Evaluate AI outputs across 8 responsibility dimensions with full TypeScript support, LLM provider wrappers, policy engine, session tracking, and middleware.
Quick Start — 30 seconds
npm install @responsible-ai-labs/rail-score
import { RailScore, getScoreLabel } from '@responsible-ai-labs/rail-score';
const client = new RailScore({ apiKey: "YOUR_RAIL_API_KEY" });
const result = await client.eval({ content: "Your AI-generated text here...", mode: "basic" });
console.log(result.rail_score.score); // 8.4
console.log(getScoreLabel(8.4)); // "Excellent"
Installation
npm
npm install @responsible-ai-labs/rail-score
yarn
yarn add @responsible-ai-labs/rail-score
With LLM provider wrappers (optional peer deps)
npm install openai # For RAILOpenAI wrapper
npm install @anthropic-ai/sdk # For RAILAnthropic wrapper
npm install @google/generative-ai # For RAILGemini wrapper
npm install langfuse # For RAILLangfuse observability
Requires: Node.js ≥ 16.0.0
Modules: CommonJS + ESM (dual build)
Types: Full TypeScript definitions included
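Because the package ships a dual CommonJS/ESM build, it can also be loaded with require() from a CommonJS module. A minimal sketch:
// CommonJS consumers can require() the same entry point
const { RailScore, getScoreLabel } = require('@responsible-ai-labs/rail-score');
const client = new RailScore({ apiKey: process.env.RAIL_API_KEY });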
Client Initialization
import { RailScore } from '@responsible-ai-labs/rail-score';
const client = new RailScore({
apiKey: 'YOUR_RAIL_API_KEY', // Required
baseUrl: 'https://api.responsibleailabs.ai', // Optional (default)
timeout: 30000 // Optional (ms, default 30s)
});
// Health check — no auth required
const health = await client.health();
console.log(health.status); // "healthy"
console.log(health.service); // "rail-score-engine"
Evaluation
Use client.eval() to score content across all 8 RAIL dimensions. Supports basic and deep modes, selective dimensions, and custom weights.
Basic Evaluation
Score content across all 8 dimensions:
const result = await client.eval({
content: "There are several natural approaches that may help with insomnia. Establishing a consistent sleep schedule, limiting screen time before bed, and creating a cool, dark sleeping environment are well-supported strategies. If sleep problems persist for more than a few weeks, consulting a healthcare provider is recommended.",
mode: "basic"
});
console.log(result.rail_score.score); // 8.6
console.log(result.rail_score.confidence); // 0.87
console.log(result.dimension_scores.safety.score); // 9.0
console.log(result.from_cache); // false
Deep Evaluation
Per-dimension explanations, issues, and suggestions:
const result = await client.eval({
content: "When reviewing resumes, prioritize candidates from top-tier universities. Candidates from lesser-known institutions typically lack the rigorous training needed for this role.",
mode: "deep",
domain: "general",
includeExplanations: true,
includeIssues: true,
includeSuggestions: true
});
// Access per-dimension explanations
for (const [dim, score] of Object.entries(result.dimension_scores)) {
console.log(`${dim}: ${score.score}/10`);
if (score.explanation) console.log(` → ${score.explanation}`);
if (score.issues?.length) console.log(` Issues: ${score.issues.join(", ")}`);
}
// Overall explanation
console.log(result.explanation);
// Improvement suggestions
if (result.improvement_suggestions) {
result.improvement_suggestions.forEach(s => console.log(` 💡 ${s}`));
}
Selective Dimensions
Evaluate only specific dimensions:
const result = await client.eval({
content: "Your password has been reset. The new temporary password is TempPass123. Your account email is john.doe@company.com and your employee ID is EMP-4521.",
dimensions: ["privacy", "safety"]
});
console.log(result.dimension_scores.privacy.score); // 2.0
console.log(result.dimension_scores.safety.score); // 6.0
Custom Weights
Weight dimensions differently — weights must sum to 100:
const result = await client.eval({
content: "Based on my analysis, you should take 400mg of ibuprofen every 4 hours for pain relief. No need to consult your doctor for this dosage.",
dimensions: ["safety", "reliability", "accountability"],
weights: { safety: 50, reliability: 30, accountability: 20 }
});
console.log(result.rail_score.score); // Weighted overall score
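If weights come from user configuration rather than constants, you can guard against bad sums before spending credits. A minimal sketch using the exported validateWeights helper (see Utility Functions below):
import { validateWeights } from '@responsible-ai-labs/rail-score';
const weights = { safety: 50, reliability: 30, accountability: 20 };
// validateWeights returns true only when the weights sum to 100
if (!validateWeights(weights)) {
  throw new Error('Dimension weights must sum to 100');
}
const result = await client.eval({
  content: "Content to evaluate...",
  dimensions: ["safety", "reliability", "accountability"],
  weights
});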
Safe Regeneration
Evaluate content and iteratively regenerate until quality thresholds are met. The API evaluates, generates a RAIL-guided prompt, regenerates, and re-evaluates — up to the configured maximum iterations.
Server-Side Regeneration
The API handles regeneration internally (default):
const result = await client.safeRegenerate({
content: "When reviewing resumes, prioritize candidates from top-tier universities. Candidates from lesser-known institutions typically lack the rigorous training needed.",
mode: "basic",
maxRegenerations: 3,
thresholds: {
overall: { score: 8.0, confidence: 0.5 }
},
domain: "general"
});
console.log(result.status); // "passed" | "max_iterations_reached"
console.log(result.best_content); // Improved content
console.log(result.best_iteration); // Which iteration was best
console.log(result.best_scores.rail_score.score); // Best score achieved
console.log(result.credits_consumed); // Total credits used
// Iteration history
result.iteration_history?.forEach(iter => {
console.log(`Iteration ${iter.iteration}: ${iter.scores.rail_score.score} (thresholds met: ${iter.thresholds_met})`);
});
// Credits breakdown
if (result.credits_breakdown) {
console.log(`Evaluations: ${result.credits_breakdown.evaluations}`);
console.log(`Regenerations: ${result.credits_breakdown.regenerations}`);
}
Client-Side Regeneration
Use your own LLM to regenerate content. The API returns a RAIL-guided prompt and a session ID — you regenerate with your model, then submit the result:
// Step 1: Start session — API evaluates and returns a guided prompt
const initial = await client.safeRegenerate({
content: "Content that needs improvement...",
maxRegenerations: 3,
thresholds: { overall: { score: 8.0 } }
});
// When status is "awaiting_regeneration", use the rail_prompt with your own LLM
if (initial.status === "awaiting_regeneration" && initial.rail_prompt) {
const { system_prompt, user_prompt } = initial.rail_prompt;
// Regenerate with your model (e.g., OpenAI)
const completion = await openai.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: system_prompt },
{ role: "user", content: user_prompt }
]
});
// Step 2: Submit regenerated content for re-evaluation
const continued = await client.safeRegenerateContinue({
sessionId: initial.session_id,
regeneratedContent: completion.choices[0].message.content
});
console.log(continued.status); // "passed" or "awaiting_regeneration"
console.log(continued.best_content); // Best content so far
}
Note: Sessions expire after 15 minutes. If the session expires, a SessionExpiredError is thrown.
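If the 15-minute window lapses between steps, one recovery path is to restart the loop with a fresh session. A sketch reusing client and initial from the example above, where regenerated is a placeholder for your model's latest output:
import { SessionExpiredError } from '@responsible-ai-labs/rail-score';
try {
  await client.safeRegenerateContinue({
    sessionId: initial.session_id,
    regeneratedContent: regenerated
  });
} catch (error) {
  if (error instanceof SessionExpiredError) {
    // The old session is gone; start over with the best content produced so far
    const fresh = await client.safeRegenerate({
      content: regenerated,
      maxRegenerations: 3,
      thresholds: { overall: { score: 8.0 } }
    });
  }
}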
Compliance Check
Check content against regulatory frameworks. Supports: gdpr, ccpa, hipaa, eu_ai_act, india_dpdp, india_ai_gov.
Single Framework
const result = await client.complianceCheck({
content: "Our AI system processes user photos to determine creditworthiness and loan eligibility.",
framework: "gdpr",
strictMode: true,
includeExplanations: true
});
console.log(result.compliance_score.score); // 0-10
console.log(result.compliance_score.label); // "Critical" | "Poor" | "Fair" | "Good" | "Excellent"
console.log(result.requirements_passed); // Number passed
console.log(result.requirements_failed); // Number failed
// Individual requirements
result.requirements.forEach(req => {
console.log(`${req.requirement_id}: ${req.status} (${req.score}/10) — ${req.article}`);
});
// Issues and remediation
result.issues.forEach(issue => {
console.log(`[${issue.severity}] ${issue.description}`);
console.log(` Remediation: ${issue.remediation_effort}`);
});
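The issues array lends itself to triage. A small sketch that buckets issues by severity, using only the documented result shape:
const bySeverity: Record<string, string[]> = {};
for (const issue of result.issues) {
  // Group descriptions under their severity level ("high", "medium", ...)
  (bySeverity[issue.severity] ??= []).push(issue.description);
}
console.log(`${bySeverity.high?.length ?? 0} high-severity issues`);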
Multi-Framework
const result = await client.complianceCheck({
content: "Patient records are processed by our AI diagnostic assistant.",
frameworks: ["gdpr", "hipaa"],
context: {
domain: "healthcare",
data_types: ["health_records", "patient_identifiers"],
cross_border: true
}
});
// Per-framework results
for (const [framework, check] of Object.entries(result.results)) {
console.log(`${framework}: ${check.compliance_score.score}/10 — ${check.compliance_score.label}`);
}
// Cross-framework summary
console.log(`Average: ${result.cross_framework_summary.average_score}`);
console.log(`Weakest: ${result.cross_framework_summary.weakest_framework}`);
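A common pattern is to gate a release on every framework clearing a minimum score. A sketch over the documented per-framework results, with 7.0 as an illustrative threshold:
const failing = Object.entries(result.results)
  .filter(([, check]) => check.compliance_score.score < 7.0)
  .map(([framework]) => framework);
if (failing.length > 0) {
  console.warn(`Below compliance threshold: ${failing.join(", ")}`);
}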
Response Objects
EvalResult
Returned by client.eval():
{
rail_score: {
score: 8.6, // Overall score (0-10)
confidence: 0.87, // Overall confidence (0-1)
summary: "RAIL Score: 8.6/10 — Good"
},
explanation: "Holistic explanation across all dimensions...",
dimension_scores: {
fairness: { score: 9.0, confidence: 0.90, explanation?: "...", issues?: [...] },
safety: { score: 9.0, confidence: 0.88 },
reliability: { score: 8.0, confidence: 0.82 },
transparency: { score: 8.5, confidence: 0.85 },
privacy: { score: 5.0, confidence: 1.0 },
accountability: { score: 8.5, confidence: 0.84 },
inclusivity: { score: 9.0, confidence: 0.90 },
user_impact: { score: 8.5, confidence: 0.86 }
},
issues?: [{ dimension: "privacy", description: "..." }],
improvement_suggestions?: ["..."],
from_cache: false
}
SafeRegenerateResult
Returned by client.safeRegenerate() and client.safeRegenerateContinue():
{
status: "passed", // "passed" | "max_iterations_reached" | "awaiting_regeneration"
original_content: "...",
best_content: "...", // Best version generated
best_iteration: 2,
best_scores: {
rail_score: { score: 8.4, confidence: 0.82, summary: "..." },
dimension_scores: { ... },
thresholds_met: true
},
iteration_history: [
{ iteration: 1, thresholds_met: false, failing_dimensions: ["fairness"] },
{ iteration: 2, thresholds_met: true, failing_dimensions: [] }
],
credits_consumed: 4.0,
credits_breakdown: { evaluations: 2.0, regenerations: 2.0, total: 4.0 },
metadata: { req_id: "...", mode: "basic" }
}
ComplianceResult
Returned by client.complianceCheck() (single framework):
{
framework: "gdpr",
framework_version: "2016/679",
compliance_score: {
score: 4.2,
confidence: 0.85,
label: "Fair",
summary: "..."
},
dimension_scores: { ... }, // Framework-specific dimensions
requirements_checked: 12,
requirements_passed: 7,
requirements_failed: 4,
requirements_warned: 1,
requirements: [{ requirement_id: "...", status: "pass", score: 8.5, article: "Art. 5" }],
issues: [{ id: "...", severity: "high", description: "...", remediation_effort: "medium" }],
improvement_suggestions: ["..."],
from_cache: false
}
Session Tracking
Track RAIL scores across multi-turn conversations:
import { RAILSession } from '@responsible-ai-labs/rail-score';
const session = new RAILSession(client, {
deepEvalFrequency: 5, // Deep eval every 5 turns
contextWindow: 10, // Track last 10 turns
qualityThreshold: 7.0, // Trigger deep eval when score dips below
});
// Add conversation turns
const result = await session.addTurn("AI response content");
console.log(result.rail_score.score);
// Get session metrics
const metrics = session.getMetrics();
console.log(`Average: ${metrics.averageScore}`);
console.log(`Min: ${metrics.minScore}, Max: ${metrics.maxScore}`);
console.log(`Passing rate: ${metrics.passingRate}`);
console.log(`Turns: ${metrics.turnCount}`);
// Dimension averages across session
for (const [dim, avg] of Object.entries(metrics.dimensionAverages)) {
console.log(` ${dim}: ${avg.toFixed(1)}`);
}
// Get full history
const history = session.getHistory();
// Reset for new conversation
session.reset();
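In a live chat loop the per-turn result can drive alerting directly. A sketch with 7.0 as an illustrative floor, where assistantReply is a placeholder for your model's output:
const turn = await session.addTurn(assistantReply);
if (turn.rail_score.score < 7.0) {
  // Flag quality dips as they happen rather than at end of conversation
  console.warn(`Turn ${session.getMetrics().turnCount} dipped to ${turn.rail_score.score}`);
}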
Policy Engine
Enforce content quality policies with four modes: LOG_ONLY, BLOCK, REGENERATE, CUSTOM.
import { PolicyEngine, RAILBlockedError } from '@responsible-ai-labs/rail-score';
const policy = new PolicyEngine(client, {
mode: "BLOCK",
thresholds: { safety: 7.0, privacy: 7.0 },
});
try {
const result = await policy.enforce("Content to check");
console.log(result.evaluation.rail_score.score);
console.log(result.passed); // true/false
console.log(result.failedDimensions); // ["safety"] or []
} catch (error) {
if (error instanceof RAILBlockedError) {
console.log(`Blocked: ${error.message}`);
console.log(`Policy mode: ${error.policyMode}`);
}
}
// Runtime reconfiguration
policy.setMode("LOG_ONLY");
policy.setThresholds({ safety: 8.0 });
// REGENERATE mode — auto-regenerates failing content
const regenPolicy = new PolicyEngine(client, {
mode: "REGENERATE",
thresholds: { safety: 7.0 },
});
const result = await regenPolicy.enforce("Risky content");
console.log(result.regeneratedContent); // Improved content (if regenerated)
// CUSTOM mode — your own enforcement logic
const customPolicy = new PolicyEngine(client, {
mode: "CUSTOM",
thresholds: { safety: 7.0 },
customCallback: async (content, evalResult) => {
// Return modified content or null to pass through
return `[Reviewed] ${content}`;
},
});
Middleware
Wrap any async function with pre/post RAIL evaluation:
import { RAILMiddleware } from '@responsible-ai-labs/rail-score';
const middleware = new RAILMiddleware(client, {
inputThresholds: { safety: 5.0 },
outputThresholds: { safety: 7.0, privacy: 7.0 },
onInputEval: (result) => console.log(`Input score: ${result.rail_score.score}`),
onOutputEval: (result) => console.log(`Output score: ${result.rail_score.score}`),
});
// Wrap your LLM call
const safeLLMCall = middleware.wrap(async (input) => {
return await myLLM.generate(input);
});
const output = await safeLLMCall("User message");
LLM Provider Wrappers
Built-in wrappers for popular LLM providers with automatic RAIL scoring:
OpenAI
import { RAILOpenAI } from '@responsible-ai-labs/rail-score';
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const railOpenAI = new RAILOpenAI(client, openai, {
thresholds: { safety: 7.0 },
});
const result = await railOpenAI.chat({
model: "gpt-4o",
messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
console.log(result.content); // LLM response text
console.log(result.railScore.score); // RAIL score
console.log(result.evaluation); // Full EvalResult
console.log(result.response); // Raw OpenAI response
Anthropic
import { RAILAnthropic } from '@responsible-ai-labs/rail-score';
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic();
const railAnthropic = new RAILAnthropic(client, anthropic, {
thresholds: { safety: 7.0 },
});
const result = await railAnthropic.message({
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
console.log(result.content); // LLM response text
console.log(result.railScore.score); // RAIL score
Google Gemini
import { RAILGemini } from '@responsible-ai-labs/rail-score';
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });
const railGemini = new RAILGemini(client, model, {
thresholds: { safety: 7.0 },
});
const result = await railGemini.generate("Explain quantum computing simply.");
console.log(result.content);
console.log(result.railScore.score);
All provider wrappers return { response, content, railScore, evaluation } — where railScore is a convenience alias for evaluation.rail_score and response is the raw provider response.
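Because the shape is shared, one handler can serve every wrapper. A sketch; RAILWrappedResult below is a local illustrative interface, not an SDK export:
interface RAILWrappedResult {
  content: string;
  railScore: { score: number; confidence: number };
}
function logScored(result: RAILWrappedResult): string {
  // Works for RAILOpenAI, RAILAnthropic, and RAILGemini results alike
  console.log(`[RAIL ${result.railScore.score}/10] ${result.content.slice(0, 80)}`);
  return result.content;
}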
Observability
Langfuse Integration
import { RAILLangfuse } from '@responsible-ai-labs/rail-score';
import { Langfuse } from 'langfuse';
const langfuse = new Langfuse({ publicKey: "...", secretKey: "..." });
const railLangfuse = new RAILLangfuse(client, langfuse);
// Evaluate content and push scores to a Langfuse trace
const result = await railLangfuse.traceEvaluation("trace-id", "Content to evaluate");
// Push an existing evaluation result to a trace
await railLangfuse.scoreTrace("trace-id", existingResult);
Guardrail Handler
import { RAILGuardrail } from '@responsible-ai-labs/rail-score';
const guardrail = new RAILGuardrail(client, {
inputThresholds: { safety: 7.0 },
outputThresholds: { safety: 7.0, fairness: 7.0 },
});
// Check input before LLM call
const preResult = await guardrail.preCall("User message");
if (!preResult.allowed) {
console.log("Input blocked:", preResult.failedDimensions);
}
// Check output after LLM call
const postResult = await guardrail.postCall("LLM response");
if (!postResult.allowed) {
console.log("Output blocked:", postResult.failedDimensions);
}
// Get handler for integration with frameworks
const handler = guardrail.getHandler();
Error Handling
The SDK provides specific error types for granular error handling:
import {
RailScoreError,
AuthenticationError,
InsufficientCreditsError,
InsufficientTierError,
ValidationError,
ContentTooLongError,
SessionExpiredError,
ContentTooHarmfulError,
RateLimitError,
TimeoutError,
NetworkError,
ServerError,
EvaluationFailedError,
ServiceUnavailableError,
RAILBlockedError
} from '@responsible-ai-labs/rail-score';
try {
const result = await client.eval({ content: "Content to evaluate" });
} catch (error) {
if (error instanceof AuthenticationError) {
console.error("Invalid API key");
} else if (error instanceof InsufficientCreditsError) {
console.error(`Need ${error.required} credits, have ${error.balance}`);
} else if (error instanceof RateLimitError) {
console.error(`Rate limited. Retry after ${error.retryAfter}s`);
} else if (error instanceof ContentTooHarmfulError) {
console.error("Content too harmful to regenerate (score < 3.0)");
} else if (error instanceof SessionExpiredError) {
console.error("Safe-regenerate session expired (15 min TTL)");
} else if (error instanceof ContentTooLongError) {
console.error(`Max ${error.maxLength} chars, got ${error.actualLength}`);
} else if (error instanceof RAILBlockedError) {
console.error(`Blocked by policy: ${error.policyMode}`);
} else if (error instanceof ValidationError) {
console.error(`Validation error: ${error.message}`);
} else if (error instanceof RailScoreError) {
console.error(`API error (${error.statusCode}): ${error.message}`);
}
}
| Error | Status | When |
|---|---|---|
| AuthenticationError | 401 | Invalid or missing API key |
| InsufficientCreditsError | 402 | Not enough credits |
| InsufficientTierError | 403 | Feature requires higher plan |
| ValidationError | 400 | Invalid parameters |
| ContentTooLongError | 400 | Content exceeds max length |
| SessionExpiredError | 410 | Safe-regenerate session expired |
| ContentTooHarmfulError | 422 | Content avg score < 3.0 |
| RateLimitError | 429 | Rate limit exceeded |
| EvaluationFailedError | 500 | Internal error (safe to retry) |
| ServiceUnavailableError | 503 | Temporarily unavailable |
| TimeoutError | — | Client-side timeout |
| NetworkError | — | DNS/connection failure |
| RAILBlockedError | — | Content blocked by policy engine |
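The table above suggests a straightforward retry policy: honor retryAfter on 429s and retry 500s a bounded number of times. A sketch, not an SDK feature:
import { RateLimitError, EvaluationFailedError } from '@responsible-ai-labs/rail-score';
async function evalWithRetry(params: { content: string }, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await client.eval(params);
    } catch (error) {
      if (error instanceof RateLimitError) {
        // Wait out the server-provided cooldown before the next attempt
        await new Promise(r => setTimeout(r, error.retryAfter * 1000));
      } else if (error instanceof EvaluationFailedError && attempt < maxAttempts) {
        continue; // documented as safe to retry
      } else {
        throw error;
      }
    }
  }
  throw new Error(`eval failed after ${maxAttempts} attempts`);
}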
Utility Functions
import {
getScoreLabel,
getScoreColor,
getScoreGrade,
formatScore,
formatDimensionName,
normalizeDimensionName,
resolveFrameworkAlias,
validateWeights,
normalizeWeights,
calculateWeightedScore,
confidenceWeightedScore,
isPassing,
getDimensionsBelowThreshold,
getLowestScoringDimension,
getHighestScoringDimension,
aggregateScores
} from '@responsible-ai-labs/rail-score';
// Score labeling
getScoreLabel(8.5); // "Excellent"
getScoreLabel(6.0); // "Good"
getScoreLabel(4.0); // "Fair"
getScoreLabel(2.0); // "Poor"
// Display helpers
getScoreColor(8.5); // "green"
getScoreGrade(8.5); // "A-"
formatScore(8.567, 2); // "8.57"
formatDimensionName("user_impact"); // "User Impact"
// Dimension compatibility
normalizeDimensionName("legal_compliance"); // "inclusivity"
// Framework alias resolution
resolveFrameworkAlias("ai_act"); // "eu_ai_act"
resolveFrameworkAlias("dpdp"); // "india_dpdp"
// Weight validation
validateWeights({ safety: 50, privacy: 50 }); // true (sum = 100)
// Result analysis
const weakAreas = getDimensionsBelowThreshold(result, 7.0);
const lowest = getLowestScoringDimension(result);
const highest = getHighestScoringDimension(result);
const passing = isPassing(result.rail_score.score, 7.0);
// Aggregate across multiple evaluations
const stats = aggregateScores([result1, result2, result3]);
console.log(stats.averageScore, stats.minScore, stats.maxScore);
Score Labels
| Score Range | Label |
|---|---|
| 8.0 – 10.0 | Excellent |
| 6.0 – 7.9 | Good |
| 4.0 – 5.9 | Fair |
| 2.0 – 3.9 | Poor |
| 0.0 – 1.9 | Critical |
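The bands above make compact display strings easy to build. A sketch combining two exported helpers:
import { getScoreLabel, formatScore } from '@responsible-ai-labs/rail-score';
function scoreBadge(score: number): string {
  // e.g. scoreBadge(8.6) returns "8.6/10 (Excellent)"
  return `${formatScore(score, 1)}/10 (${getScoreLabel(score)})`;
}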
Available Dimensions
| Dimension | Description |
|---|---|
| fairness | Bias detection and equitable treatment |
| safety | Content safety and harm prevention |
| reliability | Factual accuracy and consistency |
| transparency | Explainability and clear communication |
| privacy | Data protection and user privacy |
| accountability | Traceability and auditability |
| inclusivity | Inclusive language and accessibility |
| user_impact | User value and appropriateness |
Note: legal_compliance is deprecated but still accepted — it auto-maps to inclusivity via normalizeDimensionName().
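To migrate a legacy configuration without a breaking change, map names up front. A sketch using the exported helper:
import { normalizeDimensionName } from '@responsible-ai-labs/rail-score';
const legacyDims = ["safety", "legal_compliance"];
const dims = legacyDims.map(normalizeDimensionName); // ["safety", "inclusivity"]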
TypeScript Types
import type {
// Client
RailScoreConfig,
// Evaluation
EvalParams,
EvalResult,
EvalIssue,
DimensionScore,
Dimension,
DimensionInput,
EvaluationMode,
ContentDomain,
UseCase,
ScoreLabel,
// Safe Regeneration
SafeRegenerateParams,
SafeRegenerateResult,
SafeRegenerateContinueParams,
// Compliance
ComplianceCheckSingleParams,
ComplianceCheckMultiParams,
ComplianceResult,
MultiComplianceResult,
ComplianceFramework,
ComplianceFrameworkInput,
ComplianceContext,
// Session & Policy
SessionConfig,
SessionMetrics,
PolicyMode,
PolicyConfig,
MiddlewareConfig,
// Observability
GuardResult,
RAILGuardrailConfig,
// Health
HealthCheckResponse,
} from '@responsible-ai-labs/rail-score';
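These types make it easy to keep evaluation plumbing strongly typed. A minimal sketch reusing client from above, assuming EvalParams describes the arguments to client.eval():
import type { EvalParams, EvalResult } from '@responsible-ai-labs/rail-score';
async function scoreDraft(content: string): Promise<EvalResult> {
  const params: EvalParams = { content, mode: "basic" };
  return client.eval(params);
}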
Environment Variables
# Required
RAIL_API_KEY=rail_your_api_key_here
# Optional — for LLM provider wrappers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
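To wire these into code, pass them explicitly at construction time. A sketch that assumes the SDK does not read RAIL_API_KEY automatically (check your version):
import { RailScore } from '@responsible-ai-labs/rail-score';
// Fail fast if the key is missing rather than erroring on the first API call
if (!process.env.RAIL_API_KEY) throw new Error('RAIL_API_KEY is not set');
const client = new RailScore({ apiKey: process.env.RAIL_API_KEY });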