Policy Engine

Evaluation tells you what the score is. The policy engine tells your application what to do about it. You declare rules — "if safety drops below 7, block the response" — and the SDK enforces them automatically on every evaluated response.

Evaluation vs Policy

Evaluation says:

"This response scored 6.2 on safety, 8.1 on fairness, 4.8 on reliability."

Observation only — returns scores and returns control to you.

Policy engine says:

"Safety is below 7 — block this response. Reliability is below 5 — warn the caller."

Enforcement — takes actions on your behalf so you don't write if-else chains everywhere.

How It Works

Policy Engine: Scores → Rules → Actions

Dimension Scores

safety: 6.2
fairness: 8.1
reliability: 4.8
privacy: 5.0

Policy Rules

safety < 7 → Block
fairness < 6 → Flag
reliability < 5 → Warn
all pass → Allow

First Matching Rule Wins

Primary action: Block (safety failed). The reliability warning is also queued as a secondary action.

Rules evaluate in priority order and the first match wins. Any lower-priority rules that also match produce secondary actions.

After each evaluation, the policy engine checks every rule in priority order. The first rule whose condition matches determines the primary action. If a lower-priority rule also matches (e.g., both safety and reliability fail), its action is appended as a secondary action — so no failure is silently dropped.
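The selection logic described above can be sketched in plain Python. This is a minimal illustration, not the SDK's implementation: `SketchRule`, `Outcome`, and `evaluate_policy` are hypothetical names, and the sketch assumes that a rule fires when the score for its dimension is strictly below the threshold and that a dimension missing from the scores never fires.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SketchRule:
    """Hypothetical stand-in for a policy rule (not the SDK's Rule class)."""
    dimension: str
    threshold: float
    action: str


@dataclass
class Outcome:
    """Primary action plus secondary actions from later matching rules."""
    action: str = "allow"
    secondary: list = field(default_factory=list)


def evaluate_policy(scores, rules):
    """Walk the rules in priority order; the first match sets the primary action."""
    outcome = Outcome()
    matched = False
    for rule in rules:
        score = scores.get(rule.dimension)
        # Assumption: a rule fires only when the score is strictly below its threshold.
        if score is None or score >= rule.threshold:
            continue
        if not matched:
            outcome.action = rule.action
            matched = True
        else:
            # Later matches are kept so that no failure is silently dropped.
            outcome.secondary.append(rule.action)
    return outcome


scores = {"safety": 6.2, "fairness": 8.1, "reliability": 4.8, "privacy": 5.0}
rules = [
    SketchRule("safety", 7.0, "block"),
    SketchRule("fairness", 6.0, "flag"),
    SketchRule("reliability", 5.0, "warn"),
]
outcome = evaluate_policy(scores, rules)
print(outcome.action, outcome.secondary)  # block ['warn']
```

With the scores from the diagram, safety fails first and sets the primary action, while the reliability failure is carried along as a secondary action. When no rule matches, the sketch falls through to "allow".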

Policy Actions

Action | When to use | Example
block | The response must not reach the user | safety < 5 on a customer-facing chatbot
warn | The response can proceed but the caller should be notified | reliability < 6: response may contain uncertainty
flag | Queue for async human review without blocking the response | fairness < 7: flag for bias review
allow | Explicitly pass (the default for unmatched content) | Used as a catch-all at the end of a rule list
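One way an application might act on these four actions is a small dispatcher. This is a sketch under stated assumptions rather than part of the SDK: `deliver`, `FALLBACK`, and `review_queue` are hypothetical names, the fallback text is a placeholder, and the in-memory list stands in for whatever review queue the application really uses.

```python
import logging

log = logging.getLogger("policy_sketch")

# Hypothetical stand-in for an asynchronous human-review queue.
review_queue = []

# Placeholder text shown to the user when a response is blocked.
FALLBACK = "Sorry, I can't help with that request."


def deliver(action, response):
    """Return the text to send to the user for a given policy action."""
    if action == "block":
        # The response must not reach the user.
        return FALLBACK
    if action == "warn":
        # The response proceeds, but the caller is notified.
        log.warning("Response delivered with a policy warning")
    elif action == "flag":
        # The response proceeds and is queued for later human review.
        review_queue.append(response)
    elif action != "allow":
        raise ValueError(f"Unknown policy action: {action!r}")
    return response
```

Only "block" changes what the user sees; "warn" and "flag" add side effects and still return the original response.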

Declaring a Policy

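Policies are declared as a list of rules. Each rule names a dimension, a threshold, and an action, and it fires when the score for that dimension falls below the threshold. Rules are evaluated in the order you declare them, so declaration order is priority order.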

from rail_score_sdk import RailScoreClient, Policy, Rule

client = RailScoreClient(api_key="...")

# Define the policy
policy = Policy(rules=[
    Rule(dimension="safety",      threshold=7.0, action="block"),
    Rule(dimension="fairness",    threshold=6.0, action="flag"),
    Rule(dimension="reliability", threshold=5.0, action="warn"),
])

# Apply at eval time
result = client.eval(
    content="...",
    mode="basic",
    policy=policy,
)

print(result.policy_outcome.action)          # "block" | "warn" | "flag" | "allow"
print(result.policy_outcome.triggered_rules) # Which rules fired
print(result.policy_outcome.blocked)         # True if action == "block"

Reusable Policies

Define policies once and attach them to a client so they apply to every call automatically — the same way you might configure logging or retry behaviour:

# Define once
HEALTHCARE_POLICY = Policy(rules=[
    Rule(dimension="safety",      threshold=8.5, action="block"),
    Rule(dimension="reliability", threshold=7.5, action="block"),
    Rule(dimension="privacy",     threshold=8.0, action="block"),
    Rule(dimension="transparency", threshold=6.0, action="warn"),
])

# Attach to the client — applies to every eval call
client = RailScoreClient(
    api_key="...",
    default_policy=HEALTHCARE_POLICY,
)

import logging

log = logging.getLogger(__name__)

def respond(response_text):
    # No need to pass policy= on every call
    result = client.eval(content=response_text, mode="basic")

    if result.policy_outcome.blocked:
        return "I'm unable to provide that information. Please consult a healthcare professional."
    if result.policy_outcome.action == "warn":
        log.warning("Low-reliability response passed to user", extra={"scores": result.dimension_scores})
    return response_text

Session-Level Policies

A session tracks quality across an entire conversation. You can set a policy that triggers not just on individual turn scores, but on the aggregate conversation quality — useful for detecting gradual drift across many turns, each of which looks acceptable on its own:

Figure: Multi-Turn Session Lifecycle. Each turn (Turn 1, Turn 2, ..., Turn N) is evaluated; RAILSession tracks all turns and reports Avg Score, Lowest Turn, and Below Threshold.

from rail_score_sdk import RailScoreClient, RAILSession, Policy, Rule

client = RailScoreClient(api_key="...")

# Per-turn policy: block if any single turn falls critically low
turn_policy = Policy(rules=[
    Rule(dimension="safety", threshold=5.0, action="block"),
])

# Session policy: flag if average safety across conversation dips
session_policy = Policy(rules=[
    Rule(dimension="safety", threshold=7.0, action="flag", aggregate="avg"),
])

session = RAILSession(
    client=client,
    turn_policy=turn_policy,
    session_policy=session_policy,
)

# generate_response, send, send_fallback, and notify_human_reviewer are your application's own helpers
async def run_conversation(conversation):
    for user_message in conversation:
        response = await generate_response(user_message)
        outcome = session.record(content=response, mode="basic")

        if outcome.turn_blocked:
            # Single turn was blocked: do not send
            send_fallback(user_message)
        elif outcome.session_flagged:
            # Conversation quality is declining: escalate
            notify_human_reviewer(session.session_id)
        else:
            send(response)
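
To see why a session-level rule can catch what a per-turn rule misses, the arithmetic can be sketched without the SDK. The sketch assumes that `aggregate="avg"` means the arithmetic mean of the dimension over all turns recorded so far; `check_turns` is a hypothetical helper and the scores are made up for illustration, not real results.

```python
from statistics import mean

TURN_BLOCK_THRESHOLD = 5.0    # per-turn rule: block below this
SESSION_FLAG_THRESHOLD = 7.0  # session rule: flag when the running average is below this


def check_turns(safety_scores):
    """Yield (turn, turn_blocked, session_flagged) for each recorded turn."""
    seen = []
    for turn, score in enumerate(safety_scores, start=1):
        seen.append(score)
        turn_blocked = score < TURN_BLOCK_THRESHOLD
        # Assumption: the session aggregate is the mean over all turns so far.
        session_flagged = mean(seen) < SESSION_FLAG_THRESHOLD
        yield turn, turn_blocked, session_flagged


# Made-up scores: no single turn is critically low, but quality drifts down.
drifting = [8.5, 7.5, 6.5, 6.0, 5.5]
results = list(check_turns(drifting))
for turn, blocked, flagged in results:
    print(turn, blocked, flagged)
```

In this run no turn drops below 5.0, so the per-turn rule never blocks, but the running average reaches 6.8 on the fifth turn and the session rule flags the conversation.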

Real-World Policy Examples

Healthcare chatbot

High safety and reliability bars — any critical failure blocks immediately.

safety ≥ 8.5: block on fail
reliability ≥ 7.5: block on fail
privacy ≥ 8.0: block on fail

Hiring assistant

Strict fairness — flags any potential bias in candidate assessments.

fairness ≥ 8.0: block on fail
inclusivity ≥ 7.0: flag on fail
safety ≥ 6.0: warn on fail

Customer support bot

Balanced policy — blocks severe issues, flags borderline responses for review.

safety ≥ 7.0: block on fail
reliability ≥ 5.0: warn on fail
user_impact ≥ 6.0: flag on fail
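
The three profiles above could be written down as plain data before being turned into policies. This is a sketch only: the `PRESETS` dict and the `failed_checks` helper are hypothetical, each tuple mirrors the dimension, threshold, and action of a rule, dimensions absent from the scores are skipped by assumption, and the sample scores are made up.

```python
# Each entry is (dimension, minimum acceptable score, action when the score falls below it).
PRESETS = {
    "healthcare": [
        ("safety", 8.5, "block"),
        ("reliability", 7.5, "block"),
        ("privacy", 8.0, "block"),
    ],
    "hiring": [
        ("fairness", 8.0, "block"),
        ("inclusivity", 7.0, "flag"),
        ("safety", 6.0, "warn"),
    ],
    "support": [
        ("safety", 7.0, "block"),
        ("reliability", 5.0, "warn"),
        ("user_impact", 6.0, "flag"),
    ],
}


def failed_checks(preset, scores):
    """List (dimension, action) for every rule whose score is below its threshold."""
    return [
        (dimension, action)
        for dimension, threshold, action in PRESETS[preset]
        # Assumption: a dimension that was not scored cannot fail.
        if dimension in scores and scores[dimension] < threshold
    ]


sample = {"safety": 7.2, "fairness": 7.5, "inclusivity": 6.4, "reliability": 5.5}
print(failed_checks("hiring", sample))   # [('fairness', 'block'), ('inclusivity', 'flag')]
print(failed_checks("support", sample))  # []
```

The same scores pass the customer support profile but fail two checks in the hiring profile, which shows how the choice of thresholds encodes the risk tolerance of each use case.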
