Integrations
Drop-in wrappers for LLM providers and agent frameworks. Provider wrappers add automatic RAIL evaluation to every response. Agent framework hooks plug into CrewAI, LangGraph, and AutoGen pipelines.
LLM providers: pip install "rail-score-sdk[integrations]" or individually with [openai], [anthropic], [google], [litellm]
Agent frameworks: pip install "rail-score-sdk[agents]" (CrewAI, LangGraph, AutoGen)
OpenAI
Wrap OpenAI chat completions with automatic RAIL scoring.
from rail_score_sdk.integrations import RAILOpenAI
client = RAILOpenAI(
    openai_api_key="sk-...",
    rail_api_key="rail_...",
    rail_threshold=7.0,
    rail_policy="log_only",  # "log_only" | "block" | "regenerate"
)
# Use client.chat_completion() — not client.chat.completions.create()
response = await client.chat_completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)
# RAILChatResponse — not a raw OpenAI response
print(response.content) # str — LLM response text
print(response.rail_score) # float 0–10
print(response.rail_confidence) # float 0–1
print(response.rail_dimensions) # dict of per-dimension scores
print(response.threshold_met) # bool
print(response.was_regenerated) # bool
Anthropic
Wrap Anthropic Claude calls with RAIL evaluation.
from rail_score_sdk.integrations import RAILAnthropic
client = RAILAnthropic(
    anthropic_api_key="sk-ant-...",
    rail_api_key="rail_...",
    rail_threshold=7.0,
    rail_policy="log_only",
)
# Use client.message() — not client.messages.create()
response = await client.message(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a hiring policy"}],
)
print(response.content) # str — LLM response text
print(response.rail_score) # float 0–10
print(response.threshold_met) # bool
Google Gemini
Wrap Google Gemini calls with RAIL evaluation.
from rail_score_sdk.integrations import RAILGemini
client = RAILGemini(
    gemini_api_key="AIza...",  # note: gemini_api_key, not google_api_key
    rail_api_key="rail_...",
    rail_threshold=7.0,
    rail_policy="log_only",
)
# Use client.generate() — not client.generate_content()
response = await client.generate(
    model="gemini-2.5-flash",
    contents="Describe the benefits of renewable energy",
)
print(response.content) # str — LLM response text
print(response.rail_score) # float 0–10
print(response.threshold_met) # bool
Langfuse v3
Add RAIL scores as Langfuse trace metadata for observability.
from rail_score_sdk.integrations import RAILLangfuse
rail_langfuse = RAILLangfuse(
    rail_api_key="rail_...",
    langfuse_public_key="pk-lf-...",
    langfuse_secret_key="sk-lf-...",
    langfuse_base_url="https://cloud.langfuse.com",  # or self-hosted
    score_dimensions=True,  # push all 8 dimension scores individually
)
# Evaluate content and push scores to a Langfuse trace
await rail_langfuse.evaluate_and_log(
    content="AI response about investment strategies...",
    trace_id="trace-abc-123",
)
# Or attach an existing EvalResult to a trace
from rail_score_sdk import RAILSession
session = RAILSession(api_key="rail_...", threshold=7.0)
result = await session.evaluate_turn(
    user_message="...",
    assistant_response="...",
)
rail_langfuse.log_eval_result(result, trace_id="trace-abc-123")
LiteLLM Guardrail
Use RAIL as a guardrail in your LiteLLM proxy.
# In your LiteLLM config, add RAIL as a guardrail:
# litellm_settings:
#   guardrails:
#     - rail_score:
#         api_key: rail_...
#         threshold: 7.0
# Or use programmatically:
from rail_score_sdk.integrations import RAILGuardrail
guardrail = RAILGuardrail(
    rail_api_key="rail_...",
    threshold=7.0,
    action="block",  # "block", "log", or "regenerate"
)
Agent Framework Integrations
Callbacks and hooks that plug into existing agent pipelines. Each integration intercepts tool calls before execution and optionally scans results after. Requires pip install "rail-score-sdk[agents]".
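The intercept pattern shared by the three hooks below can be sketched as a plain policy gate. This is a hypothetical helper for illustration only, not part of the SDK's public API; in the real hooks, the score comes from the RAIL API rather than being passed in.

```python
def gate_tool_call(tool_name: str, score: float,
                   threshold: float = 7.0, policy: str = "block") -> bool:
    """Decide whether a proposed tool call may execute.

    Hypothetical sketch of the pre-execution check: `score` stands in
    for the RAIL evaluation of the proposed call.
    """
    if score >= threshold:
        return True
    if policy == "block":
        return False
    # "log_only" (and, by assumption here, "suggest_fix") record the low
    # score but allow the call to proceed.
    return True
```

For example, a call scoring 5.2 against a 7.0 threshold is rejected under policy="block" but allowed, with the score logged, under policy="log_only".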
CrewAI
Attach RAILCrewAICallback to any CrewAI agent to evaluate tool calls inline.
from crewai import Agent, Task, Crew
from rail_score_sdk.agent.integrations.crewai import RAILCrewAICallback
rail_callback = RAILCrewAICallback(
    rail_api_key="rail_...",
    policy="block",  # "block" | "log_only" | "suggest_fix"
    threshold=7.0,
    domain="finance",
)
agent = Agent(
    role="Financial Analyst",
    goal="Analyze portfolio risk",
    tools=[credit_tool, market_tool],
    callbacks=[rail_callback],
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
LangGraph
Wrap a LangGraph StateGraph with RAILLangGraphGuard to intercept tool nodes.
from langgraph.graph import StateGraph
from rail_score_sdk.agent.integrations.langgraph import RAILLangGraphGuard
builder = StateGraph(MyState)
builder.add_node("tool_node", my_tool_fn)
# ... add edges ...
graph = builder.compile()
guarded = RAILLangGraphGuard(
    graph=graph,
    rail_api_key="rail_...",
    policy="block",
    threshold=7.0,
)
result = await guarded.ainvoke({"input": "query"})
AutoGen
Register RAILAutoGenHook as a hook on any AutoGen agent.
from autogen import ConversableAgent
from rail_score_sdk.agent.integrations.autogen import RAILAutoGenHook
rail_hook = RAILAutoGenHook(
    rail_api_key="rail_...",
    policy="log_only",
    threshold=7.0,
)
agent = ConversableAgent(
    name="Researcher",
    llm_config={"model": "gpt-4o"},
)
agent.register_hook("process_message_before_send", rail_hook)
await agent.a_initiate_chat(other_agent, message="Summarize this document")
How Provider Wrappers Work
1. Your LLM call executes normally and returns a response.
2. The wrapper automatically sends the response text to RAIL for evaluation.
3. RAIL scores are attached to the response object as .rail_score.
4. If a threshold is set and the score is below it, the wrapper can block or regenerate.
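The four steps above can be condensed into a minimal sketch. `llm_call` and `rail_evaluate` are hypothetical async callables standing in for the provider SDK and the RAIL evaluation endpoint; this is not the SDK's actual wrapper code, but the decision logic mirrors the rail_policy values shown earlier.

```python
def apply_policy(score: float, threshold: float, policy: str) -> str:
    """Step 4: decide what to do with a scored response."""
    if score >= threshold:
        return "pass"
    if policy in ("block", "regenerate"):
        return policy
    return "pass"  # "log_only": the score is recorded but the response passes through


async def guarded_completion(llm_call, rail_evaluate,
                             threshold: float = 7.0,
                             policy: str = "log_only") -> dict:
    """Hypothetical sketch of steps 1-3 of the provider-wrapper flow."""
    text = await llm_call()            # 1. LLM call executes normally
    score = await rail_evaluate(text)  # 2. response text is sent to RAIL
    return {                           # 3. scores attached to the response
        "content": text,
        "rail_score": score,
        "threshold_met": score >= threshold,
        "action": apply_policy(score, threshold, policy),
    }
```

Under policy="log_only" a below-threshold response still passes through with its score attached, which is why that setting is the safe default for rollout.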