RAIL publishes open datasets so the responsible-AI research community can reproduce, extend, and critique our evaluation methodology. Both datasets below are MIT-licensed and available through the Responsible AI Labs HuggingFace organization.
RAIL-HH-10K
789 downloads · Published Oct 31, 2025
The first large-scale, multi-dimensional safety dataset for responsible AI evaluation. Every example is annotated across all eight RAIL dimensions with scores, key spans, and reviewer reasoning. Supports text generation, RLHF, QA, and feature extraction tasks. 10K size category, Parquet format.
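Since the dataset ships as Parquet on the Hub, it loads directly with the HuggingFace datasets library. A minimal sketch, assuming the repo lives at responsible-ai-labs/RAIL-HH-10K with a train split (check the dataset card for the actual path and schema):

```python
from datasets import load_dataset

# Sketch only: the repo id and split name are assumptions; see the
# Responsible AI Labs HuggingFace org for the real dataset path.
ds = load_dataset("responsible-ai-labs/RAIL-HH-10K", split="train")

print(ds.features)  # per-dimension score, key-span, and reviewer-reasoning fields
print(ds[0])        # one example annotated across the eight RAIL dimensions
```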
View on HuggingFace →

Indian Responsible AI Benchmark
158 downloads · Published Mar 12, 2026
Responsible-AI evaluation tasks authored in an Indian context: English and Hindi prompts spanning text classification and generation, designed to test whether models handle Indian cultural, linguistic, and regulatory nuance as carefully as they handle Western defaults. Parquet + optimized-parquet formats.
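Loading works the same way. A sketch under stated assumptions: the repo id is hypothetical, and we assume a "language" column with ISO codes distinguishing the English and Hindi prompts; consult the dataset card for the actual schema:

```python
from datasets import load_dataset

# Repo id, split name, and the "language" column are assumptions.
bench = load_dataset("responsible-ai-labs/indian-responsible-ai-benchmark", split="test")

# Select the Hindi prompts (assuming ISO language codes).
hindi = bench.filter(lambda ex: ex["language"] == "hi")
print(len(hindi), "Hindi examples")
```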
View on HuggingFace →

Both datasets are designed to be used together: RAIL-HH-10K for general responsible-AI scoring and fine-tuning, and the Indian Responsible AI Benchmark for evaluating regional competence. If you use either dataset in published research, we'd appreciate a citation of our arXiv paper.
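A hedged sketch of that combined workflow; every repo id, split, and column name below is an assumption, and model_fn stands in for your own text-in, text-out callable:

```python
from datasets import load_dataset

# Hypothetical paths: adjust both repo ids and splits to the real ones.
general = load_dataset("responsible-ai-labs/RAIL-HH-10K", split="train")
regional = load_dataset("responsible-ai-labs/indian-responsible-ai-benchmark", split="test")

def collect_responses(model_fn, dataset):
    """Run a model over every prompt (assumed "prompt" column) so the
    responses can then be scored against the RAIL dimensions."""
    return [model_fn(example["prompt"]) for example in dataset]
```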