Subject
Add a question to generate a benchmark-ready item.
| Check | Result | Detail |
|---|---|---|
| Duplicate options | Pass | All answer choices are unique. |
| Answer leakage | Pass | Distractors avoid obvious answer words. |
| Length balance | Pass | Choices have comparable length. |
| Question stem | Pass | Stem is specific and phrased as a question. |
Export
Format
Why This Is Useful
Evaluation sets fail quietly when distractors are weak, duplicated, or reveal the answer. This Space teaches a better workflow: generate candidates, audit them, keep rationales, and publish the result as a versioned Hugging Face Dataset.
Benchmark Builder · Built as a practical AI/ML learning Space for the Hugging Face community.