E-EVAL
A Comprehensive Chinese K12 Education Evaluation Benchmark for Large Language Models
(2024)
E-EVAL is a comprehensive Chinese K12 education evaluation benchmark for large language models. It contains 4,352 multiple-choice questions covering 23 subjects in total across three difficulty levels: primary, middle, and high school. More details can be found in our paper.
Our data can be downloaded directly from Hugging Face Datasets. Please refer to the instructions in our GitHub repository for how to read and use the data.
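As a rough illustration of how multiple-choice data of this kind might be used once loaded, the sketch below formats a question as a prompt and computes accuracy per difficulty level. The record layout (the `level`, `question`, `A` to `D`, and `answer` fields), the four-option format, and the sample questions are all assumptions made for illustration; they are not the confirmed E-EVAL schema, so please check the GitHub instructions for the actual field names.

```python
from collections import defaultdict

# Hypothetical records: field names and contents are illustrative assumptions,
# not the confirmed E-EVAL schema.
SAMPLE = [
    {"level": "primary", "question": "1 + 1 = ?",
     "A": "1", "B": "2", "C": "3", "D": "4", "answer": "B"},
    {"level": "high", "question": "Derivative of x^2?",
     "A": "x", "B": "2", "C": "2x", "D": "x^2", "answer": "C"},
]


def format_prompt(record):
    """Render one question and its four options as a plain-text prompt."""
    options = "\n".join(f"{key}. {record[key]}" for key in "ABCD")
    return f"{record['question']}\n{options}\nAnswer:"


def accuracy_by_level(records, predictions):
    """Return the fraction of correct predictions for each difficulty level."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, prediction in zip(records, predictions):
        total[record["level"]] += 1
        # Compare the predicted option letter with the gold answer letter.
        if prediction.strip().upper() == record["answer"]:
            correct[record["level"]] += 1
    return {level: correct[level] / total[level] for level in total}
```

Here `predictions` would be the option letters returned by the model under evaluation, in the same order as the records.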
@article{hou2024eeval,
  title={E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models},
  author={Jinchang Hou and Chang Ao and Haihong Wu and Xiangtao Kong and Zhigang Zheng and Daijia Tang and Chengming Li and Xiping Hu and Ruifeng Xu and Shiwen Ni and Min Yang},
  journal={https://arxiv.org/abs/2401.15927},
  year={2024}
}
Have any questions about E-EVAL? Please contact us at e-eval@outlook.com or create an issue on GitHub. For potential collaboration, please contact sw.ni@siat.ac.cn.