Complex & Intelligent Systems (IF 5.0). Pub Date: 2024-12-19. DOI: 10.1007/s40747-024-01675-x. Authors: Yizhen Wang, Xueying Shen, Zixian Huang, Lihui Niu, Shiyan Ou
Legal question answering (Legal QA) aims to provide accurate and timely answers to legal questions, significantly reducing the workload of legal professionals. Such systems improve judicial efficiency and give the public prompt, professional legal assistance. A major current challenge is the absence of a large-scale dataset tailored for Chinese generative legal question answering. To address this, our study developed a comprehensive automatic question answering dataset for Chinese civil law, named cLegal-QA, which comprises 14,000 high-frequency questions from Chinese legal communities. The dataset spans a variety of legal disputes, and each entry includes a question, dispute type, scenario, multiple lawyer responses, and a gold-standard answer from human annotators. Additionally, we employed a generative QA model specifically designed for the cLegal-QA dataset. The results indicate that fully supervised models, notably UniLM, T5, and BART, substantially outperform zero-shot models on this dataset, with ChatYuan being the most effective among the zero-shot models. Our analysis also reveals that annotated answers with 60–80% accuracy yield the highest efficiency. Furthermore, we evaluated the real-world performance of these models with expert validation and applied transfer learning to new civil disputes. While the QA models demonstrate commendable performance on the dataset, there remains room for further improvement.
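As described above, each cLegal-QA entry bundles a question with its dispute, scenario, multiple lawyer responses, and a gold-standard annotated answer. A minimal sketch of such a record in Python follows; the class and field names are illustrative assumptions, not the dataset's actual schema, and the example content is invented for demonstration only.

```python
from dataclasses import dataclass, field

@dataclass
class CLegalQARecord:
    """One illustrative cLegal-QA entry (field names are assumptions)."""
    question: str                       # the user's legal question
    dispute: str                        # civil-law dispute category
    scenario: str                       # background scenario for the question
    lawyer_answers: list[str] = field(default_factory=list)  # multiple lawyer responses
    gold_answer: str = ""               # gold-standard answer from human annotators

# Invented example record, for illustration only
record = CLegalQARecord(
    question="Can a landlord keep the deposit if the tenant ends the lease early?",
    dispute="housing/lease dispute",
    scenario="A tenant terminated a one-year lease after six months.",
    lawyer_answers=[
        "It depends on the termination clause in the lease contract.",
        "A partial refund is common if the landlord suffers no actual loss.",
    ],
    gold_answer="Deposit handling is governed by the lease contract and applicable civil-law provisions.",
)
```

A structure like this makes it straightforward to feed (question, scenario) pairs to a generative model and compare its output against both the lawyer responses and the gold answer.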