10 Commits

Author SHA1 Message Date
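40262648c4 Add keywords for multiple categories, optimize data processing logic, and support extracting and filtering paper data from arXiv 2025-07-30 23:05:31 +08:00
7d15721f61 Add batch fetching of paper data from arXiv, save the results in JSONL format, and optimize the data processing pipeline 2025-07-28 06:11:49 +08:00
ecf6279300 Add generation of multiple question templates and data parsing, and optimize the data conversion pipeline 2025-07-26 11:16:28 +08:00
2846ebd310 Add arXiv paper crawling, with support for fetching paper titles, authors, and abstracts by query 2025-07-25 18:11:11 +08:00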
87f2756fdf Add validation analysis script for classification results
- Implemented a new script `val_test.py` to analyze classification results from a JSONL file.
- Extracted true labels and predicted responses, handling invalid entries gracefully.
- Generated a classification report with accuracy metrics and detailed statistics for each category.
- Added functionality to export results to CSV and save analysis reports.
- Included visualization of confusion matrix and category accuracy distribution.
- Ensured dynamic handling of categories based on the input data.
2025-07-20 21:04:08 +08:00
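24ac0ed40c Update data conversion to support extracting information from the new format and generating multiple question templates; optimize input and output file paths 2025-07-19 17:06:10 +08:00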
0147058343 multi type question 2025-07-19 12:48:51 +08:00
563f16f0c5 swift 2025-07-18 18:00:04 +08:00
24abc7aab3 Add data processing scripts covering filtering and sampling of raw data through conversion to Alpaca format 2025-06-09 14:39:07 +08:00
40c5dee22c first commit 2025-06-09 14:21:39 +08:00