One-sentence summary
V2-0517 focuses on improving instruction following and JSON output reliability, and optimizes system prompt behavior in RAG and translation scenarios.
Detailed description
deepseek-chat was upgraded to DeepSeek-V2-0517. IFEval Benchmark Prompt-Level accuracy improved from 63.9% to 77.6%. The JSON parsing rate increased from 78% to 85%, and reached 97% when appropriate regular expressions were applied to the output. System prompt instruction following was also optimized for immersive translation and RAG.
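The note does not say which regular expressions were used to raise the JSON parsing rate. As a minimal sketch of the general idea, assuming the failures come from extra text or code fences wrapped around an otherwise valid JSON object, a parser can try the raw output first and then fall back to regex extraction. The function name `parse_model_json` and both patterns are hypothetical, chosen only for illustration.

```python
import json
import re

# Hypothetical patterns: the source does not specify the expressions it used.
FENCE = "`" * 3  # a Markdown code-fence marker
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)
OBJECT_RE = re.compile(r"\{.*\}", re.DOTALL)  # first "{" through last "}"


def parse_model_json(text):
    """Return the parsed JSON value, or None if no candidate parses."""
    # 1. Try the raw output as-is.
    candidates = [text.strip()]
    # 2. Fall back to the body of a fenced code block, if any.
    fenced = FENCE_RE.search(text)
    if fenced:
        candidates.append(fenced.group(1).strip())
    # 3. Fall back to the outermost brace-delimited span.
    braces = OBJECT_RE.search(text)
    if braces:
        candidates.append(braces.group(0))
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None
```

For example, `parse_model_json('Result: {"a": [1, 2]} done')` recovers the embedded object even though the full string is not valid JSON. The fallback is deliberately simple: it only handles a single object surrounded by extra text, so outputs with several separate objects or truncated JSON would still fail.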
Original excerpt
IFEval Benchmark Prompt-Level accuracy jumping from 63.9% to 77.6%. JSON parsing rate increased from 78% to 85%. By introducing appropriate regular expressions, the JSON parsing rate was further improved to 97%.