AI Competitive Intelligence
DeepSeek Platform · Important · Capability enhancement · changelog · Occurred on 2024-06-28

DeepSeek-V2-0628: Improved Reasoning and Role-Playing

https://api-docs.deepseek.com/updates

Implications for us

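Status: no follow-up for now. This is a V2-era release that later versions have fully superseded; it is kept purely as a historical record.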

One-sentence summary

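V2-0628 improved substantially on math and reasoning benchmarks, and its Arena-Hard win rate against GPT-4-0314 rose from 41.6% to 68.3% (about 1.6x), making it the most significant capability jump of the V2 era.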

Detailed description

deepseek-chat was upgraded to DeepSeek-V2-0628. HumanEval rose from 79.88% to 84.76%, MATH from 55.02% to 71.02%, and BBH from 78.56% to 83.40%. The Arena-Hard win rate against GPT-4-0314 increased from 41.6% to 68.3%. Role-playing capabilities were significantly enhanced.
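To put the quoted scores on a common footing, the following minimal Python sketch computes the absolute gain (in percentage points) and the relative gain (as a ratio) for each benchmark. The numbers come straight from the changelog entry; the names `SCORES`, `gains`, and `summarize` are illustrative and not part of any DeepSeek tooling.

```python
# Scores quoted in the changelog entry, as (before, after) percentages.
SCORES = {
    "HumanEval": (79.88, 84.76),
    "MATH": (55.02, 71.02),
    "BBH": (78.56, 83.40),
    "Arena-Hard vs GPT-4-0314": (41.6, 68.3),
}


def gains(before, after):
    """Return (absolute gain in points, relative gain as a ratio), rounded to 2 places."""
    return round(after - before, 2), round(after / before, 2)


def summarize(scores):
    """Map each benchmark name to its (absolute, relative) gain."""
    return {name: gains(before, after) for name, (before, after) in scores.items()}


if __name__ == "__main__":
    for name, (delta, ratio) in summarize(SCORES).items():
        print(f"{name}: +{delta:.2f} points, {ratio:.2f}x")
```

On these figures the Arena-Hard win rate gains 26.7 points (about 1.64x), and MATH gains 16 points (about 1.29x), while HumanEval and BBH each move by under 5 points.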

Original excerpt

HumanEval Pass@1 79.88% -> 84.76%, MATH ACC@1 55.02% -> 71.02%, BBH 78.56% -> 83.40%. In the Arena-Hard evaluation, the win rate against GPT-4-0314 increased from 41.6% to 68.3%.