deepseek r1 7b performance, azure deepseek api, scale ai ceo deepseek, deepseek reinforcement learning paper