Shanghai AI Lab Surpasses DeepSeek in Math Reasoning Without Distilling R1, Using RL to Break Limits
Shanghai AI Lab beats DeepSeek on math reasoning using reinforcement learning (RL) rather than R1 distillation. I read the filing: for production models, the operational trade-off is compute cost versus raw reasoning gains.