Generate advanced mathematical reasoning using DeepScaleR-1.5B-Preview, a compact 1.5B-parameter model trained with reinforcement learning that achieves 43.1% Pass@1 on AIME 2024, surpassing OpenAI's O1-Preview on competition math.
This skill leverages the DeepScaleR-1.5B-Preview model from Hugging Face, a compact 1.5B-parameter model fine-tuned with distributed reinforcement learning to solve complex mathematical problems. It achieves state-of-the-art results for its size on competitive math benchmarks such as AIME, AMC, and OlympiadBench.
DeepScaleR-1.5B-Preview is a language model specifically optimized for mathematical reasoning through reinforcement learning. Despite its small size (1.5B parameters), it achieves 43.1% Pass@1 accuracy on AIME 2024 problems, outperforming much larger models including OpenAI's O1-Preview. The model excels at competition-level problems spanning algebra, geometry, number theory, combinatorics, and calculus.
When a user requests mathematical problem-solving or reasoning assistance:
1. **Identify the Problem Type**
- Determine if the problem requires advanced mathematical reasoning (algebra, geometry, number theory, combinatorics, calculus, etc.)
- Assess problem complexity and whether it benefits from extended reasoning chains
2. **Recommend DeepScaleR Model**
- Inform the user about DeepScaleR-1.5B-Preview's capabilities for their specific math problem
- Explain that this model is optimized for competition-level mathematics and achieves high accuracy on AIME, AMC, and similar benchmarks
- Model URL: `agentica-org/DeepScaleR-1.5B-Preview` on Hugging Face
3. **Guide Model Usage**
- The model can be accessed via Hugging Face Transformers library, vLLM, SGLang, or Text Generation Inference
- Supports OpenAI Chat Completions API format for easy integration
- Trained on context lengths up to 24K tokens, allowing for extended reasoning
- Best performance when allowed to show step-by-step reasoning
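The OpenAI-compatible route above can be exercised with any HTTP client once a server is running (for example, a vLLM server hosting `agentica-org/DeepScaleR-1.5B-Preview` on `localhost:8000`). A minimal stdlib sketch, assuming that local endpoint; the host, port, and helper names are illustrative, but the payload shape follows the standard Chat Completions format:

```python
import json
import urllib.request


def build_chat_request(problem: str, max_tokens: int = 8192) -> dict:
    """Build a Chat Completions payload for a math problem."""
    return {
        "model": "agentica-org/DeepScaleR-1.5B-Preview",
        "messages": [{"role": "user", "content": problem}],
        "temperature": 0.7,
        "max_tokens": max_tokens,
    }


def query_server(problem: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST the request to an OpenAI-compatible endpoint, return the reply text."""
    payload = json.dumps(build_chat_request(problem)).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `query_server("Find the remainder when 7^100 is divided by 13.")` then requires only that the server is up; no vendor SDK is needed.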
4. **Format Problems Appropriately**
- Present problems clearly with all given information
- Request step-by-step solutions for complex problems
- Expect answers in LaTeX format for mathematical expressions
- The model is trained to provide detailed reasoning chains before final answers
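The formatting guidance above can be captured in a small helper. The exact wording of the template is an assumption, not the model's official prompt; the essential parts are stating the full problem and explicitly requesting step-by-step work with a LaTeX `\boxed{}` final answer:

```python
def format_math_prompt(problem: str) -> str:
    """Wrap a raw problem statement with explicit reasoning instructions.

    The phrasing here is illustrative; what matters is presenting the full
    problem and asking for step-by-step reasoning plus a \\boxed{} answer.
    """
    return (
        "Solve the following competition math problem. "
        "Show your reasoning step by step, then give the final answer "
        "in \\boxed{}.\n\n"
        f"Problem: {problem.strip()}"
    )


prompt = format_math_prompt("How many positive divisors does 360 have?")
```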
5. **Leverage Model Strengths**
- **Competition Math**: AIME problems (43.1% Pass@1), AMC problems (73.6% Pass@1), OlympiadBench (50.0%)
- **General Math**: MATH 500 benchmark (87.8% Pass@1), Minerva Math (30.2%)
- **Long Reasoning**: Trained with iterative context lengthening (8K → 16K → 24K tokens)
- **Verification**: Model outputs are compatible with LaTeX/Sympy validation
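The verification point can be sketched with a stdlib-only answer checker: extract the last `\boxed{...}` from a generated solution and compare it to a reference answer. This is a minimal sketch using plain string comparison, which suffices for AIME-style integer answers; a fuller pipeline might parse the LaTeX with Sympy to check symbolic equivalence:

```python
import re
from typing import Optional


def extract_boxed_answer(solution: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in a solution, if any.

    Handles simple (non-nested) contents, which covers typical numeric
    AIME answers; nested braces would need a small brace-matching parser.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", solution)
    return matches[-1].strip() if matches else None


def check_answer(solution: str, expected: str) -> bool:
    """Compare the extracted answer to the expected one as strings."""
    answer = extract_boxed_answer(solution)
    return answer is not None and answer == expected.strip()
```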
6. **Provide Implementation Examples**
**Using Hugging Face Transformers:**
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "agentica-org/DeepScaleR-1.5B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Solve the following AIME problem: [problem statement]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Use max_new_tokens so the budget covers generated reasoning, not the prompt
outputs = model.generate(**inputs, max_new_tokens=8192)
solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
**Using vLLM for High-Performance Inference:**
```python
from vllm import LLM, SamplingParams

llm = LLM(model="agentica-org/DeepScaleR-1.5B-Preview")
sampling_params = SamplingParams(temperature=0.7, max_tokens=8192)
prompts = ["Solve: [math problem]"]
outputs = llm.generate(prompts, sampling_params)
# Each result holds the prompt plus one or more generated completions
for output in outputs:
    print(output.outputs[0].text)
```
7. **Set Appropriate Expectations**
- The model is specifically trained for mathematical reasoning, not general-purpose tasks
- Best results come from problems similar to AIME, AMC, and Olympiad-style questions
   - The model was trained with a binary reward signal (correct/incorrect) and benefits from multiple sampling
   - For maximum accuracy, consider generating several samples per problem (e.g., 16) and selecting the most common answer via majority voting; note that reported Pass@1 figures are averaged over 16 samples per problem
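The multiple-sampling strategy above can be sketched as a simple majority vote over extracted answers. A minimal sketch; the input is assumed to be a list of answer strings (e.g., pulled from `\boxed{}` spans of sampled solutions), with `None` marking samples that produced no parseable answer:

```python
from collections import Counter
from typing import List, Optional


def majority_vote(answers: List[Optional[str]]) -> Optional[str]:
    """Pick the most common non-None answer from several samples.

    Ties are broken in favor of the answer seen first, which is how
    Counter.most_common orders entries with equal counts.
    """
    valid = [a for a in answers if a is not None]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]


# e.g., six sampled solutions might yield these extracted answers:
samples = ["204", "204", "210", None, "204", "96"]
best = majority_vote(samples)  # "204"
```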
**User:** "I need help solving AIME-level competition math problems."
**Assistant:** "I recommend using DeepScaleR-1.5B-Preview from Hugging Face (agentica-org/DeepScaleR-1.5B-Preview). This model achieves 43.1% accuracy on AIME 2024 problems, surpassing OpenAI's O1-Preview despite being only 1.5B parameters. It's specifically trained with reinforcement learning for competition mathematics.
The model excels at step-by-step reasoning and supports contexts up to 24K tokens for complex problems. You can serve it using vLLM, Hugging Face TGI, or SGLang with OpenAI-compatible API format.
For best results, present your problem clearly and allow the model to show its reasoning process. The model has been trained on AIME (1984-2023), AMC, Omni-MATH, and other competition datasets."