Update README.md (#10)

- Update README.md (0193a513625b669b8529dc583d213494b0be7b20)


Co-authored-by: yong <yo37@users.noreply.huggingface.co>
This commit is contained in:
Cherrytest 2025-08-23 07:42:27 +00:00
parent 879761d4b6
commit 2131a3c091


@@ -474,7 +474,7 @@ Incorporating synthetic instruction data into pretraining leads to improved perf
Users can flexibly specify the model's thinking budget. The figure below shows the performance curves across different tasks as the thinking budget varies. For simpler tasks (such as IFEval), the model's chain of thought (CoT) is shorter, and the score exhibits fluctuations as the thinking budget increases. For more challenging tasks (such as AIME and LiveCodeBench), the model's CoT is longer, and the score improves with an increase in the thinking budget.
-![thinking_budget](./figures/thinking_budget.png)
+![thinking_budget](./thinking_budget.png)
Here is an example with a thinking budget set to 512: during the reasoning process, the model periodically triggers self-reflection to estimate the consumed and remaining budget, and delivers the final response once the budget is exhausted or the reasoning concludes.
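The budget-controlled loop described above can be sketched as follows. This is a minimal illustration only, not the model's actual implementation: `generate_step`, the reflection interval, and the budget-exhaustion behavior are all assumptions introduced for the sketch.

```python
def run_with_thinking_budget(generate_step, budget=512, check_every=128):
    """Generate thinking tokens until the budget is exhausted or the
    model concludes its reasoning on its own.

    `generate_step` is a hypothetical callable standing in for one
    decoding step; it returns the next thinking token, or None when
    the model's reasoning has concluded.
    """
    thinking_tokens = []
    while True:
        token = generate_step(thinking_tokens)
        if token is None:
            break  # reasoning concluded before the budget ran out
        thinking_tokens.append(token)
        used = len(thinking_tokens)
        # Periodic self-reflection point: estimate consumed vs. remaining budget.
        if used % check_every == 0:
            print(f"[reflection] used {used} of {budget} thinking tokens")
        if used >= budget:
            break  # budget exhausted: hand off to the final response
    return thinking_tokens
```

Under these assumptions, the loop guarantees the chain of thought never exceeds `budget` tokens while still letting short reasoning traces terminate early.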
```