Update README.md
This commit is contained in:
parent
980712f58b
commit
03dabea474
@ -23,6 +23,9 @@ Qwen3 is the latest generation of large language models in Qwen series, offering
|
||||
|
||||
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/QwenLM/Qwen3), and [Documentation](https://qwen.readthedocs.io/en/latest/).
|
||||
|
||||
> [!TIP]
|
||||
> If you encounter significant endless repetitions, please refer to the [Best Practices](#best-practices) section for optimal sampling parameters, and set the ``presence_penalty`` to 1.5.
|
||||
|
||||
## Quickstart
|
||||
|
||||
The code of Qwen3 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`.
|
||||
@ -92,9 +95,10 @@ For deployment, you can use `vllm>=0.8.5` or `sglang>=0.4.5.post2` to create an
|
||||
|
||||
## Switching Between Thinking and Non-Thinking Mode
|
||||
|
||||
> [!TIP]
|
||||
> [!TIP]
|
||||
> The `enable_thinking` switch is also available in APIs created by vLLM and SGLang.
|
||||
> Please refer to [our documentation](https://qwen.readthedocs.io/) for more details.
|
||||
> Please refer to our documentation for [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) and [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) users.
|
||||
|
||||
### `enable_thinking=True`
|
||||
|
||||
|
||||
Loading…
Reference in New Issue
Block a user