Update README.md

This commit is contained in:
ai-modelscope 2024-09-18 23:39:23 +08:00
parent a323327099
commit f9998697de


@@ -9,6 +9,7 @@ base_model: Qwen/Qwen2.5-72B-Instruct
tags:
- chat
---
# Qwen2.5-72B-Instruct-GGUF
## Introduction
@@ -29,6 +30,7 @@ Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we rele
- Number of Layers: 80
- Number of Attention Heads (GQA): 64 for Q and 8 for KV
- Context Length: Full 32,768 tokens and generation 8192 tokens
- Note: Currently, only vLLM supports YaRN for length extrapolation. If you want to process sequences up to 131,072 tokens, please refer to the non-GGUF models.
- Quantization: q2_K, q3_K_M, q4_0, q4_K_M, q5_0, q5_K_M, q6_K, q8_0
For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5/), [GitHub](https://github.com/QwenLM/Qwen2.5), and [Documentation](https://qwen.readthedocs.io/en/latest/).
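The figures above also let you estimate KV cache memory, which matters when choosing a quantization. A minimal shell-arithmetic sketch, assuming an fp16 KV cache and a head dimension of 128 (the head dimension is not listed above and is an inference, so treat it as an assumption):

```shell
# KV cache estimate: 2 tensors (K and V) per layer, per KV head, per head dim.
# layers, kv_heads, and ctx come from the spec above; head_dim=128 and
# fp16 (2 bytes/value) are assumptions, not stated in this README.
layers=80 kv_heads=8 head_dim=128 bytes=2 ctx=32768
per_token=$(( 2 * layers * kv_heads * head_dim * bytes ))
echo "KV cache per token: ${per_token} bytes"
echo "KV cache at ${ctx} tokens: $(( per_token * ctx / 1024 / 1024 )) MiB"
```

Under these assumptions, the full 32,768-token context costs roughly 10 GiB of KV cache on top of the model weights.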
@@ -47,14 +49,14 @@ Since cloning the entire repo may be inefficient, you can manually download the
```
2. Download:
```shell
-huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF qwen2.5-72b-instruct-q4_k_m.gguf --local-dir . --local-dir-use-symlinks False
+huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF --include "qwen2.5-72b-instruct-q5_k_m*.gguf" --local-dir . --local-dir-use-symlinks False
```
-For large files, we split them into multiple segments due to the limitation of file upload. They share a prefix, with a suffix indicating its index. For examples, `qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf` and `qwen2.5-72b-instruct-q5_k_m-00002-of-00002.gguf`. You need to download all of them.
+Because of upload size limits, large files are split into multiple segments. They share a prefix, with a suffix indicating the segment index. For example, `qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf` to `qwen2.5-72b-instruct-q5_k_m-00014-of-00014.gguf`. The command above downloads all of them.
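Before merging, it can help to sanity-check that every segment is present. A small sketch that just prints the expected file names for a 14-part q5_k_m split, so you can compare against `ls` (the prefix and count are taken from the example above; adjust them to the files you actually downloaded):

```shell
# Print the expected segment names for comparison against the local directory.
prefix=qwen2.5-72b-instruct-q5_k_m
total=14
for i in $(seq 1 "$total"); do
  printf '%s-%05d-of-%05d.gguf\n' "$prefix" "$i" "$total"
done
```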
3. (Optional) Merge:
For split files, you need to merge them first with the `llama-gguf-split` command, as shown below:
```bash
# ./llama-gguf-split --merge <first-split-file-path> <merged-file-path>
-./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf qwen2.5-72b-instruct-q5_k_m.gguf
+./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf qwen2.5-72b-instruct-q5_k_m.gguf
```
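If you script the download-and-merge steps together, you may want to skip the merge when no split files are present (for example, when you downloaded a single-file quantization such as q4_0). A sketch wrapping the same `llama-gguf-split` invocation; the prefix and segment count are illustrative:

```shell
# Merge only when the first segment actually exists; otherwise assume the
# model is already a single file. Adjust prefix/count to your download.
prefix=qwen2.5-72b-instruct-q5_k_m
first="${prefix}-00001-of-00014.gguf"
if [ -f "$first" ]; then
  ./llama-gguf-split --merge "$first" "${prefix}.gguf"
else
  echo "no split files found for ${prefix}; skipping merge"
fi
```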
To achieve a chatbot-like experience, it is recommended to start in conversation mode: