From f9998697dee1f7c5b24f348ffbce994d4b3d00ed Mon Sep 17 00:00:00 2001
From: ai-modelscope
Date: Wed, 18 Sep 2024 23:39:23 +0800
Subject: [PATCH] Update README.md

---
 README.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 0002afc..f1b5cb6 100644
--- a/README.md
+++ b/README.md
@@ -9,6 +9,7 @@ base_model: Qwen/Qwen2.5-72B-Instruct
 tags:
 - chat
 ---
+
 # Qwen2.5-72B-Instruct-GGUF

 ## Introduction
@@ -29,6 +30,7 @@ Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we rele
 - Number of Layers: 80
 - Number of Attention Heads (GQA): 64 for Q and 8 for KV
 - Context Length: Full 32,768 tokens and generation 8192 tokens
+  - Note: Currently, only vLLM supports YaRN for length extrapolation. If you need to process sequences longer than 32,768 tokens (up to 131,072), please refer to the non-GGUF models.
 - Quantization: q2_K, q3_K_M, q4_0, q4_K_M, q5_0, q5_K_M, q6_K, q8_0

 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5/), [GitHub](https://github.com/QwenLM/Qwen2.5), and [Documentation](https://qwen.readthedocs.io/en/latest/).
@@ -47,14 +49,14 @@ Since cloning the entire repo may be inefficient, you can manually download the
     ```
 2. Download:
     ```shell
-    huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF qwen2.5-72b-instruct-q4_k_m.gguf --local-dir . --local-dir-use-symlinks False
+    huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF --include "qwen2.5-72b-instruct-q5_k_m*.gguf" --local-dir . --local-dir-use-symlinks False
     ```
-    For large files, we split them into multiple segments due to the limitation of file upload. They share a prefix, with a suffix indicating its index. For examples, `qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf` and `qwen2.5-72b-instruct-q5_k_m-00002-of-00002.gguf`. You need to download all of them.
+    Large files are split into multiple segments due to the upload file-size limit. The segments share a prefix, with a suffix indicating each segment's index: for example, `qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf` through `qwen2.5-72b-instruct-q5_k_m-00014-of-00014.gguf`. The above command downloads all of them.
 3. (Optional) Merge:
    For split files, you need to merge them first with the command `llama-gguf-split` as shown below:
    ```bash
    # ./llama-gguf-split --merge
-   ./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf qwen2.5-72b-instruct-q5_k_m.gguf
+   ./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf qwen2.5-72b-instruct-q5_k_m.gguf
    ```

 For users, to achieve chatbot-like experience, it is recommended to commence in the conversation mode:
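The split-file naming scheme the patch describes (a shared prefix plus a `-NNNNN-of-NNNNN` index suffix) can be previewed without downloading anything. The following sketch is not part of the patch; the prefix and segment count are taken from the example file names above:

```shell
# Sketch only: enumerate the split-GGUF segment names described in the README,
# i.e. "<prefix>-00001-of-00014.gguf" ... "<prefix>-00014-of-00014.gguf".
prefix="qwen2.5-72b-instruct-q5_k_m"
total=14
names=$(for i in $(seq 1 "$total"); do
  printf '%s-%05d-of-%05d.gguf\n' "$prefix" "$i" "$total"
done)
echo "$names"
```

This prints the same 14 file names that the `--include "qwen2.5-72b-instruct-q5_k_m*.gguf"` glob matches during download and that `llama-gguf-split --merge` later combines into a single GGUF file.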