From f9998697dee1f7c5b24f348ffbce994d4b3d00ed Mon Sep 17 00:00:00 2001
From: ai-modelscope
Date: Wed, 18 Sep 2024 23:39:23 +0800
Subject: [PATCH] Update README.md

---
 README.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 0002afc..f1b5cb6 100644
--- a/README.md
+++ b/README.md
@@ -9,6 +9,7 @@ base_model: Qwen/Qwen2.5-72B-Instruct
 tags:
 - chat
 ---
+
 # Qwen2.5-72B-Instruct-GGUF

 ## Introduction
@@ -29,6 +30,7 @@ Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we rele
 - Number of Layers: 80
 - Number of Attention Heads (GQA): 64 for Q and 8 for KV
 - Context Length: Full 32,768 tokens and generation 8192 tokens
+  - Note: Currently, only vLLM supports YaRN for length extrapolation. If you need to process sequences longer than 32,768 tokens (up to 131,072), please refer to the non-GGUF models.
 - Quantization: q2_K, q3_K_M, q4_0, q4_K_M, q5_0, q5_K_M, q6_K, q8_0

 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5/), [GitHub](https://github.com/QwenLM/Qwen2.5), and [Documentation](https://qwen.readthedocs.io/en/latest/).
@@ -47,14 +49,14 @@ Since cloning the entire repo may be inefficient, you can manually download the
     ```
 2. Download:
     ```shell
-    huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF qwen2.5-72b-instruct-q4_k_m.gguf --local-dir . --local-dir-use-symlinks False
+    huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF --include "qwen2.5-72b-instruct-q5_k_m*.gguf" --local-dir . --local-dir-use-symlinks False
     ```
-    For large files, we split them into multiple segments due to the limitation of file upload. They share a prefix, with a suffix indicating its index. For examples, `qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf` and `qwen2.5-72b-instruct-q5_k_m-00002-of-00002.gguf`. You need to download all of them.
+    Large files are split into multiple segments due to the upload file-size limit. The segments share a prefix, with a suffix indicating each segment's index: for example, `qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf` through `qwen2.5-72b-instruct-q5_k_m-00014-of-00014.gguf`. The above command downloads all of them.
 3. (Optional) Merge:
    For split files, you need to merge them first with the command `llama-gguf-split` as shown below:
    ```bash
    # ./llama-gguf-split --merge
-   ./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf qwen2.5-72b-instruct-q5_k_m.gguf
+   ./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf qwen2.5-72b-instruct-q5_k_m.gguf
    ```

 For users, to achieve chatbot-like experience, it is recommended to commence in the conversation mode:
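The split-file naming scheme the patch describes (a shared prefix plus a `-NNNNN-of-NNNNN` index suffix) can be previewed without downloading anything. The following sketch is not part of the patch; the prefix and segment count are taken from the example file names above:

```shell
# Sketch only: enumerate the split-GGUF segment names described in the README,
# i.e. "<prefix>-00001-of-00014.gguf" ... "<prefix>-00014-of-00014.gguf".
prefix="qwen2.5-72b-instruct-q5_k_m"
total=14
names=$(for i in $(seq 1 "$total"); do
  printf '%s-%05d-of-%05d.gguf\n' "$prefix" "$i" "$total"
done)
echo "$names"
```

This prints the same 14 file names that the `--include "qwen2.5-72b-instruct-q5_k_m*.gguf"` glob matches during download and that `llama-gguf-split --merge` later combines into a single GGUF file.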