Update README.md

This commit is contained in:
ai-modelscope 2024-09-18 23:39:23 +08:00
parent a323327099
commit f9998697de


@@ -9,6 +9,7 @@ base_model: Qwen/Qwen2.5-72B-Instruct
tags:
- chat
---
# Qwen2.5-72B-Instruct-GGUF
## Introduction
@@ -29,6 +30,7 @@ Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we rele
- Number of Layers: 80
- Number of Attention Heads (GQA): 64 for Q and 8 for KV
- Context Length: Full 32,768 tokens and generation 8192 tokens
- Note: Currently, only vLLM supports YaRN for length extrapolation. If you want to process sequences up to 131,072 tokens, please refer to the non-GGUF models.
- Quantization: q2_K, q3_K_M, q4_0, q4_K_M, q5_0, q5_K_M, q6_K, q8_0
For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5/), [GitHub](https://github.com/QwenLM/Qwen2.5), and [Documentation](https://qwen.readthedocs.io/en/latest/).
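The figures above also let you estimate KV cache memory, which matters when choosing a quantization. A minimal shell-arithmetic sketch, assuming an fp16 KV cache and a head dimension of 128 (the head dimension is not listed above and is an inference, so treat it as an assumption):

```shell
# KV cache estimate: 2 tensors (K and V) per layer, per KV head, per head dim.
# layers, kv_heads, and ctx come from the spec above; head_dim=128 and
# fp16 (2 bytes/value) are assumptions, not stated in this README.
layers=80 kv_heads=8 head_dim=128 bytes=2 ctx=32768
per_token=$(( 2 * layers * kv_heads * head_dim * bytes ))
echo "KV cache per token: ${per_token} bytes"
echo "KV cache at ${ctx} tokens: $(( per_token * ctx / 1024 / 1024 )) MiB"
```

Under these assumptions, the full 32,768-token context costs roughly 10 GiB of KV cache on top of the model weights.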
@@ -47,14 +49,14 @@ Since cloning the entire repo may be inefficient, you can manually download the
```
2. Download:
```shell
-huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF qwen2.5-72b-instruct-q4_k_m.gguf --local-dir . --local-dir-use-symlinks False
+huggingface-cli download Qwen/Qwen2.5-72B-Instruct-GGUF --include "qwen2.5-72b-instruct-q5_k_m*.gguf" --local-dir . --local-dir-use-symlinks False
```
-For large files, we split them into multiple segments due to the limitation of file upload. They share a prefix, with a suffix indicating its index. For examples, `qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf` and `qwen2.5-72b-instruct-q5_k_m-00002-of-00002.gguf`. You need to download all of them.
+Because of upload size limits, large files are split into multiple segments. They share a prefix, with a suffix indicating the segment index. For example, `qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf` to `qwen2.5-72b-instruct-q5_k_m-00014-of-00014.gguf`. The command above downloads all of them.
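Before merging, it can help to sanity-check that every segment is present. A small sketch that just prints the expected file names for a 14-part q5_k_m split, so you can compare against `ls` (the prefix and count are taken from the example above; adjust them to the files you actually downloaded):

```shell
# Print the expected segment names for comparison against the local directory.
prefix=qwen2.5-72b-instruct-q5_k_m
total=14
for i in $(seq 1 "$total"); do
  printf '%s-%05d-of-%05d.gguf\n' "$prefix" "$i" "$total"
done
```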
3. (Optional) Merge:
For split files, you need to merge them first with the `llama-gguf-split` command, as shown below:
```bash
# ./llama-gguf-split --merge <first-split-file-path> <merged-file-path>
-./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00002.gguf qwen2.5-72b-instruct-q5_k_m.gguf
+./llama-gguf-split --merge qwen2.5-72b-instruct-q5_k_m-00001-of-00014.gguf qwen2.5-72b-instruct-q5_k_m.gguf
```
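If you script the download-and-merge steps together, you may want to skip the merge when no split files are present (for example, when you downloaded a single-file quantization such as q4_0). A sketch wrapping the same `llama-gguf-split` invocation; the prefix and segment count are illustrative:

```shell
# Merge only when the first segment actually exists; otherwise assume the
# model is already a single file. Adjust prefix/count to your download.
prefix=qwen2.5-72b-instruct-q5_k_m
first="${prefix}-00001-of-00014.gguf"
if [ -f "$first" ]; then
  ./llama-gguf-split --merge "$first" "${prefix}.gguf"
else
  echo "no split files found for ${prefix}; skipping merge"
fi
```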
To achieve a chatbot-like experience, it is recommended to start in conversation mode: