# Prepare Models

To support the evaluation of new models in OpenCompass, there are several ways:

1. HuggingFace-based models
2. API-based models
3. Custom models

## HuggingFace-based Models

In OpenCompass, we support constructing evaluation models directly from HuggingFace's
`AutoModel.from_pretrained` and `AutoModelForCausalLM.from_pretrained` interfaces. If the model to be
evaluated follows the typical generation interface of HuggingFace models, there is no need to write code. You
can simply specify the relevant configurations in the configuration file.

Here is an example configuration file for a HuggingFace-based model:

```python
# Use `HuggingFace` to evaluate models supported by AutoModel.
# Use `HuggingFaceCausalLM` to evaluate models supported by AutoModelForCausalLM.
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        # Parameters for `HuggingFaceCausalLM` initialization.
        path='huggyllama/llama-7b',
        tokenizer_path='huggyllama/llama-7b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        batch_padding=False,
        # Common parameters shared by various models, not specific to `HuggingFaceCausalLM` initialization.
        abbr='llama-7b',            # Model abbreviation used for result display.
        max_out_len=100,            # Maximum number of generated tokens.
        batch_size=16,              # The size of a batch during inference.
        run_cfg=dict(num_gpus=1),   # Run configuration to specify resource requirements.
    )
]
```
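
Such a `models` list does not run on its own; it is normally combined with dataset configurations in a
top-level config file and passed to OpenCompass's `run.py` entry point. Here is a minimal sketch; the file
name `eval_llama_7b.py` and both import paths are illustrative placeholders, not files shipped with
OpenCompass:

```python
# configs/eval_llama_7b.py -- illustrative file name
from mmengine.config import read_base

with read_base():
    # Hypothetical import paths; point them at your actual model/dataset configs.
    from .models.hf_llama_7b import models
    from .datasets.siqa.siqa_gen import siqa_datasets

datasets = [*siqa_datasets]

# Launch with: python run.py configs/eval_llama_7b.py
```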

Explanation of some of the parameters:

- `batch_padding=False`: If set to False, each sample in a batch is inferred individually. If set to True,
  a batch of samples is padded and inferred together. For some models, such padding may lead to unexpected
  results; if the model being evaluated supports sample padding, you can set this parameter to True to speed
  up inference.
- `padding_side='left'`: Perform padding on the left side. Not all models support padding, and padding on the
  right side may interfere with the model's output.
- `truncation_side='left'`: Perform truncation on the left side. The input prompt for evaluation usually
  consists of both the in-context examples prompt and the input prompt. If the right side of the input prompt
  is truncated, the input to the generation model may be inconsistent with the expected format; therefore,
  when necessary, truncation should be performed on the left side. Both settings are illustrated in the
  sketch after this list.
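
To make the two tokenizer settings concrete, here is a minimal sketch using the `transformers` tokenizer
directly, outside OpenCompass; the model name `gpt2` and the prompts are placeholders:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained('gpt2', padding_side='left', truncation_side='left')
tok.pad_token = tok.eos_token  # GPT-2 defines no pad token by default

batch = tok(
    ['Q: 2+2=?\nA:', 'Example: 1+1=2\nExample: 3+4=7\nQ: 2+2=?\nA:'],
    padding=True, truncation=True, max_length=16, return_tensors='pt',
)
# Pad tokens sit on the LEFT, so every prompt ends exactly where generation
# begins; truncation also drops tokens from the LEFT, sacrificing the oldest
# in-context examples rather than the final question.
print(batch['input_ids'])
```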

During evaluation, OpenCompass instantiates the evaluation model based on the `type` and the initialization
parameters specified in the configuration file. The other parameters are used for inference, summarization,
and other processes related to the model. For example, with the above configuration file, the model is
instantiated as follows during evaluation:

```python
model = HuggingFaceCausalLM(
    path='huggyllama/llama-7b',
    tokenizer_path='huggyllama/llama-7b',
    tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
    max_seq_len=2048,
)
```
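
For intuition, the "typical generation interface" that `HuggingFaceCausalLM` relies on looks roughly like the
following raw `transformers` calls. This is a sketch of the underlying HuggingFace usage only, not the actual
OpenCompass implementation, and the prompt is a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    'huggyllama/llama-7b', padding_side='left', truncation_side='left')
model = AutoModelForCausalLM.from_pretrained('huggyllama/llama-7b')

inputs = tokenizer('Question: 1 + 1 = ?\nAnswer:', return_tensors='pt')
outputs = model.generate(**inputs, max_new_tokens=100)  # mirrors max_out_len=100
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```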

## API-based Models

Currently, OpenCompass supports API-based model inference for the following:

- OpenAI (`opencompass.models.OpenAI`)
- More coming soon

Let's take the OpenAI configuration as an example to see how API-based models are used in the configuration
file.

```python
from opencompass.models import OpenAI

models = [
    dict(
        type=OpenAI,                             # Using the OpenAI model
        # Parameters for `OpenAI` initialization
        path='gpt-4',                            # Specify the model type
        key='YOUR_OPENAI_KEY',                   # OpenAI API Key
        max_seq_len=2048,                        # The max input number of tokens
        # Common parameters shared by various models, not specific to `OpenAI` initialization.
        abbr='GPT-4',                            # Model abbreviation used for result display.
        max_out_len=512,                         # Maximum number of generated tokens.
        batch_size=1,                            # The size of a batch during inference.
        run_cfg=dict(num_gpus=0),                # Resource requirements (no GPU needed)
    ),
]
```
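
Since OpenCompass config files are ordinary Python, one way to avoid committing the API key to version
control is to read it from an environment variable when the config is parsed. A minimal sketch, using plain
Python rather than any OpenCompass-specific feature:

```python
import os
from opencompass.models import OpenAI

models = [
    dict(
        type=OpenAI,
        path='gpt-4',
        key=os.environ['OPENAI_API_KEY'],  # resolved when the config is parsed
        max_seq_len=2048,
        abbr='GPT-4',
        max_out_len=512,
        batch_size=1,
        run_cfg=dict(num_gpus=0),
    ),
]
```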

## Custom Models

If the above methods do not support your model evaluation requirements, you can refer to
[Supporting New Models](../advanced_guides/new_model.md) to add support for new models in OpenCompass.