Asked by: Charlie Parker  Asked: 11/17/2023  Last edited by: Charlie Parker  Updated: 11/17/2023  Views: 38
How does one reinitialize the weights of a Hugging Face LLaMA v2 model the official way as the original model?
Q:
I want to reinitialize the weights of the LLaMA v2 model I am using/downloading. I went through all the docs and the source code in their HF repo:
- https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L721
- https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L1154
- https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L809
- the docs: https://huggingface.co/docs/transformers/main/model_doc/llama#transformers.LlamaModel
- and the papers for LLaMA v1 (https://arxiv.org/pdf/2302.13971.pdf) and LLaMA v2 (https://arxiv.org/pdf/2307.09288.pdf), neither of which says exactly how the model or any of its layers is initialized (perhaps for trade-secret reasons?).
I tried a very simple test: loop through the modules/parameters, reinitialize them the way their code suggests, and print the weight norm to see whether it changes. It never changed, so maybe there is some mutation protection in the PyTorch HF model, I don't know. Am I doing something wrong?
import torch
from transformers import AutoModelForCausalLM, AutoConfig
import torch.nn as nn


def main_reinit_model():
    """
    ref: https://stackoverflow.com/questions/76971761/how-to-adapt-llama-v2-model-to-less-than-7b-parameters
    ref: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L721
    ref: https://chat.openai.com/c/977d0cb0-b819-48ac-be5c-6e482ad5e518
    """
    print('Starting to reinitialize the model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    # print(f'config: {config} {type(config)}')
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    # put model on device cuda
    model = model.to('cuda')
    # print the model's device
    print(f'{model.device=}')
    # print(f'{model=}')
    # print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # go through all parameters, compute the l1 norm of each, sum them, then print the total
    norm_model = sum(p.norm(1) for p in model.parameters())
    print(f'{norm_model=}')
    """
    go through model and print all layers
    """
    # model.init_weights()  # didn't work
    # model._init_weights(module)  # didn't work, needs a module
    # for name, param in model.named_parameters():
    #     model._init_weights(param)
    # model.post_init()
    # loop through modules of the model and reinitialize Linear weights with normal(mean=0, std=0.02)
    reinitialize_weights(model)
    # model._initialize_weights(module)  # didn't work, needs a module
    # for name, param in model.named_parameters():
    #     print(f'{name=} {param.shape=}')
    norm_model = sum(p.norm(1) for p in model.parameters())
    print(f'{norm_model=}')


def reinitialize_weights(model) -> None:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            nn.init.normal_(module.weight, mean=0, std=0.02)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)


def _init_weights(self, module):
    # note: defined here but never called anywhere in this script
    std = self.config.initializer_range
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=100.0, std=std)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.Embedding):
        module.weight.data.normal_(mean=0.0, std=std)
        if module.padding_idx is not None:
            module.weight.data[module.padding_idx].zero_()


def main_generate_smaller_model():
    """
    ref: https://stackoverflow.com/questions/76971761/how-to-adapt-llama-v2-model-to-less-than-7b-parameters
    """
    print('Starting to reinitialize the model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    print(f'config: {config} {type(config)}')
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # Modify the config to reduce size
    config.hidden_size = 2048
    config.num_hidden_layers = 12
    # Create new smaller model from modified config
    smaller_model = AutoModelForCausalLM.from_config(config)
    print("New number of parameters:", sum(p.numel() for p in smaller_model.parameters()))


if __name__ == '__main__':
    import time
    start = time.time()
    # main_generate_smaller_model()
    main_reinit_model()
    print('Done!\a\a\a')
The output never shows the weight norm changing:
Starting to reinitialize the model...
model.device=device(type='cuda', index=0)
norm_model=tensor(1.0779e+08, device='cuda:0', grad_fn=<AddBackward0>)
norm_model=tensor(1.0779e+08, device='cuda:0', grad_fn=<AddBackward0>)
Done!
What am I doing wrong? I just need to know the right/official way to reinitialize the weights for LLaMA. What exact init method and values are used?
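For reference, here is a minimal sketch of what I believe the "official" path would be, assuming the construction-time initializer really is LlamaPreTrainedModel._init_weights (normal with std = config.initializer_range), which is an untested assumption on my part:

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
print(f'{config.initializer_range=}')  # should print 0.02 if I'm reading the released 7B config right
model = AutoModelForCausalLM.from_config(config)

# nn.Module.apply() visits every submodule, so this re-runs the same per-module
# initializer the library uses when it builds the model from a config:
# Linear and Embedding weights drawn from N(0, config.initializer_range).
model.apply(model._init_weights)

If that is the init used, then re-drawing the Linear weights from the same N(0, 0.02) would not be expected to move a summed norm over billions of parameters by much, which might be why my printout looks unchanged.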
Related
- Related blog: https://blog.briankitano.com/llama-from-scratch/
- HF forum question on pretraining: https://discuss.huggingface.co/t/can-i-pretrain-llama-from-scratch/37821/7
- HF Discord: https://discord.com/channels/879548962464493619/1174911090254172231/1174911090254172231
- HF forum question on reinit: https://discuss.huggingface.co/t/how-does-one-reinitialize-the-weights-of-a-hugging-face-llama-v2-model-the-official-way-as-the-original-model/62547
- Related: How to adapt the LLaMA v2 model to fewer than 7B parameters?
Answers:
-1 votes
user13905637
12/1/2023
#1
Try this example:
import torch
from transformers import AutoModelForCausalLM, AutoConfig
import torch.nn as nn


def main_reinit_model():
    print('Starting to reinitialize the model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    # Reinitialize weights
    reinitialize_weights(model)
    # Move the model to GPU (cuda) after initializing weights
    model = model.to('cuda')
    # Print the model's device
    print(f'{model.device=}')
    # Print the original number of parameters
    print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # Print the model architecture
    print(model)


def reinitialize_weights(model) -> None:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            nn.init.normal_(module.weight, mean=0, std=0.02)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)


def main_generate_smaller_model():
    """
    ref: https://stackoverflow.com/questions/76971761/how-to-adapt-llama-v2-model-to-less-than-7b-parameters
    """
    print('Starting to generate a smaller model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    print(f'config: {config} {type(config)}')
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # Modify the config to reduce size
    config.hidden_size = 2048
    config.num_hidden_layers = 12
    # Create a new smaller model from the modified config
    smaller_model = AutoModelForCausalLM.from_config(config)
    print("New number of parameters:", sum(p.numel() for p in smaller_model.parameters()))


if __name__ == '__main__':
    import time
    start = time.time()
    # main_generate_smaller_model()
    main_reinit_model()
    print('Done!\a\a\a')
I found there may be a problem in the main_reinit_model function. Specifically, you are trying to move the model to the GPU (cuda) before initializing the weights.
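Not part of the answer above, but if you want to confirm whether the reinit actually touched the tensors, a minimal check could look like this (it assumes the model and reinitialize_weights from the snippet above, plus the usual LlamaForCausalLM module layout):

import torch

# Clone one concrete weight before reinitializing, so a change is visible
# even when a summed norm over billions of parameters barely moves.
probe = model.model.layers[0].self_attn.q_proj.weight
before = probe.detach().clone()

reinitialize_weights(model)

print('tensor changed:', not torch.allclose(before, probe.detach()))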
0 votes
skaltenp
12/15/2023
#2
Maybe the Hugging Face course will help. It has an example of how to instantiate GPT-2 from a pretrained config.
The important code is this:
config = AutoConfig.from_pretrained(
    "gpt2",  # change this to "meta-llama/Llama-2-7b-hf"
    vocab_size=len(tokenizer),
    n_ctx=context_length,  # n_ctx is a GPT-2 config field; LLaMA uses max_position_embeddings
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    trust_remote_code=True,  # maybe you need this
    use_auth_token=True,  # you will need this for LLaMA2 (it is a gated model)
)
I think this should give you a new model with the same specs as the original LLaMA 2. Edit: you will need to log in with huggingface_hub; maybe take a look at the quickstart.
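A minimal sketch of the login step (the token string is a placeholder, use your own access token):

from huggingface_hub import login

# Llama 2 checkpoints are gated, so authenticate before downloading the config/weights.
login(token="hf_your_token_here")  # or run `huggingface-cli login` once in a shell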
Other references: