Asked by: Charlie Parker  Asked: 11/17/2023  Last edited by: Charlie Parker  Updated: 11/17/2023  Views: 38
How does one reinitialize the weights of a Hugging Face LLaMA v2 model the official way as the original model?
Q:
I want to reinitialize the weights of the LLaMA v2 model I am using/downloading. I went through all the docs and the source code in their HF repo:
- https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L721
- https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L1154
- https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L809
- the docs: https://huggingface.co/docs/transformers/main/model_doc/llama#transformers.LlamaModel
- and the papers for LLaMA v1 (https://arxiv.org/pdf/2302.13971.pdf) and LLaMA v2 (https://arxiv.org/pdf/2307.09288.pdf), neither of which says exactly how the model or any of its layers is initialized (perhaps for trade-secret reasons?).
I tried a very simple test: loop through the modules/parameters, reinitialize them the way their code suggests, and print the weight norm to see whether it changes. It never changed, so maybe there is some mutation protection in the PyTorch HF model, I don't know. Am I doing something wrong?
import torch
from transformers import AutoModelForCausalLM, AutoConfig
import torch.nn as nn


def main_reinit_model():
    """
    ref: https://stackoverflow.com/questions/76971761/how-to-adapt-llama-v2-model-to-less-than-7b-parameters
    ref: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L721
    ref: https://chat.openai.com/c/977d0cb0-b819-48ac-be5c-6e482ad5e518
    """
    print('Starting to reinitialize the model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    # print(f'config: {config} {type(config)}')
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    # put model on device cuda
    model = model.to('cuda')
    # print the model's device
    print(f'{model.device=}')
    # print(f'{model=}')
    # print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # go through all parameters, compute the l1 norm of each, sum them, then print the total
    norm_model = sum(p.norm(1) for p in model.parameters())
    print(f'{norm_model=}')
    """
    go through model and print all layers
    """
    # model.init_weights()  # didn't work
    # model._init_weights(module)  # didn't work, needs a module
    # for name, param in model.named_parameters():
    #     model._init_weights(param)
    # model.post_init()
    # loop through modules of the model and reinitialize Linear weights with normal(mean=0, std=0.02)
    reinitialize_weights(model)
    # model._initialize_weights(module)  # didn't work, needs a module
    # for name, param in model.named_parameters():
    #     print(f'{name=} {param.shape=}')
    norm_model = sum(p.norm(1) for p in model.parameters())
    print(f'{norm_model=}')


def reinitialize_weights(model) -> None:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            nn.init.normal_(module.weight, mean=0, std=0.02)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)


def _init_weights(self, module):
    # note: defined here but never called anywhere in this script
    std = self.config.initializer_range
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=100.0, std=std)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.Embedding):
        module.weight.data.normal_(mean=0.0, std=std)
        if module.padding_idx is not None:
            module.weight.data[module.padding_idx].zero_()


def main_generate_smaller_model():
    """
    ref: https://stackoverflow.com/questions/76971761/how-to-adapt-llama-v2-model-to-less-than-7b-parameters
    """
    print('Starting to reinitialize the model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    print(f'config: {config} {type(config)}')
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # Modify the config to reduce size
    config.hidden_size = 2048
    config.num_hidden_layers = 12
    # Create new smaller model from modified config
    smaller_model = AutoModelForCausalLM.from_config(config)
    print("New number of parameters:", sum(p.numel() for p in smaller_model.parameters()))


if __name__ == '__main__':
    import time
    start = time.time()
    # main_generate_smaller_model()
    main_reinit_model()
    print('Done!\a\a\a')
The output never shows the weight norm changing:
Starting to reinitialize the model...
model.device=device(type='cuda', index=0)
norm_model=tensor(1.0779e+08, device='cuda:0', grad_fn=<AddBackward0>)
norm_model=tensor(1.0779e+08, device='cuda:0', grad_fn=<AddBackward0>)
Done!
What am I doing wrong? I just need to know the right/official way to reinitialize the weights for LLaMA. What exact init method and values are used?
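For reference, here is a minimal sketch of what I believe the "official" path would be, assuming the construction-time initializer really is LlamaPreTrainedModel._init_weights (normal with std = config.initializer_range), which is an untested assumption on my part:

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
print(f'{config.initializer_range=}')  # should print 0.02 if I'm reading the released 7B config right
model = AutoModelForCausalLM.from_config(config)

# nn.Module.apply() visits every submodule, so this re-runs the same per-module
# initializer the library uses when it builds the model from a config:
# Linear and Embedding weights drawn from N(0, config.initializer_range).
model.apply(model._init_weights)

If that is the init used, then re-drawing the Linear weights from the same N(0, 0.02) would not be expected to move a summed norm over billions of parameters by much, which might be why my printout looks unchanged.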
Related
- Related blog: https://blog.briankitano.com/llama-from-scratch/
- HF forum question on pretraining: https://discuss.huggingface.co/t/can-i-pretrain-llama-from-scratch/37821/7
- HF Discord: https://discord.com/channels/879548962464493619/1174911090254172231/1174911090254172231
- HF forum question on reinit: https://discuss.huggingface.co/t/how-does-one-reinitialize-the-weights-of-a-hugging-face-llama-v2-model-the-official-way-as-the-original-model/62547
- Related: How to adapt the LLaMA v2 model to fewer than 7B parameters?
Answers:
-1 votes
user13905637
12/1/2023
#1
Try this example:
import torch
from transformers import AutoModelForCausalLM, AutoConfig
import torch.nn as nn


def main_reinit_model():
    print('Starting to reinitialize the model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    # Reinitialize weights
    reinitialize_weights(model)
    # Move the model to GPU (cuda) after initializing weights
    model = model.to('cuda')
    # Print the model's device
    print(f'{model.device=}')
    # Print the original number of parameters
    print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # Print the model architecture
    print(model)


def reinitialize_weights(model) -> None:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            nn.init.normal_(module.weight, mean=0, std=0.02)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)


def main_generate_smaller_model():
    """
    ref: https://stackoverflow.com/questions/76971761/how-to-adapt-llama-v2-model-to-less-than-7b-parameters
    """
    print('Starting to generate a smaller model...')
    # Load the pretrained LLaMA v2 config
    config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
    print(f'config: {config} {type(config)}')
    # Print the original number of parameters
    model = AutoModelForCausalLM.from_config(config)
    print("Original number of parameters:", sum(p.numel() for p in model.parameters()))
    # Modify the config to reduce size
    config.hidden_size = 2048
    config.num_hidden_layers = 12
    # Create a new smaller model from the modified config
    smaller_model = AutoModelForCausalLM.from_config(config)
    print("New number of parameters:", sum(p.numel() for p in smaller_model.parameters()))


if __name__ == '__main__':
    import time
    start = time.time()
    # main_generate_smaller_model()
    main_reinit_model()
    print('Done!\a\a\a')
I found there may be a problem in the main_reinit_model function. Specifically, you are trying to move the model to the GPU (cuda) before initializing the weights.
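Not part of the answer above, but if you want to confirm whether the reinit actually touched the tensors, a minimal check could look like this (it assumes the model and reinitialize_weights from the snippet above, plus the usual LlamaForCausalLM module layout):

import torch

# Clone one concrete weight before reinitializing, so a change is visible
# even when a summed norm over billions of parameters barely moves.
probe = model.model.layers[0].self_attn.q_proj.weight
before = probe.detach().clone()

reinitialize_weights(model)

print('tensor changed:', not torch.allclose(before, probe.detach()))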
0 votes
skaltenp
12/15/2023
#2
Maybe the Hugging Face course will help. It has an example of how to instantiate GPT-2 from a pretrained config.
The important code is this:
config = AutoConfig.from_pretrained(
    "gpt2",  # change this to "meta-llama/Llama-2-7b-hf"
    vocab_size=len(tokenizer),
    n_ctx=context_length,  # n_ctx is a GPT-2 config field; LLaMA uses max_position_embeddings
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    trust_remote_code=True,  # maybe you need this
    use_auth_token=True,  # you will need this for LLaMA2 (it is a gated model)
)
I think this should give you a new model with the same specs as the original LLaMA 2. Edit: you will need to log in with huggingface_hub; maybe take a look at the quickstart.
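A minimal sketch of the login step (the token string is a placeholder, use your own access token):

from huggingface_hub import login

# Llama 2 checkpoints are gated, so authenticate before downloading the config/weights.
login(token="hf_your_token_here")  # or run `huggingface-cli login` once in a shell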
Other references: