When I set load_in_8bit=True, some errors occurred

Q:

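I am running alpaca-lora and want to fine-tune it on my own data and then use it for generation. But in both generate.py and finetune.py, as soon as I set load_in_8bit=True, generation stops working. The model outputs a string of question marks, as in this screenshot: [screenshot]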

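I printed out its output vectors [screenshot], and it looks like nothing is being generated correctly. When I set load_in_8bit=False, however, both generation and fine-tuning work normally.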
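
When inspecting printed vectors like these, it can help to count how many entries are NaN or infinite rather than reading them by eye. The following is a minimal sketch using only the standard library; `summarize_values` is a hypothetical helper name, and it assumes the values have first been flattened to plain Python floats (for example with `generation_output.scores[0].flatten().tolist()` in the script from this question). Note that `-inf` can appear legitimately when logits are filtered during sampling, so NaN is usually the more telling signal.

```python
import math


def summarize_values(values):
    """Count finite, NaN and infinite entries in a flat sequence of floats."""
    summary = {"finite": 0, "nan": 0, "inf": 0}
    for v in values:
        if math.isnan(v):
            summary["nan"] += 1
        elif math.isinf(v):
            summary["inf"] += 1
        else:
            summary["finite"] += 1
    return summary


# Plain-Python stand-in for one step of scores (made-up illustrative values).
example = [0.5, float("nan"), float("-inf"), 1.25]
print(summarize_values(example))  # {'finite': 2, 'nan': 1, 'inf': 1}
```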

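I have installed bitsandbytes and accelerate correctly, and no errors are reported when I test them. I have been stuck on this problem for a week, so I am asking for help. Thanks!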
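
Since the behavior depends on which library versions are installed together, a question like this is easier to answer when the exact versions are included. Below is a small sketch for collecting them with the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper name, and the package list is an assumption based on the imports and libraries mentioned in this question.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names assumed from the libraries mentioned in the question.
report = installed_versions(
    ["torch", "transformers", "peft", "bitsandbytes", "accelerate"]
)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```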

Here is my generate.py code:

from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig

# Load the base LLaMA tokenizer and model from "llama1"
tokenizer = LlamaTokenizer.from_pretrained("llama1")
model = LlamaForCausalLM.from_pretrained(
    "llama1",
    load_in_8bit=True,  # output is garbage with True, fine with False
    device_map="auto",
)
# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, "tloen/alpaca-lora")


def alpaca_talk(text):
    # Tokenize the prompt and move the input ids to the GPU
    inputs = tokenizer(
        text,
        return_tensors="pt",
    )
    input_ids = inputs["input_ids"].cuda()
    # temperature and top_p only take effect when sampling (do_sample=True)
    generation_config = GenerationConfig(
        temperature=0.9,
        top_p=0.75,
    )
    print("Generating...")
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    # Decode and print every returned sequence
    for s in generation_output.sequences:
        print(tokenizer.decode(s))

for input_text in [
    """Below is an instruction that describes a task. Write a response that appropriately completes the request.
    ### Instruction:
    What steps should I ....?
    ### Response:
    """
]:
    alpaca_talk(input_text)
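
A side note on the script above: because the prompt is a triple-quoted string written inside an indented list, the four leading spaces on each continuation line become part of the prompt text. Whether this has any bearing on the 8-bit problem is not established here. If the indentation is unwanted, one possible approach is `textwrap.dedent`; `build_prompt` below is a hypothetical helper that keeps the same line layout as the prompt in the question.

```python
import textwrap


def build_prompt(instruction):
    """Fill an instruction into the prompt layout from the question, unindented."""
    template = textwrap.dedent("""\
        Below is an instruction that describes a task. Write a response that appropriately completes the request.
        ### Instruction:
        {instruction}
        ### Response:
        """)
    return template.format(instruction=instruction)


prompt = build_prompt("What steps should I ....?")
print(repr(prompt))
```

The returned string can then be passed to `alpaca_talk` in place of the inline literal.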

Has anyone else run into this, or do you have an idea which part is causing the problem?


large-language-model alpaca
