Llama 2 with Langchain tools

Asked by: tail_recursion | Asked: 11/16/2023 | Last edited by: tail_recursion | Updated: 11/17/2023 | Views: 329

Q:

Edit:

I have found that this works with Llama 2 70b but not with Llama 2 13b. Llama 2 13b uses the tool correctly and observes the final answer in its agent_scratchpad, but it outputs an empty string at the end, whereas Llama 2 70b outputs "It looks like the answer is 18.37917367995256!", which is correct.

Original:

I am trying to use Llama 2 with Langchain tools, following this tutorial (you don't need to read the tutorial; all the code is included in this question).

My code is very similar to the tutorial's, except that I am using a local model rather than connecting to Hugging Face, and I am not quantizing with bitsandbytes, since that requires CUDA and I am on macOS.
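(For reference, the quantized loading in the tutorial presumably looks something like the sketch below. This is my reconstruction of a typical 4-bit bitsandbytes setup, not the tutorial's exact code, and it requires CUDA, which is why I skip it on macOS.)

import torch
import transformers

# hypothetical 4-bit quantization config of the kind the tutorial likely uses
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    'meta-llama/Llama-2-13b-chat-hf',
    quantization_config=bnb_config,
    device_map="auto",  # needs CUDA (and the accelerate package)
)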

I am using the unquantized Meta Llama 2 13b chat model, meta-llama/Llama-2-13b-chat-hf.

The model appears to output JSON correctly, but for some reason I get "Could not parse LLM output".
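(For context, the chat-conversational-react-description agent parses the model's reply by pulling a ```json fenced block out of the text with parse_json_markdown and loading it. A rough sketch of that behaviour, assuming I am reading the langchain 0.0.336 source correctly:)

from langchain.output_parsers.json import parse_json_markdown

sample = """Assistant: ```json
{"action": "Calculator",
 "action_input": "4**2.1"}
```"""

# extracts the fenced JSON block and json.loads it
print(parse_json_markdown(sample))
# -> {'action': 'Calculator', 'action_input': '4**2.1'}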

Here is the code. Note: I added the following to the prompt: "When Assistant responds with JSON they make sure to enclose the JSON with three back ticks." This fixed a problem where the model wrapped its JSON output incorrectly (in single backticks with a "json" tag).

import transformers
model_id = './Models/Llama_2/llama_2_13b'

# initialize the model
model = transformers.AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

generate_text = transformers.pipeline(
    model=model, tokenizer=tokenizer,
    return_full_text=True,  # langchain expects the full text
    task='text-generation',
    # we pass model parameters here too
    temperature=0.01,  # 'randomness' of outputs, 0.0 is the min and 1.0 the max
    max_new_tokens=512,  # max number of tokens to generate in the output
    repetition_penalty=1.1  # without this output begins repeating
)

from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=generate_text)

from langchain.memory import ConversationBufferWindowMemory
from langchain.agents import load_tools

memory = ConversationBufferWindowMemory(
    memory_key="chat_history", k=5, return_messages=True, output_key="output"
)

tools = load_tools(["llm-math"], llm=llm)

from langchain.agents import initialize_agent

# initialize agent
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
    handle_parsing_errors=True
)

# special tokens used by llama 2 chat
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

# create the system message
sys_msg = "<s>" + B_SYS + """Assistant is an expert JSON builder designed to assist with a wide range of tasks.

Assistant is able to respond to the User and use tools using JSON strings that contain "action" and "action_input" parameters.

All of Assistant's communication is performed using this JSON format.

Assistant can also use tools by responding to the user with tool use instructions in the same "action" and "action_input" JSON format. Tools available to Assistant are:

- "Calculator": Useful for when you need to answer questions about math.
  - To use the calculator tool, Assistant should write like so:
    ```json
    {{"action": "Calculator",
      "action_input": "sqrt(4)"}}
    ```

When Assistant responds with JSON they make sure to enclose the JSON with three back ticks.

Here are some previous conversations between the Assistant and User:

User: Hey how are you today?
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "I'm good thanks, how are you?"}}
```
User: I'm great, what is the square root of 4?
Assistant: ```json
{{"action": "Calculator",
 "action_input": "sqrt(4)"}}
```
User: 2.0
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "It looks like the answer is 2!"}}
```
User: Thanks could you tell me what 4 to the power of 2 is?
Assistant: ```json
{{"action": "Calculator",
 "action_input": "4**2"}}
```
User: 16.0
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "It looks like the answer is 16!"}}
```

Here is the latest conversation between Assistant and User.""" + E_SYS

new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)

agent.agent.llm_chain.prompt = new_prompt

instruction = B_INST + " Respond to the following in JSON with 'action' and 'action_input' values " + E_INST
human_msg = instruction + "\nUser: {input}"

agent.agent.llm_chain.prompt.messages[2].prompt.template = human_msg
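(A quick sanity check, not from the tutorial, to confirm the prompt surgery took effect: print the templates the agent will actually use.)

# inspect the rewritten system and human message templates
print(agent.agent.llm_chain.prompt.messages[0].prompt.template)
print(agent.agent.llm_chain.prompt.messages[2].prompt.template)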

I then prompt it:

agent("hey how are you today?")

and I get the following:

> Entering new AgentExecutor chain...


Assistant: ```json
{"action": "Final Answer",
 "action_input": "I'm good thanks, how are you?"}
```

> Finished chain.
{'input': 'hey how are you today?',
 'chat_history': [],
 'output': "I'm good thanks, how are you?"}

That works fine, but when I prompt it with a question that requires using a tool:

agent("what is 4 to the power of 2.1?")

I get:

> Entering new AgentExecutor chain...


Assistant: ```json
{"action": "Calculator",
 "action_input": "4**2.1"}
```
Observation: Answer: 18.37917367995256
Thought:Could not parse LLM output: 
Observation: Invalid or incomplete response
Thought:Could not parse LLM output: 
Observation: Invalid or incomplete response
Thought:Could not parse LLM output: 
Observation: Invalid or incomplete response

It just gets stuck in a loop, repeating "Could not parse LLM output" and "Invalid or incomplete response".
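(As a stopgap while debugging, the runaway loop can at least be bounded by capping the agent's steps. A sketch of the same initialize_agent call with one extra argument; this limits the loop but does not fix the parsing itself:)

agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
    handle_parsing_errors=True,
    max_iterations=3,  # give up after a few steps instead of looping forever
)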

Does anyone know how to fix the "Could not parse LLM output" error?

Presumably this code worked for the author of the tutorial.

I am on macOS Sonoma, using Python 3.11.5 with Anaconda, tensorflow 2.15.0, transformers 4.35.2, and langchain 0.0.336.

Edit:

I updated the code to use the output parser from here.

from langchain.agents import AgentOutputParser
from langchain.agents.conversational_chat.prompt import FORMAT_INSTRUCTIONS
from langchain.output_parsers.json import parse_json_markdown
from langchain.schema import AgentAction, AgentFinish

class OutputParser(AgentOutputParser):
    def get_format_instructions(self) -> str:
        return FORMAT_INSTRUCTIONS

    def parse(self, text: str) -> AgentAction | AgentFinish:
        try:
            # this will work IF the text is a valid JSON with action and action_input
            response = parse_json_markdown(text)
            action, action_input = response["action"], response["action_input"]
            if action == "Final Answer":
                # this means the agent is finished so we call AgentFinish
                return AgentFinish({"output": action_input}, text)
            else:
                # otherwise the agent wants to use an action, so we call AgentAction
                return AgentAction(action, action_input, text)
        except Exception:
            # sometimes the agent will return a string that is not a valid JSON
            # often this happens when the agent is finished
            # so we just return the text as the output
            return AgentFinish({"output": text}, text)

    @property
    def _type(self) -> str:
        return "conversational_chat"

# initialize output parser for agent
parser = OutputParser()
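
(Testing the parser in isolation shows the fallback behaviour that matters below: anything that is not valid fenced JSON, including an empty string, comes back as an AgentFinish whose output is just the raw text. The sample strings here are made up.)

# a fenced tool call parses to an AgentAction
print(parser.parse('```json\n{"action": "Calculator", "action_input": "sqrt(4)"}\n```'))

# non-JSON text, or an empty reply, falls through to AgentFinish,
# so an empty LLM reply becomes an empty 'output'
print(parser.parse(""))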

Adding

agent_kwargs={"output_parser": parser}

to initialize_agent makes the parsing errors go away, but the LLM's final output is still empty.
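(For clarity, here is the updated call, restating the earlier arguments with the parser wired in:)

agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
    handle_parsing_errors=True,
    agent_kwargs={"output_parser": parser},
)

Prompting with the same tool question now gives: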

> Entering new AgentExecutor chain...

Assistant: ```json
{"action": "Calculator",
 "action_input": "4**2.1"}
```
Observation: Answer: 18.37917367995256
Thought:

> Finished chain.
{'input': 'what is 4 to the power of 2.1?',
 'chat_history': [HumanMessage(content='hey how are you today?'),
  AIMessage(content="I'm good thanks, how are you?")],
 'output': ''}
python python-3.x langchain llama



A: No answers yet