Llama 2 with Langchain tools

Asked by: tail_recursion | Asked: 11/16/2023 | Last edited by: tail_recursion | Updated: 11/17/2023 | Views: 329

Q:

Edit:

I have found that this works with Llama 2 70b but not with Llama 2 13b. Llama 2 13b uses the tool correctly and observes the final answer in its agent_scratchpad, but it outputs an empty string at the end, whereas Llama 2 70b outputs "It looks like the answer is 18.37917367995256!", which is correct.

Original:

I am trying to use Llama 2 with Langchain tools, following this tutorial (you don't need to read the tutorial; all the code is included in this question).

My code is very similar to the tutorial's, except that I am using a local model rather than connecting to Hugging Face, and I am not quantizing with bitsandbytes, since that requires CUDA and I am on macOS.
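(For reference, the quantized loading in the tutorial presumably looks something like the sketch below. This is my reconstruction of a typical 4-bit bitsandbytes setup, not the tutorial's exact code, and it requires CUDA, which is why I skip it on macOS.)

import torch
import transformers

# hypothetical 4-bit quantization config of the kind the tutorial likely uses
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    'meta-llama/Llama-2-13b-chat-hf',
    quantization_config=bnb_config,
    device_map="auto",  # needs CUDA (and the accelerate package)
)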

I am using the unquantized Meta Llama 2 13b chat model, meta-llama/Llama-2-13b-chat-hf.

The model appears to output JSON correctly, but for some reason I get "Could not parse LLM output".
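(For context, the chat-conversational-react-description agent parses the model's reply by pulling a ```json fenced block out of the text with parse_json_markdown and loading it. A rough sketch of that behaviour, assuming I am reading the langchain 0.0.336 source correctly:)

from langchain.output_parsers.json import parse_json_markdown

sample = """Assistant: ```json
{"action": "Calculator",
 "action_input": "4**2.1"}
```"""

# extracts the fenced JSON block and json.loads it
print(parse_json_markdown(sample))
# -> {'action': 'Calculator', 'action_input': '4**2.1'}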

Here is the code. Note: I added the following to the prompt: "When Assistant responds with JSON they make sure to enclose the JSON with three back ticks." This fixed a problem where the model wrapped its JSON output incorrectly (in single backticks with a "json" tag).

import transformers
model_id = './Models/Llama_2/llama_2_13b'

# initialize the model
model = transformers.AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

generate_text = transformers.pipeline(
    model=model, tokenizer=tokenizer,
    return_full_text=True,  # langchain expects the full text
    task='text-generation',
    # we pass model parameters here too
    temperature=0.01,  # 'randomness' of outputs, 0.0 is the min and 1.0 the max
    max_new_tokens=512,  # max number of tokens to generate in the output
    repetition_penalty=1.1  # without this output begins repeating
)

from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=generate_text)

from langchain.memory import ConversationBufferWindowMemory
from langchain.agents import load_tools

memory = ConversationBufferWindowMemory(
    memory_key="chat_history", k=5, return_messages=True, output_key="output"
)

tools = load_tools(["llm-math"], llm=llm)

from langchain.agents import initialize_agent

# initialize agent
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
    handle_parsing_errors=True
)

# special tokens used by llama 2 chat
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

# create the system message
sys_msg = "<s>" + B_SYS + """Assistant is an expert JSON builder designed to assist with a wide range of tasks.

Assistant is able to respond to the User and use tools using JSON strings that contain "action" and "action_input" parameters.

All of Assistant's communication is performed using this JSON format.

Assistant can also use tools by responding to the user with tool use instructions in the same "action" and "action_input" JSON format. Tools available to Assistant are:

- "Calculator": Useful for when you need to answer questions about math.
  - To use the calculator tool, Assistant should write like so:
    ```json
    {{"action": "Calculator",
      "action_input": "sqrt(4)"}}
    ```

When Assistant responds with JSON they make sure to enclose the JSON with three back ticks.

Here are some previous conversations between the Assistant and User:

User: Hey how are you today?
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "I'm good thanks, how are you?"}}
```
User: I'm great, what is the square root of 4?
Assistant: ```json
{{"action": "Calculator",
 "action_input": "sqrt(4)"}}
```
User: 2.0
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "It looks like the answer is 2!"}}
```
User: Thanks could you tell me what 4 to the power of 2 is?
Assistant: ```json
{{"action": "Calculator",
 "action_input": "4**2"}}
```
User: 16.0
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "It looks like the answer is 16!"}}
```

Here is the latest conversation between Assistant and User.""" + E_SYS

new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)

agent.agent.llm_chain.prompt = new_prompt

instruction = B_INST + " Respond to the following in JSON with 'action' and 'action_input' values " + E_INST
human_msg = instruction + "\nUser: {input}"

agent.agent.llm_chain.prompt.messages[2].prompt.template = human_msg
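(A quick sanity check, not from the tutorial, to confirm the prompt surgery took effect: print the templates the agent will actually use.)

# inspect the rewritten system and human message templates
print(agent.agent.llm_chain.prompt.messages[0].prompt.template)
print(agent.agent.llm_chain.prompt.messages[2].prompt.template)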

I then prompt it:

agent("hey how are you today?")

and I get the following:

> Entering new AgentExecutor chain...


Assistant: ```json
{"action": "Final Answer",
 "action_input": "I'm good thanks, how are you?"}
```

> Finished chain.
{'input': 'hey how are you today?',
 'chat_history': [],
 'output': "I'm good thanks, how are you?"}

That works fine, but when I prompt it with a question that requires using a tool:

agent("what is 4 to the power of 2.1?")

I get:

> Entering new AgentExecutor chain...


Assistant: ```json
{"action": "Calculator",
 "action_input": "4**2.1"}
```
Observation: Answer: 18.37917367995256
Thought:Could not parse LLM output: 
Observation: Invalid or incomplete response
Thought:Could not parse LLM output: 
Observation: Invalid or incomplete response
Thought:Could not parse LLM output: 
Observation: Invalid or incomplete response

It just gets stuck in a loop, repeating "Could not parse LLM output" and "Invalid or incomplete response".
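(As a stopgap while debugging, the runaway loop can at least be bounded by capping the agent's steps. A sketch of the same initialize_agent call with one extra argument; this limits the loop but does not fix the parsing itself:)

agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
    handle_parsing_errors=True,
    max_iterations=3,  # give up after a few steps instead of looping forever
)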

Does anyone know how to fix the "Could not parse LLM output" error?

Presumably this code worked for the author of the tutorial.

I am on macOS Sonoma, using Python 3.11.5 with Anaconda, tensorflow 2.15.0, transformers 4.35.2, and langchain 0.0.336.

Edit:

I updated the code to use the output parser from here.

from langchain.agents import AgentOutputParser
from langchain.agents.conversational_chat.prompt import FORMAT_INSTRUCTIONS
from langchain.output_parsers.json import parse_json_markdown
from langchain.schema import AgentAction, AgentFinish

class OutputParser(AgentOutputParser):
    def get_format_instructions(self) -> str:
        return FORMAT_INSTRUCTIONS

    def parse(self, text: str) -> AgentAction | AgentFinish:
        try:
            # this will work IF the text is a valid JSON with action and action_input
            response = parse_json_markdown(text)
            action, action_input = response["action"], response["action_input"]
            if action == "Final Answer":
                # this means the agent is finished so we call AgentFinish
                return AgentFinish({"output": action_input}, text)
            else:
                # otherwise the agent wants to use an action, so we call AgentAction
                return AgentAction(action, action_input, text)
        except Exception:
            # sometimes the agent will return a string that is not a valid JSON
            # often this happens when the agent is finished
            # so we just return the text as the output
            return AgentFinish({"output": text}, text)

    @property
    def _type(self) -> str:
        return "conversational_chat"

# initialize output parser for agent
parser = OutputParser()
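
(Testing the parser in isolation shows the fallback behaviour that matters below: anything that is not valid fenced JSON, including an empty string, comes back as an AgentFinish whose output is just the raw text. The sample strings here are made up.)

# a fenced tool call parses to an AgentAction
print(parser.parse('```json\n{"action": "Calculator", "action_input": "sqrt(4)"}\n```'))

# non-JSON text, or an empty reply, falls through to AgentFinish,
# so an empty LLM reply becomes an empty 'output'
print(parser.parse(""))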

Adding

agent_kwargs={"output_parser": parser}

to initialize_agent makes the parsing errors go away, but the LLM's final output is still empty.
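(For clarity, here is the updated call, restating the earlier arguments with the parser wired in:)

agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
    handle_parsing_errors=True,
    agent_kwargs={"output_parser": parser},
)

Prompting with the same tool question now gives: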

> Entering new AgentExecutor chain...

Assistant: ```json
{"action": "Calculator",
 "action_input": "4**2.1"}
```
Observation: Answer: 18.37917367995256
Thought:

> Finished chain.
{'input': 'what is 4 to the power of 2.1?',
 'chat_history': [HumanMessage(content='hey how are you today?'),
  AIMessage(content="I'm good thanks, how are you?")],
 'output': ''}
python python-3.x langchain llama



A: No answers yet