Asked by: tail_recursion Asked: 11/16/2023 Last edited by: tail_recursion Updated: 11/17/2023 Views: 329
Llama 2 with Langchain tools
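I found that this setup works with Llama 2 70b but not with Llama 2 13b. Llama 2 13b uses the tool correctly and observes the final answer in its agent_scratchpad, but it outputs an empty string at the end, whereas Llama 2 70b outputs "It looks like the answer is 18.37917367995256!", which is correct.
I am trying to use Llama 2 with Langchain tools by following this tutorial (you don't have to look at the tutorial, all of the code is included in this question).
My code is very similar to the code in the tutorial, except that I am using a local model instead of connecting to Hugging Face, and I am not using bitsandbytes for quantization because it requires CUDA and I am on macOS.
I am using the unquantized Meta Llama 2 13b chat model, meta-llama/Llama-2-13b-chat-hf.
The model appears to output the JSON correctly, but for some reason I get "Could not parse LLM output".
Here is the code (a \ was added before the triple backticks because of Stack Overflow code formatting). Note: I added the following to the prompt: "When Assistant responds with JSON they make sure to enclose the JSON with three back ticks." This fixed the problem of the model outputting incorrectly formatted JSON (the JSON is now enclosed in backticks with a "json" tag).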
import transformers

model_id = './Models/Llama_2/llama_2_13b'

# initialize the model
model = transformers.AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

generate_text = transformers.pipeline(
    task='text-generation',  # must be given explicitly when a model object (not a model id string) is passed
    model=model, tokenizer=tokenizer,
    return_full_text=True,  # langchain expects the full text
    # we pass model parameters here too
    temperature=0.01,  # 'randomness' of outputs, 0.0 is the min and 1.0 the max
    max_new_tokens=512,  # max number of tokens to generate in the output
    repetition_penalty=1.1  # without this output begins repeating
)

from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=generate_text)
from langchain.memory import ConversationBufferWindowMemory
from langchain.agents import load_tools

memory = ConversationBufferWindowMemory(
    memory_key="chat_history", k=5, return_messages=True, output_key="output"
)
tools = load_tools(["llm-math"], llm=llm)
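from langchain.agents import initialize_agent

# initialize agent
# NOTE: the arguments of this call were cut off; the ones below are an assumption
# inferred from the rest of this question (chat agent prompt, Calculator tool, memory, verbose log)
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    memory=memory,
)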
# special tokens used by llama 2 chat
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
# create the system message
sys_msg = "<s>" + B_SYS + """Assistant is a expert JSON builder designed to assist with a wide range of tasks.
Assistant is able to respond to the User and use tools using JSON strings that contain "action" and "action_input" parameters.
All of Assistant's communication is performed using this JSON format.
Assistant can also use tools by responding to the user with tool use instructions in the same "action" and "action_input" JSON format. Tools available to Assistant are:
- "Calculator": Useful for when you need to answer questions about math.
- To use the calculator tool, Assistant should write like so:
{{"action": "Calculator",
"action_input": "sqrt(4)"}}
When Assistant responds with JSON they make sure to enclose the JSON with three back ticks.
Here are some previous conversations between the Assistant and User:
User: Hey how are you today?
Assistant: ```json
{{"action": "Final Answer",
"action_input": "I'm good thanks, how are you?"}}
User: I'm great, what is the square root of 4?
Assistant: ```json
{{"action": "Calculator",
"action_input": "sqrt(4)"}}
User: 2.0
Assistant: ```json
{{"action": "Final Answer",
"action_input": "It looks like the answer is 2!"}}
User: Thanks could you tell me what 4 to the power of 2 is?
Assistant: ```json
{{"action": "Calculator",
"action_input": "4**2"}}
User: 16.0
Assistant: ```json
{{"action": "Final Answer",
"action_input": "It looks like the answer is 16!"}}
Here is the latest conversation between Assistant and User.""" + E_SYS
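# arguments assumed from context: the system message built above and the same tools
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)
agent.agent.llm_chain.prompt = new_prompt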
instruction = B_INST + " Respond to the following in JSON with 'action' and 'action_input' values " + E_INST
human_msg = instruction + "\nUser: {input}"
agent.agent.llm_chain.prompt.messages[2].prompt.template = human_msg
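The doubled braces in sys_msg are template escapes: LangChain's default prompt templates use f-string style formatting, so {{ and }} come out as literal { and } in the text the model sees, while {input} is filled in. A minimal sketch with plain str.format, using a shortened stand-in for the system message (example_turn is a hypothetical name, and this does not call LangChain itself):

```python
B_INST, E_INST = "[INST]", "[/INST]"

# Shortened stand-in for one few-shot turn of sys_msg (not the full prompt).
example_turn = '{{"action": "Calculator",\n "action_input": "sqrt(4)"}}'

instruction = (
    B_INST
    + " Respond to the following in JSON with 'action' and 'action_input' values "
    + E_INST
)
human_msg = instruction + "\nUser: {input}"

# str.format treats {{ and }} as escaped braces and fills in {input}.
template = example_turn + "\n" + human_msg
rendered = template.format(input="what is 4 to the power of 2.1?")
print(rendered)
```

Printing the rendered text this way is a quick check that the JSON examples reach the model with single braces and that the [INST] markers end up where expected.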
agent("hey how are you today?")
> Entering new AgentExecutor chain...
Assistant: ```json
{"action": "Final Answer",
"action_input": "I'm good thanks, how are you?"}
> Finished chain.
{'input': 'hey how are you today?',
'chat_history': [],
'output': "I'm good thanks, how are you?"}
agent("what is 4 to the power of 2.1?")
> Entering new AgentExecutor chain...
Assistant: ```json
{"action": "Calculator",
"action_input": "4**2.1"}
Observation: Answer: 18.37917367995256
Thought:Could not parse LLM output:
Observation: Invalid or incomplete response
Thought:Could not parse LLM output:
Observation: Invalid or incomplete response
Thought:Could not parse LLM output:
Observation: Invalid or incomplete response
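It just gets stuck in a loop, repeating "Could not parse LLM output" and "Invalid or incomplete response".
Does anyone know how to fix the "Could not parse LLM output" error?
I am using Python 3.11.5 with Anaconda, tensorflow 2.15.0, transformers 4.35.2 and langchain 0.0.336 on macOS Sonoma.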
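To see which raw replies fail to parse, it can help to test them outside the agent. The following is a rough, stdlib-only approximation of pulling a JSON action out of a reply wrapped in three backticks. It is not LangChain's parse_json_markdown, and extract_action is a hypothetical helper:

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Matches an optional "json" tag and captures whatever sits between the fences.
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def extract_action(text: str) -> dict:
    """Return the action dict found in text, or raise ValueError."""
    match = FENCE_RE.search(text)
    payload = match.group(1) if match else text
    data = json.loads(payload.strip())  # JSONDecodeError is a ValueError
    if not isinstance(data, dict) or "action" not in data or "action_input" not in data:
        raise ValueError("reply has no 'action' / 'action_input' keys")
    return data


good = (
    "Assistant: " + FENCE + "json\n"
    '{"action": "Calculator",\n "action_input": "4**2.1"}\n' + FENCE
)
print(extract_action(good))

# An empty reply or plain prose cannot be parsed as a JSON action.
for bad in ["", "Assistant: the answer is 18.38"]:
    try:
        extract_action(bad)
    except ValueError as err:
        print(f"unparseable {bad!r}: {err}")
```

Feeding the text the model produces right after the Observation into a check like this would show whether that reply is empty, plain prose, or malformed JSON.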
from langchain.agents import AgentOutputParser
from langchain.agents.conversational_chat.prompt import FORMAT_INSTRUCTIONS
from langchain.output_parsers.json import parse_json_markdown
from langchain.schema import AgentAction, AgentFinish

class OutputParser(AgentOutputParser):
    def get_format_instructions(self) -> str:
        return FORMAT_INSTRUCTIONS

    def parse(self, text: str) -> AgentAction | AgentFinish:
        try:
            # this will work IF the text is a valid JSON with action and action_input
            response = parse_json_markdown(text)
            action, action_input = response["action"], response["action_input"]
            if action == "Final Answer":
                # this means the agent is finished so we call AgentFinish
                return AgentFinish({"output": action_input}, text)
            else:
                # otherwise the agent wants to use an action, so we call AgentAction
                return AgentAction(action, action_input, text)
        except Exception:
            # sometimes the agent will return a string that is not a valid JSON
            # often this happens when the agent is finished
            # so we just return the text as the output
            return AgentFinish({"output": text}, text)

    @property
    def _type(self) -> str:
        return "conversational_chat"

# initialize output parser for agent
parser = OutputParser()
agent_kwargs={"output_parser": parser}
After passing these agent_kwargs to initialize_agent, the code no longer produces an error, but the final output of the LLM is still empty.
> Entering new AgentExecutor chain...
Assistant: ```json
{"action": "Calculator",
"action_input": "4**2.1"}
Observation: Answer: 18.37917367995256
> Finished chain.
{'input': 'what is 4 to the power of 2.1?',
'chat_history': [HumanMessage(content='hey how are you today?'),
AIMessage(content="I'm good thanks, how are you?")],
'output': ''}
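One possible explanation for the empty 'output', stated as an assumption and not verified against the 13b model: if the model generates no text after the Observation, parsing fails and the except branch of the parser returns that empty text as the final answer. The sketch below mimics the parser's control flow with plain dicts and json.loads (no LangChain types), and adds a hypothetical guard, finish_or_fallback, that falls back to the last tool observation when the final text is blank:

```python
import json


def parse_reply(text: str) -> dict:
    """Mimic the control flow of the OutputParser above using plain dicts."""
    try:
        response = json.loads(text)
        action, action_input = response["action"], response["action_input"]
        if action == "Final Answer":
            return {"type": "finish", "output": action_input}
        return {"type": "action", "tool": action, "tool_input": action_input}
    except Exception:
        # Same fallback as the parser above: unparseable text becomes the output.
        return {"type": "finish", "output": text}


def finish_or_fallback(text: str, last_observation: str) -> dict:
    """Hypothetical guard: never finish with a blank answer."""
    result = parse_reply(text)
    if result["type"] == "finish" and not str(result["output"]).strip():
        result["output"] = last_observation
    return result


# An empty generation is unparseable, so it is returned as an empty final answer.
print(parse_reply(""))

# With the guard, the last tool observation is surfaced instead.
print(finish_or_fallback("", "Answer: 18.37917367995256"))
```

This only hides the symptom. If the reply really is empty, the prompt that the model sees after the Observation would still need to be inspected.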
Answers: No answers yet