Asked by: chibop  Asked: 11/7/2023  Last edited by: Rok Benko, chibop  Updated: 11/18/2023  Views: 3018
OpenAI API: How do I enable JSON mode using the gpt-4-vision-preview model?
Question:
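Update: It seems they made a mistake in the API docs and have now fixed it.
Earlier, it said "when calling gpt-4-vision-preview or gpt-3.5-turbo", but it now reads "when calling gpt-4-1106-preview or gpt-3.5-turbo-1106".
According to Text generation - OpenAI API, "when calling gpt-4-vision-preview or gpt-3.5-turbo, you can set response_format to { type: "json_object" } to enable JSON mode."
However, the following code throws an error: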
{'error': {'message': '1 validation error for Request\nbody -> response_format\n extra fields not permitted (type=value_error.extra)', 'type': 'invalid_request_error', 'param': None, 'code': None}}
If I comment out "response_format": {"type": "json_object"}, it works fine.
import requests

# api_key, prompt and base64_image are assumed to be defined earlier in the script.
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}
payload = {
    "model": "gpt-4-vision-preview",
    # This field is what triggers the "extra fields not permitted" error for this model.
    "response_format": {"type": "json_object"},
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. Your response should be in JSON format."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    }
                }
            ]
        }
    ],
    "max_tokens": 1000,
}
response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
print(response.json())
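The payload above expects `base64_image` to already hold the image as a base64 string. A minimal sketch of how that value might be produced follows; the helper names `encode_image` and `to_data_url` and the file name `photo.jpg` are hypothetical and not part of the original question.

```python
import base64
from pathlib import Path


def encode_image(path: str) -> str:
    """Return the file's bytes as a base64 string (without a data URL prefix)."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def to_data_url(b64: str, mime: str = "image/jpeg") -> str:
    """Wrap a base64 string in the data URL form used in the payload's image_url field."""
    return f"data:{mime};base64,{b64}"
```

Usage would look like `base64_image = encode_image("photo.jpg")`, after which the f-string in the payload builds the same data URL that `to_data_url` returns.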
Answers:
According to the release notes, those GPT models don't support the new JSON output format option. Try gpt-3.5-turbo-1106 or gpt-4-1106-preview instead of gpt-3.5-turbo.
https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
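Comments
Unfortunately, the gpt-4-vision-preview and gpt-3.5-turbo models don't support the JSON output format at the moment.
In the official OpenAI documentation, you can read about JSON mode. Only two models are mentioned: gpt-4-1106-preview and gpt-3.5-turbo-1106.
Therefore, the solution for you is to choose one of these models.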
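One way to avoid the validation error while still sharing request code across models is to add `response_format` only for the models the documentation names. This is a sketch under that assumption: the set below reflects the two models mentioned in this thread as of November 2023 and may be out of date, and `build_payload` is a hypothetical helper.

```python
# Models the documentation quoted in this thread names as supporting JSON mode.
# Assumption: this list reflects November 2023 and may have changed since.
JSON_MODE_MODELS = {"gpt-4-1106-preview", "gpt-3.5-turbo-1106"}


def build_payload(model, messages, max_tokens=1000):
    """Build a chat completions request body, adding response_format only when supported."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    if model in JSON_MODE_MODELS:
        payload["response_format"] = {"type": "json_object"}
    return payload
```

With this, a request for gpt-4-vision-preview is sent without the field, so the JSON instruction in the system message is the only thing steering the output format.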
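Comments
You can get a JSON response back only if you use gpt-4-1106-preview or gpt-3.5-turbo-1106, as stated in the official OpenAI documentation:
"A common way to use Chat Completions is to instruct the model to always return JSON in some format that makes sense for your use case, by providing a system message. This works well, but occasionally the models may generate output that does not parse to valid JSON.
To prevent these errors and improve model performance, when calling gpt-4-1106-preview or gpt-3.5-turbo-1106, you can set response_format to { type: "json_object" } to enable JSON mode. When JSON mode is enabled, the model is constrained to only generate strings that parse into valid JSON."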
Working example in Python
If you run test.py, you'll get the following response:
{ "response": "Hello! How can I assist you today?" }
test.py
import os
from openai import OpenAI

# Pass the API key when constructing the client (it is also read from OPENAI_API_KEY by default).
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

completion = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. Your response should be in JSON format."},
        {"role": "user", "content": "Hello!"}
    ],
    # Enables JSON mode; the messages must also mention JSON, as the system message does.
    response_format={"type": "json_object"}
)
print(completion.choices[0].message.content)
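The Python example above prints the message content, which is still a string. A small sketch of turning it into a Python object is shown here; `parse_json_content` is a hypothetical helper, and the fallback to `None` is a design choice for illustration, not something the answer prescribes. Even in JSON mode the content may fail to parse if the reply was cut off (for example when `max_tokens` is reached), so the parse is guarded.

```python
import json


def parse_json_content(content):
    """Return the parsed JSON value, or None if the text is missing or not valid JSON."""
    try:
        return json.loads(content)
    except (json.JSONDecodeError, TypeError):
        # JSONDecodeError: malformed or truncated JSON; TypeError: content was None.
        return None
```

In the example above this would be called as `data = parse_json_content(completion.choices[0].message.content)`.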
Working example in Node.js
If you run test.js, you'll get the following response:
{ "response": "Hello! How can I assist you today?" }
test.js
const OpenAI = require("openai");

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-4-1106-preview",
    messages: [
      {
        role: "system",
        content:
          "You are a helpful assistant. Your response should be in JSON format.",
      },
      { role: "user", content: "Hello!" },
    ],
    response_format: { type: "json_object" },
  });
  console.log(completion.choices[0].message.content);
}

main();
Comments
I solved it easily with a workaround (ugly, but it works):
response.getBody().toString().split("```json")[1].split("```")[0];
It's in Java, but the concept is the same in any language.
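The same idea can be sketched in Python with a regular expression instead of chained splits. This is an illustrative variant, not the answerer's code: `extract_json` and `FENCE_RE` are hypothetical names, and it assumes the model wraps its JSON in a Markdown code fence (optionally tagged json), which is not guaranteed. Unlike the split version, it falls back to parsing the whole reply when no fence is present.

```python
import json
import re

# Matches a fenced block whose opening fence is optionally tagged "json".
# The backtick run is written as `{3} so this snippet contains no literal fence.
FENCE_RE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)


def extract_json(text):
    """Parse JSON from a reply, whether or not it is wrapped in a code fence."""
    match = FENCE_RE.search(text)
    candidate = match.group(1) if match else text
    # json.loads raises json.JSONDecodeError (a ValueError) if the text is not valid JSON.
    return json.loads(candidate.strip())
```

The caller still has to handle the exception raised when neither the fenced block nor the raw reply is valid JSON.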
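I've been working on a project called convo-lang, which is a mix between a procedural programming language and a prompt templating system. I just added support for JSON mode, and it also works with vision and function calling.
You can define images using the Markdown image format, and those images can then be queried by GPT-4 and GPT-3-turbo. Behind the scenes, Convo defines a function that GPT-4 and GPT-3-turbo can call, and that function runs GPT4-vision as a separate prompt.
The Convo-lang syntax is super easy to use, and there is even a VSCode extension that gives you syntax highlighting and lets you run prompts directly in VSCode for testing and experimenting.
Here is an example of a prompt with an image using JSON mode:
Image of the syntax highlighting here -> https://raw.githubusercontent.com/iyioio/common/main/assets/convo/image-vision-example.png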
> define
Person = struct(
    name?:string
    description?:string
)
@json Person[]
@responseAssign dudes
> user
Describe each person in this picture.

@tokenUsage 213 / 49 / $0.0036000000000000003
@toolId call_dfpKG4bnDRA3UTBjspQr2O4s
> call queryImage(
    "query": "describe each person in this picture",
    "imageUrls": [
        "https://raw.githubusercontent.com/iyioio/common/main/assets/convo/abbey-road.jpg"
    ]
)
> result
__return={
"result": "The image shows four individuals crossing the street in single file on a zebra crossing. They are all adult males, and they appear to be dressed in distinct styles:\n\n1. The first person, walking from right to left, has long hair and is wearing a white suit. He is barefoot and steps off with his right foot onto the crossing.\n\n2. The second individual has darker hair and is wearing a black suit with a white shirt and no tie. He is also stepping off with his right foot, wearing black shoes.\n\n3. The third man is wearing a suit as well, looking somewhat casually dressed compared to the second person. He has a beard and long hair and is wearing an open neck shirt with a black suit, and he is mid-step onto the zebra with his left foot forward, wearing black shoes.\n\n4. The fourth man brings up the rear and is dressed most casually. He wears a denim jacket and jeans with a white shirt underneath. He has shoulder-length hair and is taking a step with his right foot, wearing dark-colored shoes.\n\nThe setting appears to be a quiet, tree-lined street with cars parked on either side, and there is a white Volkswagen Beetle and a black car visible behind them, among other vehicles. The architecture in the background suggests this might be a residential area. The image has a calm, orderly feel, indicative of an everyday urban scene albeit with a touch of timeless style."
}
@tokenUsage 564 / 252 / $0.0132
@format json
@assign dudes
> assistant
[
    {
        "name": "first person",
        "description": "The first person, walking from right to left, has long hair and is wearing a white suit. He is barefoot and steps off with his right foot onto the crossing."
    },
    {
        "name": "second person",
        "description": "The second individual has darker hair and is wearing a black suit with a white shirt and no tie. He is also stepping off with his right foot, wearing black shoes."
    },
    {
        "name": "third person",
        "description": "The third man is wearing a suit as well, looking somewhat casually dressed compared to the second person. He has a beard and long hair and is wearing an open neck shirt with a black suit, and he is mid-step onto the zebra with his left foot forward, wearing black shoes."
    },
    {
        "name": "fourth person",
        "description": "The fourth man brings up the rear and is dressed most casually. He wears a denim jacket and jeans with a white shirt underneath. He has shoulder-length hair and is taking a step with his right foot, wearing dark-colored shoes."
    }
]
NPM - https://www.npmjs.com/package/@iyio/convo-lang
GitHub - https://github.com/iyioio/common/blob/main/packages/convo-lang/README.md
VSCode extension - https://marketplace.visualstudio.com/items?itemName=IYIO.convo-lang-tools
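The general idea this answer describes, a text model that hands the image question to a separate vision prompt, can also be approximated directly with the OpenAI Python client. The sketch below is not how Convo-lang is implemented: it simply chains two calls rather than using function calling, `describe_then_structure` is a hypothetical name, and `client` is assumed to be an `openai.OpenAI` instance (openai>=1.0).

```python
import json


def describe_then_structure(client, image_url, question):
    """Ask a vision model about an image, then have a JSON-mode model structure the answer."""
    # Step 1: free-form description from the vision model (no response_format here,
    # since that is what triggers the validation error in the question).
    vision = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        max_tokens=1000,
    )
    description = vision.choices[0].message.content

    # Step 2: a text model with JSON mode converts the description into JSON.
    structured = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[
            {"role": "system", "content": "Convert the user's text into JSON."},
            {"role": "user", "content": description},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(structured.choices[0].message.content)
```

The trade-off is two requests per image instead of one, in exchange for output that the second model is constrained to emit as parseable JSON.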
Comments