OpenAI API: How do I enable JSON mode using the gpt-4-vision-preview model?

Asked by: chibop Asked: 11/7/2023 Last edited by: Rok Benko, chibop Updated: 11/18/2023 Views: 3018

Q:

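Update: It seems they made a mistake in the API documentation and have now fixed it.

Earlier it said "when calling gpt-4-vision-preview or gpt-3.5-turbo", but now it reads "when calling gpt-4-1106-preview or gpt-3.5-turbo-1106".


According to Text generation - OpenAI API, "When calling gpt-4-vision-preview or gpt-3.5-turbo, you can set response_format to { type: "json_object" } to enable JSON mode."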

However, the following code raises an error:

 {'error': {'message': '1 validation error for Request\nbody -> response_format\n  extra fields not permitted (type=value_error.extra)', 'type': 'invalid_request_error', 'param': None, 'code': None}}

If I comment out "response_format": {"type": "json_object"}, it works fine.

    import requests

    # api_key, prompt, and base64_image are assumed to be defined earlier.
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    
    payload = {
        "model": "gpt-4-vision-preview",
        "response_format": {"type": "json_object"},
        "messages": [
          {
            "role": "system",
            "content": "You are a helpful assistant. Your response should be in JSON format."
          },
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": prompt
              },
              {
                "type": "image_url",
                "image_url": {
                  "url": f"data:image/jpeg;base64,{base64_image}"
                }
              }
            ]
          }
        ],
        "max_tokens": 1000,
    }
    
    response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
    print(response.json())
python openai-api gpt-4


A:

1 upvote Robert Willson 11/7/2023 #1

According to the release notes, those GPT models do not support the new JSON output format option. Try gpt-3.5-turbo-1106 or gpt-4-1106-preview instead of gpt-3.5-turbo.

https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo

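Comments

0 upvotes chibop 11/8/2023
Yes, it seems they fixed the docs. Yesterday it definitely said "when calling gpt-4-vision-preview or gpt-3.5-turbo". I just copied and pasted it into my question.
0 upvotes tweeker 11/7/2023 #2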

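Unfortunately, the gpt-4-vision-preview and gpt-3.5-turbo models do not currently support the JSON output format. You can read about JSON mode in OpenAI's official documentation. Only two models are mentioned there: gpt-4-1106-preview and gpt-3.5-turbo-1106.

So your solution is to choose one of these models.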
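
One way to apply this advice is to attach response_format only when the target model is one of the two listed in the documentation. The following is a minimal sketch, not an official client: the helper name build_payload and the JSON_MODE_MODELS set are hypothetical, and the set only reflects the models named in the documentation at the time of this answer.

```python
# Models the documentation listed as supporting JSON mode when this answer was written.
# This set is an assumption for illustration; check the current docs before relying on it.
JSON_MODE_MODELS = {"gpt-4-1106-preview", "gpt-3.5-turbo-1106"}


def build_payload(model, messages, max_tokens=1000):
    """Build a chat completions request body, adding JSON mode only where supported."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    if model in JSON_MODE_MODELS:
        payload["response_format"] = {"type": "json_object"}
    return payload


messages = [
    {"role": "system", "content": "You are a helpful assistant. Your response should be in JSON format."},
    {"role": "user", "content": "Hello!"},
]
print(build_payload("gpt-4-1106-preview", messages))    # includes response_format
print(build_payload("gpt-4-vision-preview", messages))  # omits response_format
```

With a guard like this, a request to gpt-4-vision-preview is sent without the field that triggers the "extra fields not permitted" error.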


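Comments

0 upvotes chibop 11/8/2023
Yes, it seems they fixed the docs. Yesterday it definitely said "when calling gpt-4-vision-preview or gpt-3.5-turbo". I just copied and pasted from the docs yesterday.
6 upvotes Rok Benko 11/7/2023 #3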

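You can only get a JSON response back when using gpt-4-1106-preview or gpt-3.5-turbo-1106, as stated in the official OpenAI documentation:

A common way to use Chat Completions is to instruct the model to always return JSON in some format that makes sense for your use case, by providing a system message. This works well, but occasionally the models may generate output that does not parse to valid JSON.

To prevent these errors and improve model performance, when calling gpt-4-1106-preview or gpt-3.5-turbo-1106, you can set response_format to { type: "json_object" } to enable JSON mode. When JSON mode is enabled, the model is constrained to only generate strings that parse into valid JSON.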


Working example in Python

If you run test.py, you'll get the following response:

{ "response": "Hello! How can I help you today?" }

test.py

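import os
from openai import OpenAI

# Pass the key to the client when it is created; assigning OpenAI.api_key
# on the class afterwards does not configure an existing client.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))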

completion = client.chat.completions.create(
  model="gpt-4-1106-preview",
  messages=[
    {"role": "system", "content": "You are a helpful assistant. Your response should be in JSON format."},
    {"role": "user", "content": "Hello!"}
  ],
  response_format={"type": "json_object"}
)

print(completion.choices[0].message.content)

Working example in Node.js

If you run test.js, you'll get the following response:

{ "response": "Hello! How can I help you today?" }

test.js

const OpenAI = require("openai");
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-4-1106-preview",
    messages: [
      {
        role: "system",
        content:
          "You are a helpful assistant. Your response should be in JSON format.",
      },
      { role: "user", content: "Hello!" },
    ],
    response_format: { type: "json_object" },
  });

  console.log(completion.choices[0].message.content);
}

main();

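Comments

0 upvotes chibop 11/8/2023
Yes, it seems they fixed the docs. Yesterday it definitely said "when calling gpt-4-vision-preview or gpt-3.5-turbo".
0 upvotes Rateltalk 11/8/2023
When I enable JSON mode, it returns the JSON content from the example I provided.
1 upvote Giovanni Caggianella 11/18/2023 #4

I solved it easily with a workaround (ugly, but it works):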

response.getBody().toString().split("```json")[1].split("```")[0];

It's in Java, but the concept is the same.
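
The same idea can be sketched in Python. This is only an illustration under stated assumptions: the function name extract_fenced_json is hypothetical, and it assumes the model wraps its JSON in a json-tagged Markdown code fence, which is not guaranteed. Unlike the one-liner above, it falls back to parsing the whole reply when no fence is present.

```python
import json

FENCE = "`" * 3  # three backticks, built this way so the example is easy to embed


def extract_fenced_json(text):
    """Return the parsed JSON found inside the first json-tagged fenced block of text."""
    start_marker = FENCE + "json"
    start = text.find(start_marker)
    if start == -1:
        # No fence found: assume the whole reply is JSON.
        return json.loads(text)
    start += len(start_marker)
    end = text.find(FENCE, start)
    if end == -1:
        raise ValueError("opening fence without a closing fence")
    return json.loads(text[start:end])


# Stand-in for a model reply; a real reply would come from the API response body.
reply = "Here you go:\n" + FENCE + 'json\n{"people": 4}\n' + FENCE
print(extract_fenced_json(reply))  # prints {'people': 4}
```

json.loads raises json.JSONDecodeError if the extracted text is not valid JSON, so callers may want to catch that and retry the request.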

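0 upvotes Scott Vance 12/13/2023 #5

I've been working on a project called convo-lang, which is a mix between a procedural programming language and a prompt templating system. I just added support for JSON mode, and it also works with vision and function calling.

You can define images using Markdown image syntax, and those images can be queried by GPT-4 and GPT-3-turbo. Behind the scenes, Convo defines a function that GPT-4 and GPT-3-turbo can call, which uses GPT4-vision to run a separate prompt.

The Convo-lang syntax is very easy to use, and there is even a VSCode extension that gives you syntax highlighting and lets you run prompts directly in VSCode for testing and experimentation.

Here is an example of a prompt with an image that uses JSON mode:

Image with syntax highlighting here -> https://raw.githubusercontent.com/iyioio/common/main/assets/convo/image-vision-example.png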

> define
Person = struct(
    name?:string
    description?:string
)

@json Person[]
@responseAssign dudes
> user
Describe each person in this picture.

![](https://raw.githubusercontent.com/iyioio/common/main/assets/convo/abbey-road.jpg)



@tokenUsage 213 / 49 / $0.0036000000000000003
@toolId call_dfpKG4bnDRA3UTBjspQr2O4s
> call queryImage(
    "query": "describe each person in this picture",
    "imageUrls": [
        "https://raw.githubusercontent.com/iyioio/common/main/assets/convo/abbey-road.jpg"
    ]
)
> result
__return={
    "result": "The image shows four individuals crossing the street in single file on a zebra crossing. They are all adult males, and they appear to be dressed in distinct styles:\n\n1. The first person, walking from right to left, has long hair and is wearing a white suit. He is barefoot and steps off with his right foot onto the crossing.\n\n2. The second individual has darker hair and is wearing a black suit with a white shirt and no tie. He is also stepping off with his right foot, wearing black shoes.\n\n3. The third man is wearing a suit as well, looking somewhat casually dressed compared to the second person. He has a beard and long hair and is wearing an open neck shirt with a black suit, and he is mid-step onto the zebra with his left foot forward, wearing black shoes.\n\n4. The fourth man brings up the rear and is dressed most casually. He wears a denim jacket and jeans with a white shirt underneath. He has shoulder-length hair and is taking a step with his right foot, wearing dark-colored shoes.\n\nThe setting appears to be a quiet, tree-lined street with cars parked on either side, and there is a white Volkswagen Beetle and a black car visible behind them, among other vehicles. The architecture in the background suggests this might be a residential area. The image has a calm, orderly feel, indicative of an everyday urban scene albeit with a touch of timeless style."
}


@tokenUsage 564 / 252 / $0.0132
@format json
@assign dudes
> assistant
[
    {
        "name": "first person",
        "description": "The first person, walking from right to left, has long hair and is wearing a white suit. He is barefoot and steps off with his right foot onto the crossing."
    },
    {
        "name": "second person",
        "description": "The second individual has darker hair and is wearing a black suit with a white shirt and no tie. He is also stepping off with his right foot, wearing black shoes."
    },
    {
        "name": "third person",
        "description": "The third man is wearing a suit as well, looking somewhat casually dressed compared to the second person. He has a beard and long hair and is wearing an open neck shirt with a black suit, and he is mid-step onto the zebra with his left foot forward, wearing black shoes."
    },
    {
        "name": "fourth person",
        "description": "The fourth man brings up the rear and is dressed most casually. He wears a denim jacket and jeans with a white shirt underneath. He has shoulder-length hair and is taking a step with his right foot, wearing dark-colored shoes."
    }
]


NPM - https://www.npmjs.com/package/@iyio/convo-lang

GitHub - https://github.com/iyioio/common/blob/main/packages/convo-lang/README.md

VSCode extension - https://marketplace.visualstudio.com/items?itemName=IYIO.convo-lang-tools