GPT-4V in Practice: API Calls and Usage Scenarios

Note: due to network restrictions, the code below must be run on your own machine in order to see the interface.

Step 1: Install the required packages

Make sure Python is installed, then use pip to install the dependencies needed for Gradio, OpenAI, and Gemini:
If running on your local machine, execute:
pip install gradio openai google-generativeai
If running in the lab environment, execute:

%pip install gradio==3.50  # only needed when running in the lab, not locally
%pip install -q -U google-generativeai
%pip install anthropic openai
Note: you may need to restart the kernel to use updated packages.
The code below creates a Gradio user interface where you can type a question into a text box and upload up to three images. These inputs are passed to the query_gpt4_vision function, which uses the OpenAI GPT-4 Vision model to generate an answer to the question.
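Concretely, what the UI assembles follows the OpenAI multimodal chat schema: one user message whose content is a list holding a text part plus zero or more image_url parts carrying base64 data URLs. A minimal sketch of just that assembly step (the build_vision_message helper name is ours, not part of the app code):

```python
import base64

def build_vision_message(text, *jpeg_bytes_list):
    """Assemble one user message: a text part followed by one
    image_url part (a base64 data URL) per supplied JPEG payload."""
    content = [{"type": "text", "text": text}]
    for jpeg_bytes in jpeg_bytes_list:
        b64 = base64.b64encode(jpeg_bytes).decode("utf-8")
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
        })
    return {"role": "user", "content": content}

# Two fake JPEG payloads stand in for real encoded images here.
msg = build_vision_message("What is in these photos?", b"\xff\xd8a", b"\xff\xd8b")
print(len(msg["content"]))  # → 3
```

The same message dict can then be passed as `messages=[msg]` to the chat-completions call shown in the next step.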
Step 2: Write the Gradio + GPT-4 Vision app

Write the Gradio app in your Python script. Here is an example that uses the GPT-4 Vision model:
import os
import io
import base64

import gradio as gr
from openai import OpenAI
from PIL import Image
import google.generativeai as genai
import anthropic

googleapi_key = os.getenv('GOOGLE_API_KEY')
# Read the Anthropic key from the environment as well; never hard-code API keys in source.
claude_key = os.getenv('CLAUDE_API_KEY')

# Function to encode the image to base64
def encode_image_to_base64(image):
    buffered = io.BytesIO()
    image.save(buffered, format="JPEG")
    return base64.b64encode(buffered.getvalue()).decode('utf-8')

# Function to query GPT-4 Vision
def query_gpt4_vision(text, image1, image2, image3):
    client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
    messages = [{"role": "user", "content": [{"type": "text", "text": text}]}]
    for image in [image1, image2, image3]:
        if image is not None:
            base64_image = encode_image_to_base64(image)
            image_message = {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}
            }
            messages[0]["content"].append(image_message)
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=messages,
        max_tokens=1024,
    )
    return response.choices[0].message.content

def query_gpt4o_vision(text, image1, image2, image3):
    client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
    messages = [{"role": "user", "content": [{"type": "text", "text": text}]}]
    for image in [image1, image2, image3]:
        if image is not None:
            base64_image = encode_image_to_base64(image)
            image_message = {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}
            }
            messages[0]["content"].append(image_message)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        max_tokens=1024,
    )
    return response.choices[0].message.content

def query_gpt4o_mini_vision(text, image1, image2, image3):
    client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
    messages = [{"role": "user", "content": [{"type": "text", "text": text}]}]
    for image in [image1, image2, image3]:
        if image is not None:
            base64_image = encode_image_to_base64(image)
            image_message = {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}
            }
            messages[0]["content"].append(image_message)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        max_tokens=1024,
    )
    return response.choices[0].message.content

# Function to query Claude 3
def query_claude_vision(text, image1, image2, image3):
    modelConfig = {
        "claude_S": "claude-3-haiku-20240307",
        "claude_M": "claude-3-sonnet-20240229",
        "claude_B": "claude-3-opus-20240229",
    }
    client = anthropic.Anthropic(api_key=claude_key)
    messages = [{"role": "user", "content": [{"type": "text", "text": text}]}]
    for image in [image1, image2, image3]:
        if image is not None:
            base64_image = encode_image_to_base64(image)
            image_message = {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": base64_image,
                }
            }
            messages[0]["content"].append(image_message)
    response = client.messages.create(
        model=modelConfig["claude_M"],
        messages=messages,
        max_tokens=1024,
    )
    return response.content[0].text

# Function to query Gemini
def query_gemini_vision(text, image1, image2, image3):
    GOOGLE_API_KEY = os.getenv('GOOGLE_API_KEY')
    genai.configure(api_key=GOOGLE_API_KEY)
    model = genai.GenerativeModel('gemini-1.5-flash')
    query = [text]
    for image in [image1, image2, image3]:
        if image is not None:
            query.append(image)
    response = model.generate_content(query, stream=False)
    response.resolve()
    return response.text

# Gradio 2.0 and later build interfaces differently; the Blocks API is used here
# to create a more complex UI.
def main():
    with gr.Blocks() as demo:
        gr.Markdown("### Input text")
        input_text = gr.Textbox(lines=2, label="Input text")
        input_images = [
            gr.Image(type="pil", label="Upload Image", tool="editor")
            for i in range(3)]
        output_gpt4 = gr.Textbox(label="GPT-4v-Turbo output")
        output_gemini = gr.Textbox(label="Gemini-Pro output")
        output_claude3 = gr.Textbox(label="Claude3 sonnet output")
        output_gpt4o = gr.Textbox(label="GPT-4o output")
        output_gpt4o_mini = gr.Textbox(label="GPT-4o-mini output")
        btn_gpt4 = gr.Button("Call GPT-4")
        btn_gemini = gr.Button("Call Gemini-Pro")
        btn_claude = gr.Button("Call Claude-3-sonnet")
        btn_gpt4o = gr.Button("Call GPT-4o")
        btn_gpt4o_mini = gr.Button("Call GPT-4o-mini")
        btn_gpt4.click(fn=query_gpt4_vision,
                       inputs=[input_text] + input_images, outputs=output_gpt4)
        btn_gemini.click(fn=query_gemini_vision,
                         inputs=[input_text] + input_images, outputs=output_gemini)
        btn_claude.click(fn=query_claude_vision,
                         inputs=[input_text] + input_images, outputs=output_claude3)
        btn_gpt4o.click(fn=query_gpt4o_vision,
                        inputs=[input_text] + input_images, outputs=output_gpt4o)
        btn_gpt4o_mini.click(fn=query_gpt4o_mini_vision,
                             inputs=[input_text] + input_images,
                             outputs=output_gpt4o_mini)
        demo.launch(share=True)

if __name__ == "__main__":
    main()
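A quick offline sanity check of the encode_image_to_base64 helper used above: any object exposing a save(buffer, format=...) method can stand in for a PIL image, so the base64 round-trip can be verified without Pillow or the network (the FakeImage stub is ours, used only to keep the check dependency-free):

```python
import base64
import io

def encode_image_to_base64(image):
    # Same helper as in the app above: serialize to JPEG in memory,
    # then base64-encode for embedding in a data URL.
    buffered = io.BytesIO()
    image.save(buffered, format="JPEG")
    return base64.b64encode(buffered.getvalue()).decode('utf-8')

class FakeImage:
    """Stand-in with the same save() shape as PIL.Image."""
    def save(self, buffered, format=None):
        buffered.write(b"\xff\xd8fake-jpeg-bytes")

b64 = encode_image_to_base64(FakeImage())
print(base64.b64decode(b64))  # → b'\xff\xd8fake-jpeg-bytes'
```

Decoding the string recovers the exact bytes that were "saved", which is all the data-URL embedding relies on.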
Running on local URL: http://127.0.0.1:7869
IMPORTANT: You are using gradio version 3.50.0, however version 4.29.0 is
available, please upgrade.
--------
Running on public URL: https://9e8402bec3b2e3dec9.gradio.live
This share link expires in 72 hours. For free permanent hosting and GPU
upgrades, run `gradio deploy` from Terminal to deploy to Spaces
(https://huggingface.co/spaces)
Traceback (most recent call last):
  ...
  File "/tmp/ipykernel_13834/3272864961.py", line 79, in query_gpt4o_mini_vision
    response = client.chat.completions.create(
  ...
openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
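The 400 error above is the API rejecting the legacy max_tokens parameter: some newer models accept only max_completion_tokens. One hedged way to cope is a tiny helper that picks the parameter name, leaving the call sites otherwise unchanged (which models require which name is an assumption here; the server's error message is the authoritative signal):

```python
def token_limit_kwargs(limit, use_completion_tokens=False):
    """Return the token-limit argument under the name the model expects.
    Pass use_completion_tokens=True for models that reject 'max_tokens'."""
    key = "max_completion_tokens" if use_completion_tokens else "max_tokens"
    return {key: limit}

# Usage sketch (client and messages as in the functions above):
# response = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=messages,
#     **token_limit_kwargs(1024, use_completion_tokens=True),
# )
print(token_limit_kwargs(1024, use_completion_tokens=True))
# → {'max_completion_tokens': 1024}
```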
Step 3: Run the Gradio app

Save and run your Python script. This launches the Gradio user interface and prints the local URL in the terminal (typically http://127.0.0.1:7860). Open that URL to view and use your Gradio app.

Tips:
Make sure the necessary Python packages are installed on your machine.
If you use OpenAI models, make sure you have set a valid API key.
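Because keys such as OPENAI_API_KEY are read from environment variables, a small startup check avoids a confusing HTTP 401 deep inside a request later on (the require_key helper name is ours):

```python
import os

def require_key(name: str) -> str:
    """Fetch an API key from the environment, failing fast with a
    clear message instead of a late authentication error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Example: validate before building the Gradio UI. The placeholder
# value below is for demonstration only.
os.environ.setdefault("OPENAI_API_KEY", "sk-demo-placeholder")
print(require_key("OPENAI_API_KEY") is not None)
```

Calling require_key for each provider at the top of main() surfaces all missing keys before the interface ever launches.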
Examples

- Dish price estimation: estimate the total price of the dishes on this table, giving a market price for each dish. Assume this is a mid-range restaurant in Beijing.
- Figure understanding: generate a caption for this image.
- Web design: generate the website source code corresponding to the design mock-up below.
- Inference combining vision with knowledge: based on the information in the image, which GPU model might this be? List all plausible GPU models, give your reasoning, and estimate the approximate production years.
- How many points would you award the diving maneuver in this picture? Why? State your scoring criteria.
- Figure 1 below shows a standard template for PCB defect detection, and Figure 2 is a defective board with annotations. Does Figure 3 contain any defects?
import os
import io
import base64

import gradio as gr
from openai import OpenAI
from PIL import Image
import google.generativeai as genai

# Function to encode the image to base64
def encode_image_to_base64(image):
    buffered = io.BytesIO()
    image.save(buffered, format="JPEG")
    return base64.b64encode(buffered.getvalue()).decode('utf-8')

# Function to query GPT-4 Vision with a variable number of mixed inputs
def query_gpt4_vision(*inputs):
    client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
    messages = [{"role": "user", "content": []}]
    for input_item in inputs:
        if isinstance(input_item, str):  # Text input
            messages[0]["content"].append({"type": "text", "text": input_item})
        elif isinstance(input_item, Image.Image):  # Image input
            base64_image = encode_image_to_base64(input_item)
            messages[0]["content"].append({
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}
            })
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=messages,
        max_tokens=1024,
    )
    return response.choices[0].message.content

# Function to query Gemini
def query_gemini_vision(*inputs):
    GOOGLE_API_KEY = os.getenv('GOOGLE_API_KEY')
    genai.configure(api_key=GOOGLE_API_KEY)
    model = genai.GenerativeModel('gemini-1.5-flash')
    query = [item for item in inputs if item is not None]
    response = model.generate_content(query, stream=False)
    response.resolve()
    return response.text

# Gradio 2.0 and later build interfaces differently; the Blocks API is used here
# to create a more complex UI.
def main():
    with gr.Blocks() as demo:
        gr.Markdown("### Input text")
        input_components = []
        for i in range(2):  # Change this number to add more inputs
            input_components.append(gr.Textbox(
                lines=2, placeholder=f"Enter your text input {i+1}..."))
            input_components.append(gr.Image(
                type="pil", label=f"Upload Image {i+1}", tool="editor"))
        output_gpt4 = gr.Textbox(label="GPT-4 output")
        output_other_api = gr.Textbox(label="Gemini-Pro output")
        btn_gpt4 = gr.Button("Call GPT-4")
        btn_other_api = gr.Button("Call Gemini-Pro")
        btn_gpt4.click(fn=query_gpt4_vision,
                       inputs=input_components, outputs=output_gpt4)
        btn_other_api.click(fn=query_gemini_vision,
                            inputs=input_components, outputs=output_other_api)
        demo.launch(share=True)

if __name__ == "__main__":
    main()
Running on local URL: http://127.0.0.1:7866
IMPORTANT: You are using gradio version 3.50.0, however version 4.29.0 is
available, please upgrade.
--------
Running on public URL: https://94ec6eff09667c3fd8.gradio.live
This share link expires in 72 hours. For free permanent hosting and GPU
upgrades, run `gradio deploy` from Terminal to deploy to Spaces
(https://huggingface.co/spaces)
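The *inputs variant above interleaves text boxes and images and sorts them by type at call time. That dispatch logic can be isolated and exercised without any network call (the build_content helper is ours; a str stands in for a text-box value and raw bytes for an encoded image, assumptions made purely to keep the sketch dependency-free):

```python
import base64

def build_content(*inputs):
    """Sort mixed inputs into OpenAI-style content parts: strings
    become text parts, raw bytes become base64 image_url parts,
    and None (an empty widget) is skipped."""
    content = []
    for item in inputs:
        if item is None:
            continue
        if isinstance(item, str):
            content.append({"type": "text", "text": item})
        elif isinstance(item, (bytes, bytearray)):
            b64 = base64.b64encode(item).decode("utf-8")
            content.append({
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
            })
    return content

print(len(build_content("hi", None, b"\xff\xd8", "more")))  # → 3
```

An empty image widget arriving as None is dropped, so a user can fill any subset of the input slots.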
Embodied-AI scenario:

Suppose you are a robot working in a kitchen. The operations you can perform are approach(object coordinates), grasp(object coordinates), and move(start coordinates, end coordinates), where each coordinate is an xy position plus depth z that you must estimate with your vision system. A human gives you instructions, and you must carry out the corresponding operations. For example, the human says: "Take the green package out of the drawer." At that moment, this is the scene you see:

What instruction should you execute next? Give only the instruction, but include concrete coordinate information (assume the current frame has width and height 1, and use your estimated depth and xy offsets).
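The action space described above can be made concrete as typed commands, which is how such a model reply might be parsed downstream (the dataclass names and the normalized-coordinate convention are our assumptions, mirroring the unit-square frame stated in the prompt):

```python
from dataclasses import dataclass

@dataclass
class Point3D:
    x: float  # horizontal offset, frame width normalized to 1
    y: float  # vertical offset, frame height normalized to 1
    z: float  # estimated depth

@dataclass
class Approach:
    target: Point3D

@dataclass
class Grasp:
    target: Point3D

@dataclass
class Move:
    start: Point3D
    end: Point3D

# e.g. a plausible first step for "take the green package out of the drawer":
cmd = Approach(target=Point3D(x=0.42, y=0.65, z=0.8))
print(type(cmd).__name__, cmd.target.x)  # → Approach 0.42
```

Representing replies this way lets the controller validate coordinate ranges before any motor command is issued.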