Merging images with the API

rasmus.jensen · July 17, 2025, 6:40am

I am trying to merge two images.
I have looked through all of _j’s posts, and nothing has worked for me.
There are three things I need:

Both images are PNGs and are stored locally.
I would prefer chatgpt-4o or o3 as the model I call initially, for better text understanding.
I would like the image model to be gpt-image-1, not DALL-E, again because of the quality difference.

Now to the problems:
I get a tool error (function not defined!) even though the image-generation tool should work!

base64_image1 = encode_image(img1)
base64_image2 = encode_image(img2)
response = client.chat.completions.create(
    model="chatgpt-4o-latest",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{base64_image1}"},
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{base64_image2}"},
                },
            ],
        }
    ],
    modalities=["image"],
    tools=[{"type": "image_generation", "quality": "high"}],
)

jeffvpace · July 17, 2025, 7:09am

I would like the image model to be gpt-image-1

Your image model is not “gpt-image-1”. You’re showing “chatgpt-4o-latest”. I don’t think image_generation is a tool…

Look here: https://platform.openai.com/docs/api-reference/images/createEdit

rasmus.jensen · July 17, 2025, 7:12am

Yes! I just had to update my OpenAI package; after that, images.edit accepted multiple images again!
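
For anyone landing here later, a minimal sketch of the images.edit approach described above. It assumes a recent version of the openai Python package and that `client` is an `openai.OpenAI()` instance; the function name `merge_images` and the default output path are made up for illustration.

```python
import base64
from pathlib import Path


def merge_images(client, image_paths, prompt, out_path="merged.png"):
    """Send several local PNGs to the Images API edit endpoint and save the result.

    `client` is assumed to be an `openai.OpenAI()` instance from a recent SDK
    version (the thread suggests older versions rejected multiple images).
    """
    files = [open(p, "rb") for p in image_paths]
    try:
        result = client.images.edit(
            model="gpt-image-1",
            image=files,  # a list of open file objects, one per input image
            prompt=prompt,
        )
    finally:
        for f in files:
            f.close()

    # gpt-image-1 returns the picture as base64 text in data[0].b64_json
    png_bytes = base64.b64decode(result.data[0].b64_json)
    Path(out_path).write_bytes(png_bytes)
    return out_path
```

Called as `merge_images(OpenAI(), ["a.png", "b.png"], "Combine both pictures into one scene")`, this should write the merged picture to merged.png, provided the account has access to gpt-image-1.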

aprendendo.next · July 17, 2025, 11:24am

The Chat Completions API doesn’t support the image generation tool.

For that, you need to use either the Responses API or the Images API.

If you need multi-turn, the Responses API is recommended. If it is just a one-shot request, the Images API should be enough.
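
A rough sketch of the Responses API route, in case it helps. It assumes a recent openai Python package, that `client` is an `openai.OpenAI()` instance, and that the chosen chat model supports the image_generation tool ("gpt-4o" is used here as an assumption). The helper names `to_data_url` and `merge_with_responses` are made up for illustration.

```python
import base64


def to_data_url(path):
    """Read a local PNG and return it as a base64 data URL."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:image/png;base64,{encoded}"


def merge_with_responses(client, image_paths, prompt, model="gpt-4o"):
    """Ask a chat model on the Responses API to call the image generation tool.

    Returns a list of PNG byte strings, one per generated image.
    """
    content = [{"type": "input_text", "text": prompt}]
    content += [
        {"type": "input_image", "image_url": to_data_url(p)} for p in image_paths
    ]
    response = client.responses.create(
        model=model,
        input=[{"role": "user", "content": content}],
        tools=[{"type": "image_generation", "quality": "high"}],
    )
    # Generated images come back as output items of type "image_generation_call"
    return [
        base64.b64decode(item.result)
        for item in response.output
        if item.type == "image_generation_call"
    ]
```

The returned list can be empty if the chat model decides to answer in text instead of calling the tool, so check its length before writing files.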

_j · July 17, 2025, 2:32pm

The fault, using internal tools on the wrong endpoint, lies in the documentation: you must select either “Chat Completions” or “Responses” in a drop-down at the upper right, and even then the sections of documentation that are copied and modified for each endpoint, and that switch on you, may not be ideal.

You do not need better understanding from an interloper AI in front of a tool to make images. With gpt-image-1 as a tool on the Responses API endpoint, the chat model has little impact anyway, because the image model behind the tool is given significant chat context, including many images from the chat. You must pay for vision on those images, and you also pay when past images beyond your target ones are sent as inputs to the tool. The tool doesn’t run solely on AI-written prompting.

Do you want to chat and occasionally ask for an image? or
Do you want to develop an image tool that specializes in delivering the best results?

The latter, using the edits endpoint, is more focused: you can observe exactly what is sent as input to receive your product, and you can specify parameters directly. Chat only amplifies your costs per image.

Using the gpt-image-1 model in any capacity requires you to submit to ID verification, with scans of a government ID and videos of yourself submitted to an unreliable third party.
