{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "id": "ijGzTHJJUCPY" }, "outputs": [], "source": [ "# Copyright 2024 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "JipjpNFHZ1NZ" }, "source": [ "# Imagen 3 Customized Images\n", "\n", "\n", " \n", " \n", " \n", " \n", "
\n", " \n", " \"Google
Run in Colab\n", "
\n", "
\n", " \n", " \"Google
Run in Colab Enterprise\n", "
\n", "
\n", " \n", " \"Vertex
Open in Vertex AI Workbench\n", "
\n", "
\n", " \n", " \"GitHub
View on GitHub\n", "
\n", "
\n", "\n", "
\n", "\n", "Share to:\n", "\n", "\n", " \"LinkedIn\n", "\n", "\n", "\n", " \"Bluesky\n", "\n", "\n", "\n", " \"X\n", "\n", "\n", "\n", " \"Reddit\n", "\n", "\n", "\n", " \"Facebook\n", "" ] }, { "cell_type": "markdown", "metadata": { "id": "G1KDmM_PBAXz" }, "source": [ "| Author |\n", "| --- |\n", "| [Katie Nguyen](https://github.com/katiemn) |" ] }, { "cell_type": "markdown", "metadata": { "id": "2c69e7975c15" }, "source": [ "## Overview\n", "\n", "### Imagen 3\n", "\n", "Imagen 3 on Vertex AI brings Google's state of the art generative AI capabilities to application developers. Imagen 3 is Google's highest quality text-to-image model to date. It's capable of creating images with astonishing detail. Thus, developers have more control when building next-generation AI products that transform their imagination into high quality visual assets. Learn more about [Imagen on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview).\n", "\n", "In this tutorial, you will learn how to use the Google Gen AI SDK for Python to generate customized images using few-shot learning with Imagen 3. 
You'll supply a text prompt and reference images to guide new image generation in the following ways:\n", "\n", "- Subject customization\n", "- Style transfer\n", "- Style customization\n", "- Controlled customization\n", "  - Canny edge\n", "  - Scribble" ] }, { "cell_type": "markdown", "metadata": { "id": "r11Gu7qNgx1p" }, "source": [ "## Get started\n" ] }, { "cell_type": "markdown", "metadata": { "id": "No17Cw5hgx12" }, "source": [ "### Install Google Gen AI SDK for Python\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tFy3H3aPgx12" }, "outputs": [], "source": [ "%pip install --upgrade --quiet google-genai ipycanvas ipywidgets" ] }, { "cell_type": "markdown", "metadata": { "id": "dmWOrTJ3gx13" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "If you are running this notebook on Google Colab, run the following cell to authenticate your environment.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "NyKGtVQjgx13" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "    # Support for third-party widgets\n", "    from google.colab import auth, output\n", "\n", "    auth.authenticate_user()\n", "    output.enable_custom_widget_manager()" ] }, { "cell_type": "markdown", "metadata": { "id": "Ua6PDqB1iBSb" }, "source": [ "### Import libraries" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "id": "yTiDo0lRh6sc" }, "outputs": [], "source": [ "import urllib.request\n", "\n", "from IPython.display import display\n", "from PIL import Image as PIL_Image\n", "from google import genai\n", "from google.genai.types import (\n", "    ControlReferenceConfig,\n", "    ControlReferenceImage,\n", "    EditImageConfig,\n", "    GenerateImagesConfig,\n", "    Image,\n", "    RawReferenceImage,\n", "    StyleReferenceConfig,\n", "    StyleReferenceImage,\n", "    SubjectReferenceConfig,\n", "    SubjectReferenceImage,\n", ")\n", "from ipycanvas import Canvas\n", "from ipywidgets import Button\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": { "id": "DF4l8DTdWgPY" }, "source": [ "### Set Google Cloud project information and create client\n", "\n", "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n", "\n", "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "Nqwi-5ufWp_B" }, "outputs": [], "source": [ "import os\n", "\n", "PROJECT_ID = \"[your-project-id]\"  # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n", "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", "    PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n", "\n", "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n", "\n", "client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)" ] }, { "cell_type": "markdown", "metadata": { "id": "b84bdfdd7ed3" }, "source": [ "### Define a helper function to display images" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "f936014f357d" }, "outputs": [], "source": [ "def display_images(generated_image, ref_image) -> None:\n", "    fig, axis = plt.subplots(1, 2, figsize=(12, 6))\n", "    axis[0].imshow(generated_image)\n", "    axis[0].set_title(\"Imagen 3\")\n", "    axis[1].imshow(ref_image)\n", "    axis[1].set_title(\"Reference Image\")\n", "    for ax in axis:\n", "        ax.axis(\"off\")\n", "    plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "VLmwIj2RD0Fx" }, "source": [ "### Load the image models\n", "\n", "Imagen 3 Customization: `imagen-3.0-capability-001`\n", "\n", "Imagen 3 Generation: `imagen-3.0-generate-002`" ] }, { "cell_type": "code", 
"execution_count": 6, "metadata": { "id": "F-gd2ypQhh7K" }, "outputs": [], "source": [ "customization_model = \"imagen-3.0-capability-001\"\n", "generation_model = \"imagen-3.0-generate-002\"" ] }, { "cell_type": "markdown", "metadata": { "id": "bf69a0f12947" }, "source": [ "### Subject customization\n", "\n", "Few-shot prompting with Imagen 3 Customization supports the following subjects: animal companion, person, and product. The following example demonstrates how to customize images of a person. To do this create 1 - 4 `SubjectReferenceImage` objects with the `subject_type` set to SUBJECT_TYPE_PERSON. For subject customization involving people you can also create a ```ControlReferenceImage``` with the `control_type` set to CONTROL_TYPE_FACE_MESH. Passing in a control reference image will allow you to better specify the face position in the edited image.\n", "\n", "Next, draft a prompt for the new image and make sure to reference the subject in the prompt by including brackets with the reference id you assigned to each reference image.\n", "\n", "When generating images you can also set the `safety_filter_level` and `person_generation` parameters accordingly:\n", "* `person_generation`: \n", " * `DONT_ALLOW`, `ALLOW_ADULT`,`ALLOW_ALL`\n", "* `safety_filter_level`:\n", " * `BLOCK_LOW_AND_ABOVE`, `BLOCK_MEDIUM_AND_ABOVE`, `BLOCK_ONLY_HIGH`, `BLOCK_NONE`" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "8pyAJlvQsocc" }, "outputs": [], "source": [ "subject_image = Image(gcs_uri=\"gs://cloud-samples-data/generative-ai/image/person.png\")\n", "\n", "subject_reference_image = SubjectReferenceImage(\n", " reference_id=1,\n", " reference_image=subject_image,\n", " config=SubjectReferenceConfig(\n", " subject_description=\"a headshot of a woman\", subject_type=\"SUBJECT_TYPE_PERSON\"\n", " ),\n", ")\n", "\n", "control_reference_image = ControlReferenceImage(\n", " reference_id=2,\n", " reference_image=subject_image,\n", " 
config=ControlReferenceConfig(control_type=\"CONTROL_TYPE_FACE_MESH\"),\n", ")\n", "\n", "prompt = \"a portrait of a woman[1] in the pose of the control image[2]in a watercolor style by a professional artist, light and low-contrast stokes, bright pastel colors, a warm atmosphere, clean background, grainy paper, bold visible brushstrokes, patchy details\"\n", "\n", "image = client.models.edit_image(\n", " model=customization_model,\n", " prompt=prompt,\n", " reference_images=[subject_reference_image, control_reference_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_DEFAULT\",\n", " number_of_images=1,\n", " seed=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "image_show = PIL_Image.open(\n", " urllib.request.urlopen(\n", " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/person.png\"\n", " )\n", ")\n", "\n", "display_images(image.generated_images[0].image._pil_image, image_show)" ] }, { "cell_type": "markdown", "metadata": { "id": "c8e6615908c9" }, "source": [ "### Style transfer\n", "\n", "You can also transfer image styles with Imagen 3 Customization. This entails recreating reference images in a new style based on your text prompt." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "iCB3dpf4x4wA" }, "outputs": [], "source": [ "image = Image(gcs_uri=\"gs://cloud-samples-data/generative-ai/image/teacup-1.png\")\n", "\n", "raw_ref_image = RawReferenceImage(reference_image=image, reference_id=1)\n", "\n", "prompt = \"transform the subject in the image so that the teacup[1] is made entirely out of chocolate\"\n", "\n", "style_image = client.models.edit_image(\n", " model=customization_model,\n", " prompt=prompt,\n", " reference_images=[raw_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_DEFAULT\",\n", " number_of_images=1,\n", " seed=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "image_show = PIL_Image.open(\n", " urllib.request.urlopen(\n", " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/teacup-1.png\"\n", " )\n", ")\n", "\n", "display_images(style_image.generated_images[0].image._pil_image, image_show)" ] }, { "cell_type": "markdown", "metadata": { "id": "4412206fd68a" }, "source": [ "### Style customization\n", "\n", "With style customization, you can add reference images and craft a text prompt to transfer the style of the referenced images to new images. You can do this by creating ```StyleReferenceImage``` objects." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "id": "COc7UoSSPR_9" }, "outputs": [], "source": [ "style_image = Image(gcs_uri=\"gs://cloud-samples-data/generative-ai/image/neon.png\")\n", "\n", "style_reference_image = StyleReferenceImage(\n", " reference_id=1,\n", " reference_image=style_image,\n", " config=StyleReferenceConfig(style_description=\"neon sign\"),\n", ")\n", "\n", "prompt = \"generate an image of a neon sign [1] with the words: have a great day\"\n", "\n", "style_customization = client.models.edit_image(\n", " model=customization_model,\n", " prompt=prompt,\n", " reference_images=[style_reference_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_DEFAULT\",\n", " number_of_images=1,\n", " seed=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "image_show = PIL_Image.open(\n", " urllib.request.urlopen(\n", " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/neon.png\"\n", " )\n", ")\n", "\n", "display_images(style_customization.generated_images[0].image._pil_image, image_show)" ] }, { "cell_type": "markdown", "metadata": { "id": "BZNYxuRFaacD" }, "source": [ "### Controlled customization\n", "\n", "Controlled customization allows you to turn sketches into fully realized images. 
Imagen 3 Controlled Customization creates new images from a control signal extracted from a source image, such as a Canny edge map or a scribble.\n", "\n", "#### Canny edge\n", "\n", "Generate a new image with Imagen 3, apply the Canny edge filter to it, and then use the resulting edge map as a `ControlReferenceImage` to generate a new image.\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "id": "fn-0mjVLaTWs" }, "outputs": [], "source": [ "import cv2\n", "\n", "generation_prompt = \"\"\"\n", "a simple accent chair in a neutral color\n", "\"\"\"\n", "generated_image = client.models.generate_images(\n", "    model=generation_model,\n", "    prompt=generation_prompt,\n", "    config=GenerateImagesConfig(\n", "        number_of_images=1,\n", "        aspect_ratio=\"1:1\",\n", "        safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", "        person_generation=\"DONT_ALLOW\",\n", "    ),\n", ")\n", "\n", "generated_image.generated_images[0].image.save(\"chair.png\")\n", "img = cv2.imread(\"chair.png\")\n", "\n", "# Set the Canny threshold values\n", "t_lower = 100  # Lower threshold\n", "t_upper = 150  # Upper threshold\n", "\n", "# Apply the Canny edge filter\n", "edge = cv2.Canny(img, t_lower, t_upper)\n", "cv2.imwrite(\"chair_edge.png\", edge)\n", "\n", "control_reference_image = ControlReferenceImage(\n", "    reference_id=1,\n", "    reference_image=Image.from_file(location=\"chair_edge.png\"),\n", "    config=ControlReferenceConfig(control_type=\"CONTROL_TYPE_CANNY\"),\n", ")\n", "\n", "edit_prompt = \"A photorealistic image along the lines of a navy suede accent chair in a living room, near big windows\"\n", "\n", "control_image = client.models.edit_image(\n", "    model=customization_model,\n", "    prompt=edit_prompt,\n", "    reference_images=[control_reference_image],\n", "    config=EditImageConfig(\n", "        edit_mode=\"EDIT_MODE_CONTROLLED_EDITING\",\n", "        number_of_images=1,\n", "        safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", "        person_generation=\"DONT_ALLOW\",\n", "    ),\n", ")\n", "\n", "fig, axis = plt.subplots(1, 3, figsize=(12, 6))\n", 
"axis[0].imshow(generated_image.generated_images[0].image._pil_image)\n", "axis[0].set_title(\"Original Image\")\n", "axis[1].imshow(edge, cmap=\"gray\")\n", "axis[1].set_title(\"Canny Edge\")\n", "axis[2].imshow(control_image.generated_images[0].image._pil_image)\n", "axis[2].set_title(\"Edited Image\")\n", "for ax in axis:\n", " ax.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "uL8dmg4Udihr" }, "source": [ "#### Scribble\n", "\n", "In order to use Controlled Customization with a scribble source image you need to supply a control image with a black background and white lines. Run the following cells to generate a canvas where you can draw a scribble image with your mouse. Once complete, click the 'Save Image' button to save your scribble image locally. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "9aIstQfkZVdp" }, "outputs": [], "source": [ "# Create the Canvas\n", "canvas = Canvas(width=400, height=300, sync_image_data=True)\n", "canvas.fill_style = \"black\"\n", "canvas.fill_rect(0, 0, canvas.width, canvas.height)\n", "display(canvas)\n", "\n", "# Define drawing state\n", "stroke_color = \"white\"\n", "line_width = 3\n", "drawing = False\n", "last_x, last_y = None, None\n", "\n", "# Define event handlers\n", "\n", "\n", "def handle_mouse_down(x, y):\n", " global drawing, last_x, last_y\n", " drawing = True\n", " last_x, last_y = x, y\n", " canvas.begin_path()\n", " canvas.move_to(x, y)\n", " canvas.stroke_style = stroke_color\n", " canvas.line_width = line_width\n", "\n", "\n", "def handle_mouse_move(x, y):\n", " global drawing, last_x, last_y\n", " if drawing:\n", " canvas.line_to(x, y)\n", " canvas.stroke()\n", " last_x, last_y = x, y\n", "\n", "\n", "def handle_mouse_up(x, y):\n", " global drawing, last_x, last_y\n", " drawing = False\n", " last_x, last_y = None, None\n", "\n", "\n", "def clear_canvas(b):\n", " canvas.clear()\n", " canvas.fill_style = \"black\"\n", " canvas.fill_rect(0, 0, 
canvas.width, canvas.height)\n", "\n", "\n", "def save_canvas(b):\n", " canvas.to_file(\"scribble.png\")\n", "\n", "\n", "canvas.on_mouse_down(handle_mouse_down)\n", "canvas.on_mouse_move(handle_mouse_move)\n", "canvas.on_mouse_up(handle_mouse_up)\n", "\n", "clear_button = Button(description=\"Clear Canvas\")\n", "clear_button.on_click(clear_canvas)\n", "display(clear_button)\n", "\n", "save_button = Button(description=\"Save Image\")\n", "save_button.on_click(save_canvas)\n", "display(save_button)" ] }, { "cell_type": "markdown", "metadata": { "id": "_Pal9zB5eWAg" }, "source": [ "Supply your scribble image as a ```ControlReferenceImage``` and create a new image by calling ```edit_image```." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "id": "H6sc6oFosxlg" }, "outputs": [], "source": [ "control_ref_image = ControlReferenceImage(\n", " reference_id=1,\n", " reference_image=Image.from_file(location=\"scribble.png\"),\n", " config=ControlReferenceConfig(control_type=\"CONTROL_TYPE_SCRIBBLE\"),\n", ")\n", "\n", "prompt = \"a beach ball on the sand\"\n", "control_image = client.models.edit_image(\n", " model=customization_model,\n", " prompt=prompt,\n", " reference_images=[control_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_CONTROLLED_EDITING\",\n", " number_of_images=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"DONT_ALLOW\",\n", " ),\n", ")\n", "\n", "display_images(\n", " control_image.generated_images[0].image._pil_image, PIL_Image.open(\"scribble.png\")\n", ")" ] } ], "metadata": { "colab": { "name": "imagen3_customization.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }