##### Copyright 2025 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemini API: Error handling

This Colab notebook demonstrates strategies for handling common errors you might encounter when working with the Gemini API:

* **Transient Errors:** Temporary failures due to network issues, server overload, etc.
* **Rate Limits:** Restrictions on the number of requests you can make within a certain timeframe.
* **Timeouts:** When an API call takes too long to complete.

This notebook explores two main approaches:

1. **Automatic retries:** A simple way to retry requests when they fail due to transient errors.
2. **Manual backoff and retry:** A more customizable approach that provides finer control over retry behavior.


**Gemini Rate Limits**

The default rate limits for different Gemini models are outlined in the [Gemini API model documentation](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations). If your application requires a higher quota, consider [requesting a rate limit increase](https://ai.google.dev/gemini-api/docs/quota).

In [None]:
%pip install -q -U "google-generativeai>=0.7.2"

In [None]:
import google.generativeai as genai

### Set up your API key

To run the following cells, store your API key in a Colab Secret named `GOOGLE_API_KEY`. If you don't have an API key or need help creating a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) guide.

In [None]:
from google.colab import userdata

GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")
genai.configure(api_key=GOOGLE_API_KEY)

### Automatic retries

The Gemini API's client library offers built-in retry mechanisms for handling transient errors. To enable them, pass a `retry` object in the `request_options` argument of API calls such as `generate_content`, `generate_answer`, `embed_content`, and `generate_content_async`.

**Advantages:**

* **Simplicity:** Requires minimal code changes for significant reliability gains.
* **Robustness:** Effectively addresses most transient errors without additional logic.

**Customize retry behavior:**

Use these settings in [`retry`](https://googleapis.dev/python/google-api-core/latest/retry.html) to customize retry behavior:

* `predicate`: (callable) Determines if an exception is retryable. Default: [`if_transient_error`](https://github.com/googleapis/python-api-core/blob/main/google/api_core/retry/retry_base.py#L75C4-L75C13)
* `initial`: (float) Initial delay in seconds before the first retry. Default: `1.0`
* `maximum`: (float) Maximum delay in seconds between retries. Default: `60.0`
* `multiplier`: (float) Factor by which the delay increases after each retry. Default: `2.0`
* `timeout`: (float) Total retry duration in seconds. Default: `120.0`
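
To see how `initial`, `multiplier`, and `maximum` interact, the following standard-library sketch computes the upper bound of the delay before each retry. The helper name `backoff_ceilings` and its `attempts` parameter are hypothetical and are not part of `google-api-core`. The sketch also assumes plain exponential growth capped at `maximum`; to my understanding the real library additionally randomizes each delay (jitter), so treat these values as ceilings rather than exact sleep times.

```python
def backoff_ceilings(initial=1.0, maximum=60.0, multiplier=2.0, attempts=8):
    """Return the largest possible delay before each of the first `attempts` retries.

    Hypothetical helper for illustration only; it mirrors the default
    values listed above but is not the library's implementation.
    """
    ceilings = []
    delay = min(initial, maximum)
    for _ in range(attempts):
        ceilings.append(delay)
        # Grow the delay geometrically, but never beyond `maximum`.
        delay = min(delay * multiplier, maximum)
    return ceilings


schedule = backoff_ceilings()
print(schedule)  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

With the defaults, the delay ceiling doubles until it reaches the 60-second cap, after which every further retry waits at most 60 seconds.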

In [None]:
from google.api_core import retry

model = genai.GenerativeModel("gemini-2.0-flash")
prompt = "Write a story about a magic backpack."

model.generate_content(
    prompt, request_options={"retry": retry.Retry(predicate=retry.if_transient_error)}
)

response:
GenerateContentResponse(
 done=True,
 iterator=None,
 result=protos.GenerateContentResponse({
 "candidates": [
 {
 "content": {
 "parts": [
 {
 "text": "Flora found it nestled between a chipped ceramic gnome and a stack of moth-eaten doilies at Mrs. Higgins' \"Everything Must Go\" yard sale. It was a simple, canvas backpack, faded green with worn leather straps. Nothing outwardly magical, but a faint shimmer, like heat rising off asphalt, clung to it. Mrs. Higgins, a woman who smelled perpetually of lavender and forgotten secrets, just winked and said, \"That one's got stories, dear. Take good care of 'em.\"\n\nFlora, a daydreamer with a penchant for adventure, paid the dollar without hesitation. She loved stories. She loved adventures even more.\n\nThe first hint of magic came the next day. Flora, late for her history class, rummaged through the backpack for her textbook. Instead of a heavy tome, she pulled out a perfectly ripe, crimson apple. She blinked. Maybe she'd packed

### Manually increase timeout when responses take time

A `ReadTimeout` or `DeadlineExceeded` error means that an API call exceeded the default timeout (600 seconds). To allow more time, set `timeout` (in seconds) in the `request_options` argument.

In [None]:
model = genai.GenerativeModel("gemini-2.0-flash")
prompt = "Write a story about a magic backpack."

model.generate_content(
    prompt, request_options={"timeout": 900}
)  # Increase timeout to 15 minutes

response:
GenerateContentResponse(
 done=True,
 iterator=None,
 result=protos.GenerateContentResponse({
 "candidates": [
 {
 "content": {
 "parts": [
 {
 "text": "Maya hated school. Not the learning part, surprisingly, but the social minefield of the cafeteria, the stale air in the hallways, and the constant pressure to fit in. So, when her eccentric Aunt Clara, a woman known for her mismatched socks and pronouncements like \"The moon is made of cheese! Don't let anyone tell you otherwise!\", gifted her a battered, leather backpack, Maya wasn't exactly thrilled.\n\n\"It's... vintage,\" Aunt Clara had said, her eyes twinkling. \"And a little bit... special.\"\n\nSpecial turned out to be an understatement. On her first day carrying it, Maya reached inside for a pencil and instead pulled out a perfectly ripe mango. Baffled, she rummaged again and found a tiny, intricately carved wooden bird. The next day, it was a feather that shimmered with iridescent colours.\n\nThe backpack, she realiz

**Caution:** Increasing timeouts can help, but avoid setting them too high: a long timeout delays error detection and can waste resources.

### Manually implement backoff and retry with error handling

For finer control over retry behavior and error handling, you can apply the [`retry`](https://googleapis.dev/python/google-api-core/latest/retry.html) module from `google-api-core` as a decorator on your own function (or use similar libraries like [`backoff`](https://pypi.org/project/backoff/) and [`tenacity`](https://tenacity.readthedocs.io/en/latest/)). This gives you precise control over retry strategies and allows you to handle specific types of errors differently.

In [None]:
from google.api_core import retry, exceptions

model = genai.GenerativeModel("gemini-2.0-flash")


@retry.Retry(
    predicate=retry.if_transient_error,
    initial=2.0,
    maximum=64.0,
    multiplier=2.0,
    timeout=600,
)
def generate_with_retry(model, prompt):
    response = model.generate_content(prompt)
    return response


prompt = "Write a one-liner advertisement for magic backpack."

generate_with_retry(model=model, prompt=prompt)

response:
GenerateContentResponse(
 done=True,
 iterator=None,
 result=protos.GenerateContentResponse({
 "candidates": [
 {
 "content": {
 "parts": [
 {
 "text": "Unzip the impossible with the Magic Backpack \u2013 adventures await!\n"
 }
 ],
 "role": "model"
 },
 "finish_reason": "STOP",
 "avg_logprobs": -0.7733027384831355
 }
 ],
 "usage_metadata": {
 "prompt_token_count": 10,
 "candidates_token_count": 13,
 "total_token_count": 23
 },
 "model_version": "gemini-2.0-flash"
 }),
)
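
To make the decorator's behavior more concrete, here is a standard-library sketch of a backoff-and-retry loop. It is an approximation under stated assumptions, not the `google-api-core` implementation: `TransientError`, `call_with_backoff`, and the injectable `sleep` and `clock` parameters are hypothetical names introduced for illustration, and the real `retry.Retry` may differ in details such as how it reports an exhausted timeout.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as a 503."""


def call_with_backoff(func, *, initial=2.0, maximum=64.0, multiplier=2.0,
                      timeout=600.0, sleep=time.sleep, clock=time.monotonic):
    """Call `func`, retrying on TransientError with jittered exponential backoff."""
    deadline = clock() + timeout
    ceiling = min(initial, maximum)
    while True:
        try:
            return func()
        except TransientError:
            # Pick a random delay up to the current ceiling (jitter).
            delay = random.uniform(0.0, ceiling)
            if clock() + delay > deadline:
                raise  # The time budget is spent: surface the last error.
            sleep(delay)
            # Raise the ceiling for the next retry, capped at `maximum`.
            ceiling = min(ceiling * multiplier, maximum)


# Usage: a function that fails twice before succeeding.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("temporarily unavailable")
    return "ok"


slept = []  # Record the delays instead of actually sleeping.
result = call_with_backoff(flaky, sleep=slept.append)
print(result, attempts["count"], len(slept))
```

Injecting `sleep` keeps the example fast and makes the chosen delays observable. In a real application you would leave it at the default `time.sleep`.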

### Test the error handling with retry mechanism

To validate that your error handling and retry mechanism work as intended, define a function (`generate_content_first_fail` below) that deliberately raises a `ServiceUnavailable` error on its first call. This setup lets you confirm that the retry decorator catches the transient error and retries the operation.

In [None]:
from google.api_core import retry, exceptions


@retry.Retry(
    predicate=retry.if_transient_error,
    initial=2.0,
    maximum=64.0,
    multiplier=2.0,
    timeout=600,
)
def generate_content_first_fail(model, prompt):
    # Track the number of calls in a function attribute.
    if not hasattr(generate_content_first_fail, "call_counter"):
        generate_content_first_fail.call_counter = 0

    generate_content_first_fail.call_counter += 1

    try:
        # Simulate a transient failure on the first call only.
        if generate_content_first_fail.call_counter == 1:
            raise exceptions.ServiceUnavailable("Service Unavailable")

        response = model.generate_content(prompt)
        return response.text
    except exceptions.ServiceUnavailable as e:
        print(f"Error: {e}")
        raise  # Re-raise so that the retry decorator can handle it.


model = genai.GenerativeModel("gemini-2.0-flash")
prompt = "Write a one-liner advertisement for magic backpack."

generate_content_first_fail(model=model, prompt=prompt)

Error: 503 Service Unavailable


'Carry anything, anywhere, with the backpack that defies reality!\n'
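
The same first-call-fails check can also be run without calling the API by using a test double. The sketch below uses `unittest.mock.Mock` with `side_effect` to script the failures, and also checks that a non-retryable error is not retried. All names here (`FakeServiceUnavailable`, `FakeInvalidArgument`, `is_transient`, `retry_call`) are hypothetical stand-ins written for this example; they are not part of `google-api-core`, and the loop omits backoff delays to keep the focus on the predicate.

```python
from unittest import mock


class FakeServiceUnavailable(Exception):
    """Hypothetical stand-in for a transient 503-style error."""


class FakeInvalidArgument(Exception):
    """Hypothetical stand-in for a non-retryable 400-style error."""


def is_transient(exc):
    """Predicate: retry only the errors considered transient."""
    return isinstance(exc, FakeServiceUnavailable)


def retry_call(func, predicate, max_attempts=3):
    """Call `func`, retrying while `predicate(exc)` is true (no delays)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on non-retryable errors or when attempts run out.
            if not predicate(exc) or attempt == max_attempts:
                raise


# The first call raises a transient error, the second call succeeds.
flaky = mock.Mock(side_effect=[FakeServiceUnavailable("503"), "story text"])
result = retry_call(flaky, is_transient)

# A non-retryable error should surface after a single call.
broken = mock.Mock(side_effect=FakeInvalidArgument("400"))
try:
    retry_call(broken, is_transient)
    surfaced = None
except FakeInvalidArgument as exc:
    surfaced = exc

print(result, flaky.call_count, broken.call_count)
```

Counting calls on the mock replaces the function-attribute counter used above and keeps the test independent of network access and quota.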