Getting a lot of "service unavailable" errors on gemini-2.0-flash

503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}
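For anyone else hitting this: a 503 UNAVAILABLE is a transient error, so the usual mitigation is to retry with exponential backoff rather than fail immediately. A minimal sketch, where `ServiceUnavailable` is a placeholder for whatever exception type your SDK raises for 503s:

```python
import random
import time


class ServiceUnavailable(Exception):
    """Placeholder for the SDK's 503 UNAVAILABLE error type."""


def with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Call fn(), retrying on 503s with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except ServiceUnavailable:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay * 0.1))
```

You'd wrap the generate call itself, e.g. `with_backoff(lambda: client.models.generate_content(...))`, so users only see the error once retries are exhausted.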

Anyone else having the same issue?

4 Likes

Hey @zlmk - this issue should be resolved now. Please let me know if you’re continuing to see issues

3 Likes

At the moment I am getting persistent 503s when trying to use Gemini 2.0 Flash (I'm able to use Flash Lite).

1 Like

Facing the same error; tested multiple times just now.

Text input with structured output on 2.0 Flash.
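For context, "structured output" here means setting a response MIME type and schema in the generation config of the generateContent request. A minimal sketch of the request body, assuming the v1beta REST field names (the schema itself is just an example):

```json
{
  "contents": [{"parts": [{"text": "List two colors."}]}],
  "generationConfig": {
    "responseMimeType": "application/json",
    "responseSchema": {
      "type": "ARRAY",
      "items": {"type": "STRING"}
    }
  }
}
```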

1 Like

Sorry about that. I believe this should be resolved now - we just pushed another fix. Please continue to let me know if you’re still seeing issues!

2 Likes

Still get 503 errors.

Thanks for flagging, Toby. I’ll take a look

1 Like

still getting the 503 issue

1 Like

Hi @Toby_Hain & @Ahsham_Abdullah,

Are you still facing the 503 error issue?

As of early June, I was still encountering this issue. I'm thinking of upgrading to Google AI Pro and wondered if that would improve the situation.

As of June 23 I am getting 503s from gemini-2.0-flash on us-east1. This wasn't occurring on us-central1, but we had to switch to us-east1 after failing to stand up instances through Cloud Run in central. Cloud Run failed to provision the service last night and kept failing all night, so this morning we made the call to switch to us-east1. Overall, the service recovered, but we are now seeing persistent 503s and some other Gemini errors that were not present in central. @Vishal for visibility

Can you check if this is still an issue? 2.0 Flash was overloaded earlier today, which may have coincided with your switch to us-east1.

I don’t understand why you sell resources to OpenAI when your own services are frequently overloaded. Your APIs are unstable, unreliable, and have far higher latency than OpenAI’s equivalent models.

It is no longer an issue as of right now. I’ll be watching it for the next few days, as we have experienced spikes in 503s and other timeouts in the past. That was the reason for deploying in us-central1 originally, and we would have stuck with it if not for the issues provisioning Cloud Run instances over the last 24 hours. Thank you for the follow-up, Vishal. We are all clear now.

1 Like

OpenAI likely has dedicated resources; the rest of us can get similar guarantees by reserving instances. It’s a tradeoff between cost and availability.

Still happening, guys. This is a serious issue; how can people use it in production like this? Kindly fix it, thanks.

I keep getting it non-stop today… this is a serious problem.

1 Like

Yes, it’s unusable for me today. 2.0 Flash is failing with “model is overloaded,” and my fallback to Flash Lite is failing as well, though I’m not sure if it is overloaded or just not calling the functions correctly. Usually the fallback to the Lite model performs the function calling correctly.
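A fallback chain like the one described can be sketched as below. `generate` stands in for whatever SDK call you use, `ModelOverloaded` is a placeholder for its 503 error type, and the model names are just the ones discussed in this thread:

```python
class ModelOverloaded(Exception):
    """Placeholder for the 503 'model is overloaded' error."""


def generate_with_fallback(generate,
                           models=("gemini-2.0-flash", "gemini-2.0-flash-lite")):
    """Try each model in order, moving to the next only when one is overloaded."""
    last_err = None
    for model in models:
        try:
            return generate(model)
        except ModelOverloaded as err:
            last_err = err  # remember the failure, try the next model
    raise last_err
```

Note that this only catches the overload error; a lite model mis-handling function calls (as described above) would return normally and has to be caught by validating the response instead.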