Trying to train an AutoML Vision classification model in Vertex AI, but every time I start training I get:
"Training pipeline failed with error message: Internal error occurred. Please retry in a few minutes."
Tried different datasets, model names, and regions (europe-west4, us-central1); same error every time.
Anyone else experiencing this? Could this be related to the current GCE C3 VM issues?
Hi @stanley2001,
Welcome to Google Cloud Community!
The internal error message you're getting suggests that there could be an infrastructure issue, which is often tied to the underlying VM or hardware resources used for model training. Here are a few troubleshooting steps you can try to resolve the problem:
1. General Troubleshooting:
2. Potential GCE C3 VM-Related Issues: While the generic "Internal error" doesn't definitively point to C3 VM problems, here's what to consider:
3. Data Issues
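Since the error message itself asks you to retry in a few minutes, one option is to wrap the training call in a retry loop with an increasing delay. The snippet below is only a minimal sketch in plain Python: `start_training` is a hypothetical callable standing in for whatever code launches your training job, and the assumption that it raises `RuntimeError` on failure, the attempt count, and the delays are all illustrative rather than taken from the Vertex AI documentation.

```python
import time


def retry_with_backoff(start_training, max_attempts=4, base_delay_s=120.0,
                       sleep=time.sleep):
    """Call start_training() until it succeeds or attempts run out.

    start_training is a hypothetical zero-argument callable that launches
    the training job and raises RuntimeError when the job fails (assumption).
    The delay doubles after each failed attempt: 120 s, 240 s, 480 s, ...
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return start_training()
        except RuntimeError as err:
            last_error = err
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts - 1:
                sleep(base_delay_s * 2 ** attempt)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

If every attempt fails, the last error is re-raised, so its message can be copied into a support request. The `sleep` parameter exists only so the waiting can be swapped out when trying the function locally.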
If the issue persists, you can reach out to Google Cloud Support. When reaching out, include detailed information and relevant screenshots of the errors you’ve encountered. This will assist them in diagnosing and resolving your issue more efficiently.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.
Thank you for your reply. I will take a look and see if I can fix it.