Store initialization race condition - Table 'langchain_pg_collection' is already defined for this MetaData instance #165
Comments
I'm having the same issue. This seems related to langchain-ai/langchain#14699: [2025-03-11 21:21:00,989: ERROR/ForkPoolWorker-8] Task esp_ai_ext.tasks.task_retriever[378edf3a-ad12-4b1b-a669-c5f93b4d46b7] raised unexpected: InvalidRequestError("Table 'langchain_pg_collection' is already defined for this MetaData instance. Specify 'extend_existing=True' to redefine options and columns on an existing Table object.")
Hey @CJ-Lab7, I have more context about this issue. If you enable table or collection creation, you can hit another race condition, and that one is more likely to occur. We are retrying for now, but we plan to add a mutex around the vectorstore creation. I hope this helps; let me know if you want to know more.
Thanks @MartinGotelli, that helped. I was instantiating the vectorstore in langgraph nodes running in parallel, which caused the issue.
Looks like this won't happen in the new v2. If you still need v1, #209 should do it.
sqlalchemy.exc.InvalidRequestError: Table 'langchain_pg_collection' is already defined for this MetaData instance
This error occurs when you create two PGVector instances at the same time. The models (CollectionStore and EmbeddingStore) are created "dynamically", so two instances can define them concurrently. Because SQLAlchemy keeps a metadata cache of the tables already defined, the second definition collides with the first, which introduces a race condition.
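For anyone who wants to see the underlying exception without PGVector, here is a minimal sketch using plain SQLAlchemy. The table below is only a stand-in that reuses the name from the error (the column set is an assumption, not the real schema), and define_collection_table is a hypothetical helper. It shows that a second definition on the same MetaData is rejected, and that extend_existing=True hands back the already registered Table instead.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.exc import InvalidRequestError

# One shared MetaData, standing in for the metadata the models register on.
metadata = MetaData()


def define_collection_table(**table_kwargs):
    """Hypothetical helper: register a stand-in table on the shared metadata."""
    return Table(
        "langchain_pg_collection",
        metadata,
        Column("uuid", Integer, primary_key=True),  # assumed columns, for illustration
        Column("name", String),
        **table_kwargs,
    )


first = define_collection_table()

# Defining the same table (with columns) a second time is rejected.
try:
    define_collection_table()
    second_error = None
except InvalidRequestError as exc:
    second_error = str(exc)

# With extend_existing=True the existing Table object is returned instead.
reused = define_collection_table(extend_existing=True)
```

In the real race, the two definitions come from two threads rather than two sequential calls, but the exception is the same.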
I have a few suggestions. One is a simple mutex on the _get_embedding_collection_store method. Another is defining the models with extend_existing or keep_existing in their table_args. A third is accepting table args as a parameter and passing them to the models.
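To make the first two suggestions concrete, here is a rough sketch of what a mutex-guarded lazy model factory could look like. This is not the library's code: get_collection_store, _models_lock, and the column set are hypothetical stand-ins, and only a single model is built to keep it short. The lock plus the second check inside it ensures the class is defined once even when many threads ask for it at the same time; extend_existing is set as an extra safety net.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

_models_lock = threading.Lock()
_models = None  # cached model class, built on first use


def get_collection_store():
    """Hypothetical stand-in for a lazy factory like _get_embedding_collection_store."""
    global _models
    if _models is not None:  # fast path once the model exists
        return _models
    with _models_lock:
        if _models is None:  # re-check: another thread may have built it meanwhile
            class CollectionStore(Base):
                __tablename__ = "langchain_pg_collection"
                __table_args__ = {"extend_existing": True}

                uuid = Column(Integer, primary_key=True)  # assumed columns
                name = Column(String)

            _models = CollectionStore
        return _models


# Simulate parallel callers, e.g. nodes that each build a vectorstore.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda _: get_collection_store(), range(32)))
```

Every caller gets the same class object, so the table is registered on the metadata only once.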
What do you think? I can create the PR, but I want to know what you prefer.
I would go with adding table args with "extend_existing" as True
EDIT: There is also a race condition on the table creation; a mutex makes sense there.
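A minimal sketch of that idea, assuming table creation goes through MetaData.create_all. The names create_tables_once and _create_lock are hypothetical, the columns are made up, and an in-memory SQLite engine stands in for PostgreSQL so the snippet runs anywhere. Note that a threading.Lock only serializes threads inside one process; for separate worker processes (like the ForkPoolWorker in the log above) a database-side lock, for example a PostgreSQL advisory lock, would be needed instead.

```python
import threading

from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, inspect

metadata = MetaData()
Table(
    "langchain_pg_collection",
    metadata,
    Column("uuid", Integer, primary_key=True),  # assumed columns, for illustration
    Column("name", String),
)

_create_lock = threading.Lock()


def create_tables_once(engine):
    """Hypothetical helper: serialize the check-then-create step across threads."""
    with _create_lock:
        # create_all checks for existing tables first (checkfirst=True by default),
        # so holding the lock keeps the check and the CREATE TABLE together.
        metadata.create_all(engine)


engine = create_engine("sqlite://")  # in-memory SQLite as a stand-in database
create_tables_once(engine)
create_tables_once(engine)  # a second call finds the table and does nothing
```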