docs: add sample for getting started with BQML #141
Here is the summary of changes. You are about to add 1 region tag.
This comment is generated by snippet-bot.
# When writing a DataFrame to a BigQuery table, include destination table
# and parameters, index defaults to "True".
This comment has nothing to do with BigQuery ML models. Please fix.
Note: The important thing here is that we're taking our trained model and writing it to a permanent location.
Agreed, corrected.
FYI: The comment is still talking about tables not models. I'll make a comment with a suggested edit.
@@ -0,0 +1,13 @@
# Copyright 2023 Google LLC
We can remove this file for now. Let's do a separate PR for the K-Means tutorials.
Corrected.
This file still needs to be deleted.
# The model.fit() call above created a temporary model.
# Use the to_gbq() method to write to a permanent location.
model.to_gbq("bqml_tutorial.sample_model", replace=True)
FYI: We're getting the following failure in our test suite:
    E google.api_core.exceptions.BadRequest: 400 Concurrent update on same model: bigframes-dev:bqml_tutorial.sample_model is not supported. Share your usecase with the BigQuery DataFrames team at the https://bit.ly/bigframes-feedback survey.
Log: https://fusion2.corp.google.com/invocations/8a7513c8-e7c9-4b5b-82fe-9a83c176fbc1/targets/bigframes%2Fpresubmit%2Fe2e/log
I think we'll need a test fixture for this to create a temporary place for the model and clean it up when the test finishes.
- Create a file called samples/snippets/conftest.py.
- In the conftest.py file you create, add a fixture called random_model_id, similar to this one: https://github.com/googleapis/python-bigquery/blob/f804d639fe95bef5d083afe1246d756321128b05/samples/snippets/conftest.py#L101-L111 except it'll call delete_model(...) instead of delete_table(...).
  You'll also need to add "prefixer" https://github.com/googleapis/python-bigquery/blob/f804d639fe95bef5d083afe1246d756321128b05/samples/snippets/conftest.py#L21 and the bigquery_client fixture https://github.com/googleapis/python-bigquery/blob/f804d639fe95bef5d083afe1246d756321128b05/samples/snippets/conftest.py#L33-L36
- Update your code sample to use the new random_model_id fixture.
  Look how we do it in the remote functions test:
  python-bigquery-dataframes/samples/snippets/remote_function.py
  Lines 17 to 23 in 81125f9
      your_gcp_project_id = project_id
      # [START bigquery_dataframes_remote_function]
      import bigframes.pandas as bpd
      # Set BigQuery DataFrames options
      bpd.options.bigquery.project = your_gcp_project_id
  In this sample, that means setting
      your_model_id = random_model_id
  and calling
      model.to_gbq(
          your_model_id,  # "project.dataset.model_id" or "dataset.model_id"
          replace=True,
      )
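For reference, the fixture described in the steps above could look roughly like the sketch below. It is not the code from the linked conftest.py. The helper make_model_name is a hypothetical stand-in for the "prefixer" object, and the client is only assumed to expose delete_model(model, not_found_ok=True), as google.cloud.bigquery.Client does. The generator is left undecorated so it runs without pytest installed; in conftest.py it would carry @pytest.fixture and receive bigquery_client, project_id, and dataset_id from other fixtures.

```python
import uuid
from typing import Any, Iterator


def make_model_name(prefix: str = "bqml_sample") -> str:
    """Hypothetical stand-in for the "prefixer": build a collision-resistant name."""
    return f"{prefix}_{uuid.uuid4().hex}"


def random_model_id(
    bigquery_client: Any, project_id: str, dataset_id: str
) -> Iterator[str]:
    """Yield a unique model ID, then delete the model once the test is done.

    Assumption: in conftest.py this generator would be decorated with
    @pytest.fixture, so the code after `yield` runs as teardown.
    """
    model_id = f"{project_id}.{dataset_id}.{make_model_name()}"
    yield model_id
    # Clean up; not_found_ok=True keeps teardown quiet if the sample
    # failed before the model was ever written.
    bigquery_client.delete_model(model_id, not_found_ok=True)
```

Because every test run gets its own model ID, two presubmit runs no longer write to bqml_tutorial.sample_model at the same time, which is what triggers the "Concurrent update on same model" error.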
def test_bqml_getting_started():
    # [START bigquery_getting_started_bqml_tutorial]
Let's use bigquery_dataframes_bqml_getting_started for our region tags.
Sounds great, will make edits now.
third_party/geopandas/LICENSE.txt
Outdated
@@ -0,0 +1,25 @@
Copyright (c) 2013-2022, GeoPandas developers.
Revert this
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕