-
Notifications
You must be signed in to change notification settings - Fork 48
fix: read_gbq_table
respects primary keys even when filters
are set
#689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Closes internal issues 338039517 (primary key inconsistency), 338037499 (LIMIT for max_results), 340540991 (avoid running query immediately if time travel is supported), 337925142 (push down column filters to when we create the time travel subquery). feat: `read_gbq_query` supports `filters` perf: use a `LIMIT` clause when `max_results` is set perf: don't run query immediately from `read_gbq_table` if `filters` is set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly looks good, seems we've lost some validation on unknown column names though?
…able-primary_keys
bigframes/session/__init__.py
Outdated
index_cols=index_cols, | ||
columns=columns, | ||
filters=filters, | ||
time_travel_timestamp=time_travel_timestamp, | ||
) | ||
|
||
for key in columns: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TrevorBergeron Which "validation for unknown column names'" are we missing? I still see this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like me to move it to before we create the ibis expression? It's possible ibis will fail with a dry run error when we try to access the columns in it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see tests/system/small/test_session.py::test_read_gbq_w_columns[unknown_col]
failure now. Likely this is the issue. I'll move this validation to after we tech the table metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 87365db
…y_keys' into b338039517-read_gbq_table-primary_keys
…y_keys' into b338039517-read_gbq_table-primary_keys
Closes internal issues 338039517 (primary key inconsistency), 338037499 (LIMIT for max_results), 340540991 (avoid running query immediately if time travel is supported), 337925142 (push down column filters to when we create the time travel subquery).
feat:
read_gbq_query
supportsfilters
perf: use a
LIMIT
clause whenmax_results
is setperf: don't run query immediately from
read_gbq_table
iffilters
is set🦕