to_dataframe fails when fetching timestamp values outside nanosecond bounds #168

Closed
@tswast

Environment details

  • OS type and version: macOS Catalina (10.15.5)
  • Python version: python --version: Python 3.7.6
  • pip version: pip --version: pip 20.0.2
  • google-cloud-bigquery version: pip show google-cloud-bigquery
Name: google-cloud-bigquery
Version: 1.24.0
Summary: Google BigQuery API client library
Home-page: https://github.com/GoogleCloudPlatform/google-cloud-python
Author: Google LLC
Author-email: [email protected]
License: Apache 2.0
Location: /Users/swast/miniconda3/envs/ibis-dev/lib/python3.7/site-packages
Requires: google-cloud-core, google-api-core, google-resumable-media, google-auth, protobuf, six

Code example

Code:

from google.cloud import bigquery
client = bigquery.Client()
# 4567-01-01 is later than any timestamp pandas can store at nanosecond precision.
df = client.query("SELECT TIMESTAMP '4567-01-01 00:00:00' AS `tmp`").to_dataframe()

Stack trace

---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-3-6b8b40790c39> in <module>
----> 1 df = client.query("SELECT TIMESTAMP '4567-01-01 00:00:00' AS `tmp`").to_dataframe()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/google/cloud/bigquery/job.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client)
   3372             dtypes=dtypes,
   3373             progress_bar_type=progress_bar_type,
-> 3374             create_bqstorage_client=create_bqstorage_client,
   3375         )
   3376 

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client)
   1729                 create_bqstorage_client=create_bqstorage_client,
   1730             )
-> 1731             df = record_batch.to_pandas()
   1732             for column in dtypes:
   1733                 df[column] = pandas.Series(df[column], dtype=dtypes[column])

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib._PandasConvertible.to_pandas()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.Table._to_pandas()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, categories, ignore_metadata, types_mapper)
    764     _check_data_column_metadata_consistency(all_columns)
    765     columns = _deserialize_column_index(table, all_columns, column_indexes)
--> 766     blocks = _table_to_blocks(options, table, categories, ext_columns_dtypes)
    767 
    768     axes = [columns, index]

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/pandas_compat.py in _table_to_blocks(options, block_table, categories, extension_columns)
   1100     columns = block_table.column_names
   1101     result = pa.lib.table_to_blocks(options, block_table, categories,
-> 1102                                     list(extension_columns.keys()))
   1103     return [_reconstruct_block(item, columns, extension_columns)
   1104             for item in result]

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.table_to_blocks()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Casting from timestamp[us, tz=UTC] to timestamp[ns] would result in out of bounds timestamp: 81953424000000000
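
For anyone reading the error above: the integer in the message is the queried timestamp in microseconds since the Unix epoch. The standard-library sketch below decodes it and checks whether the same instant fits in a signed 64-bit count of nanoseconds, which is what the timestamp[ns] cast needs. The variable names are only illustrative, and this is an explanation of the failure, not a fix or a description of how pyarrow performs the check internally.

```python
from datetime import datetime, timedelta, timezone

# Value copied from the ArrowInvalid message (microseconds since the Unix epoch).
micros = 81953424000000000

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Decode the microsecond count back into a calendar timestamp.
as_datetime = epoch + timedelta(microseconds=micros)
print(as_datetime.isoformat())  # 4567-01-01T00:00:00+00:00

# A timestamp[ns] value must fit in a signed 64-bit integer of nanoseconds.
INT64_MAX = 2**63 - 1
nanos = micros * 1000
print(nanos > INT64_MAX)  # True: the value overflows, hence the error

# Latest year representable at nanosecond precision.
latest_ns = epoch + timedelta(microseconds=INT64_MAX // 1000)
print(latest_ns.year)  # 2262
```

So the query result itself is valid at microsecond precision; the failure only appears when the column is converted to nanosecond precision during to_pandas().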

Potential solutions

In order of my preference:

Labels

  • api: bigquery (Issues related to the googleapis/python-bigquery API.)
  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
