groupby(as_index=False).agg() causes an error in the Google Colab environment #271


Closed
m-wakatsuru opened this issue Dec 13, 2023 · 1 comment · Fixed by #273
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API.

Comments

@m-wakatsuru


When I call groupby with as_index=False followed by agg() in Google Colab, I get an error.
I have confirmed that each part works on its own: groupby with the as_index parameter but without agg(), and agg() without the as_index parameter, both run without error.

Is this the intended behavior of bigframes, or is it a bug?
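
For comparison, here is how plain pandas handles the same call. This is a sketch on a small made-up DataFrame (the data is an assumption, not taken from the reporter's table), and it only shows pandas behavior, not what bigframes is specified to do. pandas returns four columns: the grouping key plus one column per aggregation.

```python
import pandas as pd

# Made-up sample data; the column names mirror the placeholders in this issue.
df = pd.DataFrame({"column_x": ["a", "a", "b"], "column_y": [1, 3, 5]})

# Same call shape as in the report, run against plain pandas.
result = df.groupby("column_x", as_index=False).agg(
    {"column_y": ["min", "median", "max"]}
)

# The grouping key stays a regular column next to the three aggregates.
print(result.columns.tolist())
print(len(result.columns))
```

If bigframes aims to match pandas here, the call would be expected to succeed rather than raise.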

Environment details

  • OS type and version: Linux (Google Colab)
  • Python version: 3.10.12
  • pip version: 23.1.2
  • bigframes version: 0.14.1

Steps to reproduce

  1. Copy and paste the code example below into a Google Colab notebook.
  2. Replace the {PROJECT_ID}, {DATASET_ID}, and {TABLE_NAME} placeholders with your own values.

Code example

import bigframes.pandas as bpd

bpd.options.bigquery.project = "{PROJECT_ID}"
df = bpd.read_gbq("{PROJECT_ID}.{DATASET_ID}.{TABLE_NAME}")
df_g = df.groupby("column_x", as_index=False).agg({
    "column_y":["min","median","max"]
})
df_g.head()

Stack trace

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-a11b52d5f92f> in <cell line: 1>()
----> 1 df_g = df.groupby("column_x", as_index=False).agg({
      2     "column_y":["min","median","max"]
      3 })
      4 df_g

/usr/local/lib/python3.10/dist-packages/bigframes/core/log_adapter.py in wrapper(*args, **kwargs)
     39         if api_method_name.startswith("__") or not api_method_name.startswith("_"):
     40             add_api_method(api_method_name)
---> 41         return method(*args, **kwargs)
     42 
     43     return wrapper

/usr/local/lib/python3.10/dist-packages/bigframes/core/groupby/__init__.py in agg(self, func, **kwargs)
    243                 return self._agg_string(func)
    244             elif utils.is_dict_like(func):
--> 245                 return self._agg_dict(func)
    246             elif utils.is_list_like(func):
    247                 return self._agg_list(func)

/usr/local/lib/python3.10/dist-packages/bigframes/core/log_adapter.py in wrapper(*args, **kwargs)
     39         if api_method_name.startswith("__") or not api_method_name.startswith("_"):
     40             add_api_method(api_method_name)
---> 41         return method(*args, **kwargs)
     42 
     43     return wrapper

/usr/local/lib/python3.10/dist-packages/bigframes/core/groupby/__init__.py in _agg_dict(self, func)
    287         )
    288         if want_aggfunc_level:
--> 289             agg_block = agg_block.with_column_labels(
    290                 utils.combine_indices(
    291                     pd.Index(column_labels),

/usr/local/lib/python3.10/dist-packages/bigframes/core/blocks.py in with_column_labels(self, value)
    618         label_list = value.copy() if isinstance(value, pd.Index) else pd.Index(value)
    619         if len(label_list) != len(self.value_columns):
--> 620             raise ValueError(
    621                 f"The column labels size `{len(label_list)} ` should equal to the value"
    622                 + f"columns size: {len(self.value_columns)}."

ValueError: The column labels size `3 ` should equal to the valuecolumns size: 4.
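
The message reports 3 labels against 4 value columns. One plausible reading, which is an assumption and not confirmed by the trace, is that with as_index=False the grouping column is counted among the value columns while the generated labels cover only the three aggregations. The sketch below reproduces just that length check with plain Python lists. `with_column_labels` here is a hypothetical stand-in, not the bigframes implementation, and the column names are invented for illustration.

```python
def with_column_labels(value_columns, labels):
    """Hypothetical stand-in for the length check seen in the traceback."""
    if len(labels) != len(value_columns):
        raise ValueError(
            f"The column labels size `{len(labels)}` should equal the value "
            f"columns size: {len(value_columns)}."
        )
    return dict(zip(labels, value_columns))


# Assumption: as_index=False leaves the grouping key among the value columns.
value_columns = ["column_x", "column_y_min", "column_y_median", "column_y_max"]
# Labels built only from the aggregation dict: one per (column, aggfunc) pair.
labels = [("column_y", "min"), ("column_y", "median"), ("column_y", "max")]

try:
    with_column_labels(value_columns, labels)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Under that assumption, adding a label for the grouping column would make the two lengths agree.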
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. label Dec 13, 2023
@TrevorBergeron
Contributor

Thank you for your report. This is indeed a bug when as_index is combined with agg(). I have started work on a fix: #273.
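
Until a release containing the fix is installed, a possible workaround is to aggregate with the default index and then move the grouping key back into a column with reset_index(). The sketch below demonstrates this in plain pandas on made-up data. The report says agg() without as_index works in bigframes 0.14.1, but whether reset_index() on that result produces the same layout in bigframes has not been verified here.

```python
import pandas as pd

# Made-up sample data; the column names mirror the placeholders in this issue.
df = pd.DataFrame({"column_x": ["a", "a", "b"], "column_y": [1, 3, 5]})
aggs = {"column_y": ["min", "median", "max"]}

# Aggregate with the default as_index=True, then turn the key back into a column.
workaround = df.groupby("column_x").agg(aggs).reset_index()

# The direct call, for comparison (this is the form that fails in the report).
direct = df.groupby("column_x", as_index=False).agg(aggs)

print(workaround.columns.tolist() == direct.columns.tolist())
```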

gcf-merge-on-green bot pushed a commit that referenced this issue Dec 19, 2023

Fixes #271 🦕