docs: add llm kmeans notebook as an included example #177
Would you mind adding a "Clean up" section like the one in https://github.com/googleapis/python-bigquery-dataframes/blob/main/notebooks/generative_ai/bq_dataframes_llm_code_generation.ipynb?
nit: We should also be consistent with that notebook's structure and add the Installation, Authentication, etc. sections.
Some of those sections are no longer needed, or are not needed for this notebook, right? For example
Ok! I have added all the sections that I think are needed, including all the code needed for the Vertex AI connection. Luckily, based on discussion with @GarrettWu, it sounds like the clean-up section is not necessary. I have tested the most recent version of the notebook in Colab. Based on discussion with @shobsi, this type of notebook is deny-listed from being run by the e2e tester, so I deny-listed this one as well. It looks like we'll have to rely on manual testing for the time being. Ready for re-review. Thank you!
🤖 I have created a release *beep* *boop*

---

## [0.14.0](https://togithub.com/googleapis/python-bigquery-dataframes/compare/v0.13.0...v0.14.0) (2023-11-14)

### Features

* Add 'cross' join support ([#176](https://togithub.com/googleapis/python-bigquery-dataframes/issues/176)) ([765446a](https://togithub.com/googleapis/python-bigquery-dataframes/commit/765446a929abe1ac076c3037afa7892f64105356))
* Add 'index', 'pad', 'nearest' interpolate methods ([#162](https://togithub.com/googleapis/python-bigquery-dataframes/issues/162)) ([6a28403](https://togithub.com/googleapis/python-bigquery-dataframes/commit/6a2840349a23035bdfdabacd1e231b41bbb5ed7a))
* Add series.sample (identical to existing dataframe.sample) ([#187](https://togithub.com/googleapis/python-bigquery-dataframes/issues/187)) ([37914a4](https://togithub.com/googleapis/python-bigquery-dataframes/commit/37914a4077c681881491f5c36d1a9c9f4255e18f))
* Add unordered sql compilation ([#156](https://togithub.com/googleapis/python-bigquery-dataframes/issues/156)) ([58f420c](https://togithub.com/googleapis/python-bigquery-dataframes/commit/58f420c91d94ca085e9810f36513ffe772bfddcf))
* Log most recent API calls as `recent-bigframes-api-xx` labels on BigQuery jobs ([#145](https://togithub.com/googleapis/python-bigquery-dataframes/issues/145)) ([4ea33b7](https://togithub.com/googleapis/python-bigquery-dataframes/commit/4ea33b7433532ae3a386a6ffa9eb57360ea39526))
* Read_gbq creates order deterministically without table copy ([#191](https://togithub.com/googleapis/python-bigquery-dataframes/issues/191)) ([8ab81de](https://togithub.com/googleapis/python-bigquery-dataframes/commit/8ab81dee4d0eee499094f2dd576550f0c59d7551))
* Support `date_series.astype("string[pyarrow]")` to cast DATE to STRING ([#186](https://togithub.com/googleapis/python-bigquery-dataframes/issues/186)) ([aee0e8e](https://togithub.com/googleapis/python-bigquery-dataframes/commit/aee0e8e2518c59bd1e0b07940c3309871fde8899))
* Support `series.at[row_label] = scalar` ([#173](https://togithub.com/googleapis/python-bigquery-dataframes/issues/173)) ([0c8bd33](https://togithub.com/googleapis/python-bigquery-dataframes/commit/0c8bd33806bb99206b8b12dbdf7d7485c6ffb759))
* Temporary resources no longer use BigQuery Sessions ([#194](https://togithub.com/googleapis/python-bigquery-dataframes/issues/194)) ([4a02cac](https://togithub.com/googleapis/python-bigquery-dataframes/commit/4a02cac88c7d7b46bed1fa813a862fc2ef9ef084))

### Bug Fixes

* All sort operation are now stable ([#195](https://togithub.com/googleapis/python-bigquery-dataframes/issues/195)) ([3a2761f](https://togithub.com/googleapis/python-bigquery-dataframes/commit/3a2761f3c38d0de8b8eda47fffa15b8412aa84b0))
* Default to 7 days expiration for `read_csv`, `read_json`, `read_parquet` ([#193](https://togithub.com/googleapis/python-bigquery-dataframes/issues/193)) ([03606cd](https://togithub.com/googleapis/python-bigquery-dataframes/commit/03606cda30eb7645bfd4534460112dcca56b0ab0))
* Deprecate the `remote_service_type` in llm model ([#180](https://togithub.com/googleapis/python-bigquery-dataframes/issues/180)) ([a8a409a](https://togithub.com/googleapis/python-bigquery-dataframes/commit/a8a409ab0bd1f99dfb442df0703bf8786e0fe58e))
* For reset_index on unnamed multiindex, always use level_[n] label ([#182](https://togithub.com/googleapis/python-bigquery-dataframes/issues/182)) ([f95000d](https://togithub.com/googleapis/python-bigquery-dataframes/commit/f95000d3f88662be4d88c8b0152f1b838e99ec55))
* Match pandas behavior when assigning listlike to empty dfs ([#172](https://togithub.com/googleapis/python-bigquery-dataframes/issues/172)) ([c1d1f42](https://togithub.com/googleapis/python-bigquery-dataframes/commit/c1d1f42a21cc089877f79ebb46a39ddef6958e04))
* Use anonymous dataset instead of session dataset for temp tables ([#181](https://togithub.com/googleapis/python-bigquery-dataframes/issues/181)) ([800d44e](https://togithub.com/googleapis/python-bigquery-dataframes/commit/800d44eb5eb77da5d87b2e005f5a2ed53842e7b5))
* Use random table for `read_pandas` ([#192](https://togithub.com/googleapis/python-bigquery-dataframes/issues/192)) ([741c75e](https://togithub.com/googleapis/python-bigquery-dataframes/commit/741c75e5797e26a1487ff3da76a07953d9537f3f))
* Use random table when loading data for `read_csv`, `read_json`, `read_parquet` ([#175](https://togithub.com/googleapis/python-bigquery-dataframes/issues/175)) ([9d2e6dc](https://togithub.com/googleapis/python-bigquery-dataframes/commit/9d2e6dc1ae4e11e80da4aabe0daa3a6044137cc6))

### Documentation

* Add code samples for `read_gbq_function` using community UDFs ([#188](https://togithub.com/googleapis/python-bigquery-dataframes/issues/188)) ([7506eab](https://togithub.com/googleapis/python-bigquery-dataframes/commit/7506eabf2e58159507809e36abfe90c417dfe92f))
* Add docstring code samples for `Series.apply` and `DataFrame.map` ([#185](https://togithub.com/googleapis/python-bigquery-dataframes/issues/185)) ([c816d84](https://togithub.com/googleapis/python-bigquery-dataframes/commit/c816d843e6f3c5a944cd4395ed0e1e91cec49812))
* Add llm kmeans notebook as an included example ([#177](https://togithub.com/googleapis/python-bigquery-dataframes/issues/177)) ([d49ae42](https://togithub.com/googleapis/python-bigquery-dataframes/commit/d49ae42a379fafd601cc94227e7f8f14b3d5f8c3))
* Use `head()` to get top `n` results, not to preview results ([#190](https://togithub.com/googleapis/python-bigquery-dataframes/issues/190)) ([87f84c9](https://togithub.com/googleapis/python-bigquery-dataframes/commit/87f84c9e58e7d0ea521ac386c9f02791cdddd19f))

---

This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
Original notebook written by Jiaxunwu.
I have tested the notebook by running it on Colab with a test prod project. Any feedback on content or format would be appreciated. I changed very little of the code from Jiaxun's original. Thanks!