@jqnatividad
Collaborator

No description provided.

…es in Examples column

Remove trailing parenthetical counts from values like "Other (n)" when f.rank == 0.0 so the parenthetical count isn't duplicated (the numeric count is already shown separately). Introduce raw_value and apply truncation to the cleaned value before formatting (uses rfind(" (") to strip the last parenthetical), preserving the displayed [count].
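The stripping step described in this commit message could look roughly like the sketch below. It is not the code from `src/cmd/describegpt.rs`: the `FreqEntry` struct, its field names, and the `clean_value` and `render_example` helpers are hypothetical, and truncation is left out. Only the `rank == 0.0` check, the `rfind(" (")` call, and the trailing `[count]` come from the description above.

```rust
/// Hypothetical stand-in for one frequency entry; the field names are assumptions.
struct FreqEntry<'a> {
    value: &'a str, // raw value, e.g. "Other (123)" for the rank-0 bucket
    count: &'a str, // count already formatted for display, e.g. "4,091"
    rank: f64,      // 0.0 marks the "Other" bucket
}

/// Strip the last " (...)" suffix from rank-0 values; leave all other values untouched.
fn clean_value<'a>(f: &FreqEntry<'a>) -> &'a str {
    let raw_value: &'a str = f.value;
    if f.rank == 0.0 {
        // rfind returns the byte offset of the LAST " (" in the string, if any.
        if let Some(pos) = raw_value.rfind(" (") {
            return &raw_value[..pos];
        }
    }
    raw_value
}

/// Render the cleaned value followed by the separately displayed count.
fn render_example(f: &FreqEntry<'_>) -> String {
    format!("{} [{}]", clean_value(f), f.count)
}
```

Because `rfind` matches the last occurrence, a rank-0 label that itself contains parentheses loses only its final parenthetical. Values with any other rank pass through unchanged, even if they end in a parenthetical.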
@jqnatividad jqnatividad changed the title from fix(describegpt): remove confusing redundant count for "Other to fix(describegpt): remove confusing redundant count for "Other" values in the Data Dictionary Examples column on Feb 11, 2026
@jqnatividad jqnatividad requested a review from Copilot February 11, 2026 00:03
Contributor

Copilot AI left a comment

Pull request overview

This PR updates describegpt’s code-based dictionary generation to avoid showing a redundant “unique values” parenthetical in the Examples column for rank-0 (“Other”) frequency buckets, and refreshes the corresponding documentation examples.

Changes:

  • Strip the trailing (n) portion from rank-0 frequency values (e.g., Other (123) → Other) when rendering Examples.
  • Regenerate/refresh describegpt documentation outputs to reflect the new Examples formatting.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/cmd/describegpt.rs | Adjusts Examples rendering for rank-0 (“Other”) bucket values to drop the redundant parenthetical count. |
| docs/describegpt/nyc311-describegpt.md | Updates generated example output; now shows Other [...] instead of Other (n) [...]. |
| docs/describegpt/nyc311-describegpt-spanish.md | Same refresh for Spanish example output. |
| docs/describegpt/nyc311-describegpt-mandarin.md | Same refresh for Mandarin example output. |

…r" in Examples column

Append "…" to frequency bucket entries (rank=0) after stripping the
parenthetical count, so bucket "Other… [4,091]" is visually distinct
from a literal data value "Other [2,006]". This addresses the Copilot
review feedback on PR #3449 about ambiguous output in fields like
Taxi Pick Up Location.

- Add ellipsis suffix to bucket entries for disambiguation
- Add describegpt_other_bucket_disambiguation test
- Update all sample doc files with the new Other… format

Co-Authored-By: Claude Opus 4.6 <[email protected]>
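
As a rough illustration of the disambiguation this commit describes, the sketch below appends "…" to rank-0 bucket labels after stripping the parenthetical count. The `bucket_label` function and its signature are hypothetical and not taken from `src/cmd/describegpt.rs`; the raw label format "Other (4,091)" is also an assumption based on the "Other (n)" pattern mentioned in this PR.

```rust
/// Hypothetical helper: build the display label for an Examples entry.
/// Rank-0 entries are treated as the aggregated bucket and get an ellipsis suffix.
fn bucket_label(raw_value: &str, rank: f64) -> String {
    if rank == 0.0 {
        // Drop the last " (...)" suffix if present, otherwise keep the value as is.
        let base = raw_value
            .rfind(" (")
            .map_or(raw_value, |pos| &raw_value[..pos]);
        // The ellipsis distinguishes the bucket from a literal "Other" data value.
        format!("{}…", base)
    } else {
        raw_value.to_string()
    }
}
```

With this sketch, a rank-0 input of "Other (4,091)" yields "Other…", while a literal data value "Other" with a non-zero rank stays "Other", so the two remain distinguishable once the `[count]` is appended.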
@jqnatividad jqnatividad merged commit 3bdd3e5 into master Feb 11, 2026
15 of 16 checks passed
@jqnatividad jqnatividad deleted the describegpt-datadictionary-other_count branch February 11, 2026 00:53