Commit e0b8e7a

Merge pull request #22331 from mgmur/add-blurb-about-hash-collisions-to-uniqueCount-docs
Add note about hash collisions for uniqueCount below 256
2 parents 62d5645 + f9d8341 commit e0b8e7a

1 file changed: 4 additions, 0 deletions

src/content/docs/nrql/nrql-syntax-clauses-functions.mdx

Lines changed: 4 additions & 0 deletions
@@ -2120,6 +2120,10 @@ SELECT histogram(duration, 10, 20) FROM PageView SINCE 1 week ago
Use the `uniqueCount()` function to get the number of unique values recorded for an attribute over a specified time range. To count the unique combinations of multiple attribute values, specify those attributes with the function. You can include up to 32 attributes. This function provides an exact result for up to 256 unique values when you call it without the `precision` argument. For more than 256 unique values, the result is approximate. You can specify a `precision` value within the range of 256 to 50,000 to increase the threshold for exact results. When unique values exceed the set threshold, the function uses the [HyperLogLog probabilistic data structure](https://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf) to provide an approximate result.

+<Callout variant="tip" title="Note">
+**How exact counting works under 256 values**: When counting fewer than 256 unique string values, `uniqueCount()` uses a hash-based counting method to determine uniqueness. This method generates a 32-bit integer hash code from the characters in each string. While this approach efficiently distributes hash values for most strings, hash collisions can occasionally occur, meaning two different strings may generate the same hash code. This is more likely to happen when counting strings that are extremely similar to each other. In rare cases where a hash collision occurs, the function may slightly undercount the true number of unique values.
+</Callout>
+
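The undercount described in the callout can be illustrated with a small sketch. The docs do not say which hash function `uniqueCount()` uses, so this example assumes a Java-style 32-bit polynomial string hash purely for illustration; `java_style_hash` and `count_unique_by_hash` are hypothetical helper names, not part of NRQL.

```python
def java_style_hash(s: str) -> int:
    """Hypothetical stand-in hash: h = 31*h + code point, wrapped to a signed 32-bit int."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed integer.
    return h - 0x100000000 if h >= 0x80000000 else h


def count_unique_by_hash(values):
    """Count distinct hash codes instead of distinct strings."""
    return len({java_style_hash(v) for v in values})


values = ["Aa", "BB", "Cc"]
exact = len(set(values))                # distinct strings
hashed = count_unique_by_hash(values)   # distinct 32-bit hash codes

print(java_style_hash("Aa"), java_style_hash("BB"))  # both strings hash to 2112
print(exact, hashed)                                  # 3 distinct strings, 2 distinct hashes
```

Under this assumed hash, the similar strings `"Aa"` and `"BB"` share a hash code, so counting by hash reports one fewer unique value than counting the strings themselves.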
Use the `uniqueCount()` function by specifying the attributes and optionally setting the `precision` argument as follows:

```sql
