Skip to content

Use putUnencodedChars for Funneling strings.#729

Merged
RustedBones merged 4 commits intospotify:mainfrom
LeifW:patch-1
Apr 27, 2023
Merged

Use putUnencodedChars for Funneling strings.#729
RustedBones merged 4 commits intospotify:mainfrom
LeifW:patch-1

Conversation

@LeifW
Copy link
Contributor

@LeifW LeifW commented Apr 20, 2023

As per the warning on PrimitiveSink.putString JavaDoc:

Warning: This method, which reencodes the input before processing it, is useful only for cross-language compatibility. For other use cases, prefer putUnencodedChars(java.lang.CharSequence), which is faster, produces the same output across Java releases, and processes every char in the input, even if some are invalid.

Or maybe leave this out and let people make their own decision as to what they want for the Funnel[String] instance?

As per the warning on [PrimitiveSink.putString](https://javadoc.io/doc/com.google.guava/guava/latest/com/google/common/hash/PrimitiveSink.html#putString(java.lang.CharSequence,java.nio.charset.Charset)) JavaDoc:
> Warning: This method, which reencodes the input before processing it, is useful only for cross-language compatibility. For other use cases, prefer putUnencodedChars(java.lang.CharSequence), which is faster, produces the same output across Java releases, and processes every char in the input, even if some are invalid.

Or maybe leave this out and let people make their own decision as to what they want for the `Funnel[String]` instance?

implicit val booleanFunnel: Funnel[Boolean] = funnel[Boolean](_.putBoolean(_))
implicit val stringFunnel: Funnel[String] = funnel[String](_.putString(_, Charsets.UTF_8))
implicit val stringFunnel: Funnel[String] = Funnels.unencodedCharsFunnel.asInstanceOf[Funnel[String]]
Copy link
Contributor Author

@LeifW LeifW Apr 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess instead of using the cast, this could also be funnel[String](_.putUnencodedChars(_))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wrote our own typeclass for this (that was equivalent to Funnel but defined in Scala), we could use Scala's SAM support to define instances, like implicit val stringFunnel: OurFunnel[String] = _ putUnencodedChars _, add methods to it like contramap (which is what the by method in here seems to be?), widen, add a variance annotation to it, etc... which could then be converted to a Guava Funnel at use-site?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Funnel is invariant in the type param, maybe we could define helpers like other libraries have, e.g.

def narrowFunnel[AA, A <: AA](funnel: Funnel[AA]): Funnel[A] = funnel.asInstanceOf[Funnel[A]]

so then stringFunnel could be defined as:

implicit val stringFunnel: Funnel[String] = narrowFunnel(Funnels.unencodedCharsFunnel)

?
And maybe even something to not need the cast on the "unboxed" funnels like Int, Long, etc...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a simple solution would be to define a single implicit def for all CharSequence instances

  implicit def charSequenceFunnel[T <: CharSequence]: Funnel[T] =
    Funnels.unencodedCharsFunnel().asInstanceOf[Funnel[T]]

@RustedBones RustedBones merged commit 2a54755 into spotify:main Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants