Skip to content

Performance of Constant string printing [ci: last-only] #10897

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 18, 2024

Conversation

som-snytt
Copy link
Contributor

@som-snytt som-snytt commented Oct 23, 2024

Use a StringBuilder to build a String. Since quotes are always added, it's not obvious that string add is worthwhile after chars have been copied to the builder. Could also prescan for an escapable char.

Fixes scala/scala-dev#878

@scala-jenkins scala-jenkins added this to the 2.13.16 milestone Oct 23, 2024
@SethTisue SethTisue requested a review from retronym October 23, 2024 21:49
@SethTisue
Copy link
Member

SethTisue commented Oct 23, 2024

@retronym want to test this out on the customer codebase...?

@SethTisue SethTisue added support context is a commercial support case performance the need for speed. usually compiler performance, sometimes runtime performance. labels Oct 23, 2024
@som-snytt som-snytt marked this pull request as ready for review October 23, 2024 22:18
@retronym
Copy link
Member

retronym commented Oct 24, 2024

Some context -- this method is called indirectly by Zinc's Extract API phase to generate its representation of annotations.

@som-snytt som-snytt changed the title Performance of Constant string printing Performance of Constant string printing [ci: last-only] Oct 24, 2024
@som-snytt som-snytt marked this pull request as draft October 24, 2024 15:10
@som-snytt
Copy link
Contributor Author

som-snytt commented Oct 24, 2024

The last commit is, "pre-construct with string add to enclose in quotes" then "expand where necessary using replace".

That optimizes for the common case, where the hit for expansion (escaping) is array copy (which only sounds bad).

The savings is the append per char (which is bad).

The assumption is that the pre-construction is optimal, so it relies on the platform.

@som-snytt som-snytt marked this pull request as ready for review October 24, 2024 15:34
Copy link
Member

@SethTisue SethTisue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'm not sure whether to merge without Jason's approval

@SethTisue SethTisue assigned retronym and unassigned SethTisue Oct 25, 2024
@som-snytt
Copy link
Contributor Author

I'd like @retronym guess about the last commit and comment. I didn't write a test to estimate what ratio of escapes would make it less desirable, but maybe annotations have fewer of them.

The other choice is prescan for escapable chars; then only strings with escapes must suffer construction costs. Maybe I'll add that while we're waiting.

@som-snytt
Copy link
Contributor Author

Actually I saw retronym doing thyme timings at c3fcd4e and felt bad for being lazy, so I will try to measure if this is useful.

I was at that commit because while exercising -Wnonunit-statement, I noticed the use of StringBuffer instead of StringBuilder. It was like a blast from the past.

Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm late to the party, so don't know what was said in the welcome speech..

@som-snytt som-snytt force-pushed the sd/878-constant-string branch from 92a7e7e to 7ed7368 Compare November 15, 2024 13:45
@som-snytt
Copy link
Contributor Author

@lrytz squashed, rebased, unwound experiment using replace (I haven't tested performance yet), added pre-check for whether there are escapes to expand.

Probably the ticket doesn't require extreme performance.

Using string concat (stradd) by default assumes that, while we must be JDK 8 compat, hopefully it's using modern intrinsic for adding quote chars, which I assume to be array copy of underlying chars. However, also untested.

@lrytz lrytz merged commit ff5c154 into scala:2.13.x Nov 18, 2024
3 checks passed
@som-snytt
Copy link
Contributor Author

som-snytt commented Nov 18, 2024

[info] Benchmark                   Mode  Cnt      Score     Error  Units
[info] ConstantBenchmark.measure   avgt   40  14007.372 ± 176.270  ns/op
[info] ConstantBenchmark.measure0  avgt   40  32518.810 ± 318.486  ns/op

I tested by moving the SymbolTable4Testing from junit to testkit. I'll PR that in case of utility.

You were wise to merge before I benched the replace version.

Small improvement by sizing a string builder for simple wrapping in quotes:

[info] ConstantBenchmark.measure  avgt   40  13924.241 ± 96.251  ns/op

This is with a 3000 char string, so probably it allocates a big array plus margin when appending; maybe it must arraycopy on toString; is there no arraycopy if exactly sized?

@som-snytt som-snytt deleted the sd/878-constant-string branch November 18, 2024 16:42
@lrytz
Copy link
Member

lrytz commented Nov 20, 2024

Thank you for the benchmark!

is there no arraycopy if exactly sized?

There must be a copy, the StringBuilder can continue to be used. Unless they do something fancy, like a dirty bit in the string builder and lazy array copy, but I didn't see anything like that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance the need for speed. usually compiler performance, sometimes runtime performance. support context is a commercial support case
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize Constant.escape
5 participants