[C++20] [Modules] Named Module Units would generate initializers unconditionally #56794
Comments
@llvm/issue-subscribers-c-20
@llvm/issue-subscribers-clang-modules
I just ran into this (Clang 17) and was about to open a new issue, but found this one instead, so I'd like to leave my comment on the matter here. While inspecting the disassemblies of a program generated with meta-programming and a handwritten one that does exactly the same thing, I noticed the following: because the meta-programming facilities live in modules, the first disassembly is filled with module initializers that effectively do nothing. Not only does this result in consistently, if only slightly, worse total runtimes and cache statistics (valgrind), it also results in much larger binaries (3.5 times larger, which until now I had attributed to the long mangled names from the templates). The disassembly itself has almost 5 times as many lines. I could observe this with -flto=thin and -flto=full, as well as with the linkers lld and mold. So in short: startup performance takes a hit, binary sizes bloat, and LTO does not mitigate it. Ideally, if the module initializer does nothing, there wouldn't be one to begin with, and perhaps linking against the object file could be omitted, as it does not contain any meaningful information. I imagine this could also help build speeds.
Sorry for forgetting to update this. The consensus is that the initializer should always be emitted, since this is part of the ABI; otherwise, object files compiled by Clang could not be linked with object files compiled by GCC. However, callers are free to elide the call to an initializer if the compiler can make sure that the call is unnecessary. (Not implemented in Clang yet.) As for the runtime performance and code size issues, I feel your use case may be too special. I've tested modules with our server use cases: the performance looks good and code size goes down (except for BMI size). The conclusion is that this may be optimized, but I guess it can't achieve your goals.
…t init anything Close llvm#56794 And see llvm#67582 for detailed background on the issue. As required by the Itanium ABI, module units have to generate the initialization function. However, importers are allowed to elide the call to the initialization function if they are sure it doesn't do anything. This patch implements these semantics.
… init anything (#67638) Close #56794 And see #67582 for detailed background on the issue. As required by the Itanium ABI, module units have to generate the initialization function. However, importers are allowed to elide the call to the initialization function if they are sure it doesn't do anything. This patch implements these semantics.
… init anything (llvm#67638) Close llvm#56794 And see llvm#67582 for detailed background on the issue. As required by the Itanium ABI, module units have to generate the initialization function. However, importers are allowed to elide the call to the initialization function if they are sure it doesn't do anything. This patch implements these semantics.
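The elision rule described in the commit message can be modeled in plain C++. This is only a conceptual sketch, not Clang's implementation: `ModuleInfo`, `has_nontrivial_init`, and `initializers_to_call` are hypothetical names, and the assumption that an importer can learn from the BMI whether an imported module's initializer does any work is made purely for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical view of an imported module unit from the importer's side.
struct ModuleInfo {
    std::string name;
    // Assumption: the importer can tell (e.g. from the BMI) whether the
    // module's initializer actually initializes anything.
    bool has_nontrivial_init;
};

// Returns the names of the imported modules whose initializers the importer
// would still call. Every module is assumed to emit its initializer
// regardless; only the *call* may be skipped by the importer.
std::vector<std::string> initializers_to_call(
    const std::vector<ModuleInfo>& imports, bool elide_trivial) {
    std::vector<std::string> calls;
    for (const ModuleInfo& m : imports) {
        if (elide_trivial && !m.has_nontrivial_init) {
            continue;  // known no-op initializer: skip the cross-TU call
        }
        calls.push_back(m.name);
    }
    return calls;
}
```

With `elide_trivial` set to `false`, this models the behavior originally reported, where every import costs one call at startup. With it set to `true`, only modules that really initialize something are called.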
I am seeing unnecessary calls to module initializers again, and I have been for a while.
Would you like to provide a reproducer? It will be helpful.
For the following line
and option
It would generate:
This is bad. A.cppm contains nothing, yet it generates actual code. It gets worse if we import something:
Then the generated code would be:
Yeah, it calls the initializer from A. This is a call across TUs, so it can't be inlined without LTO. As a result, startup performance may degrade if there are many imports.
This is not about correctness, though.