
Towards faster symbol lookup via DT_GNU_HASH #73855


Open
LifeIsStrange opened this issue Jun 28, 2020 · 17 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@LifeIsStrange

LifeIsStrange commented Jun 28, 2020

As this blog post explains (https://flapenguin.me/elf-dt-gnu-hash), ELF has an alternative hash table for symbol lookup that is up to 50% faster!

Therefore, rustc should pass the following argument to the linker: --hash-style=gnu

Also: this saves a few hundred bytes to a few kilobytes for typical executables. (DT_HASH is usually larger than DT_GNU_HASH because it does not skip undefined dynsym entries.)
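For reference, the two tables are keyed by different string hashes. Below is a minimal sketch in Rust, assuming the usual definitions: the "times 33" hash seeded with 5381 for DT_GNU_HASH, and the 32-bit form of the System V ABI hash for DT_HASH. The function names are illustrative only, and the sketch covers just the hash functions, not the table layout or the lookup itself.

```rust
/// Hash used for DT_GNU_HASH lookups: h = h * 33 + c, seeded with 5381,
/// computed with 32-bit wrapping arithmetic.
pub fn gnu_hash(name: &[u8]) -> u32 {
    name.iter().fold(5381u32, |h, &c| {
        h.wrapping_mul(33).wrapping_add(u32::from(c))
    })
}

/// Hash used for DT_HASH lookups (32-bit form of the System V ABI hash).
pub fn sysv_hash(name: &[u8]) -> u32 {
    let mut h: u32 = 0;
    for &c in name {
        h = (h << 4).wrapping_add(u32::from(c));
        // Fold the top four bits back into the low bits, then clear them.
        let g = h & 0xf000_0000;
        if g != 0 {
            h ^= g >> 24;
        }
        h &= !g;
    }
    h
}
```

With these definitions, `gnu_hash(b"exit")` is `0x7c967e3f` and `sysv_hash(b"printf")` is `0x077905a6`.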

@jonas-schievink jonas-schievink added A-linkage Area: linking into static, shared libraries and binaries C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 28, 2020
@tesuji
Contributor

tesuji commented Jun 29, 2020

Is this a static linker argument?
Does it make rustc's own binaries, or the ones it generates, incompatible with old Linux/GNU systems?

@LifeIsStrange
Author

@lzutao
https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Linux.cpp
Support is documented here:

  // Do not use 'gnu' hash style for Mips targets because .gnu.hash
  // and the MIPS ABI require .dynsym to be sorted in different ways.
  // .gnu.hash needs symbols to be grouped by hash code whereas the MIPS
  // ABI requires a mapping between the GOT and the symbol table.
  // Android loader does not support .gnu.hash until API 23.
  // Hexagon linker/loader does not support .gnu.hash
  if (!IsMips && !IsHexagon) {
    if (Distro.IsRedhat() || Distro.IsOpenSUSE() || Distro.IsAlpineLinux() ||
        (Distro.IsUbuntu() && Distro >= Distro::UbuntuMaverick) ||
        (IsAndroid && !Triple.isAndroidVersionLT(23)))
      ExtraOpts.push_back("--hash-style=gnu");

    if (Distro.IsDebian() || Distro.IsOpenSUSE() ||
        Distro == Distro::UbuntuLucid || Distro == Distro::UbuntuJaunty ||
        Distro == Distro::UbuntuKarmic ||
        (IsAndroid && Triple.isAndroidVersionLT(23)))
      ExtraOpts.push_back("--hash-style=both");
  }

So every target is supported except Android below API 23, MIPS and Hexagon.
It would be beneficial for Rust to enable it for more targets, as many distros will never hear about this despite it being a net improvement.

Rust has already done such a thing in the past with relro support #29877
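To make the quoted Clang logic easier to reason about, here is a rough Rust paraphrase of it as a pure function. This is only a sketch: `Distro`, `Target` and `hash_style_opts` are hypothetical names, Ubuntu releases are modelled as (year, month) pairs (Jaunty 9.04, Karmic 9.10, Lucid 10.04, Maverick 10.10), and the conditions are copied from the snippet as quoted, including the fact that openSUSE matches both branches.

```rust
/// Hypothetical model of the distro checks in the quoted snippet.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Distro {
    RedHat,
    OpenSuse,
    Alpine,
    Debian,
    /// Ubuntu release as (year, month), e.g. (10, 10) for Maverick.
    Ubuntu(u32, u32),
    /// Any distro the snippet does not mention.
    Other,
}

/// Hypothetical model of the target properties the snippet inspects.
#[derive(Clone, Copy, Debug)]
pub struct Target {
    pub is_mips: bool,
    pub is_hexagon: bool,
    /// `Some(api_level)` for Android targets, `None` otherwise.
    pub android_api: Option<u32>,
    pub distro: Distro,
}

/// Returns the `--hash-style=` options the quoted snippet would push, in order.
pub fn hash_style_opts(t: &Target) -> Vec<&'static str> {
    let mut opts = Vec::new();
    // MIPS and Hexagon never get an explicit hash style.
    if t.is_mips || t.is_hexagon {
        return opts;
    }
    let ubuntu_maverick_or_later =
        matches!(t.distro, Distro::Ubuntu(y, m) if (y, m) >= (10, 10));
    let ubuntu_jaunty_karmic_lucid = matches!(
        t.distro,
        Distro::Ubuntu(9, 4) | Distro::Ubuntu(9, 10) | Distro::Ubuntu(10, 4)
    );
    let android_23_or_later = matches!(t.android_api, Some(api) if api >= 23);
    let android_before_23 = matches!(t.android_api, Some(api) if api < 23);

    if matches!(t.distro, Distro::RedHat | Distro::OpenSuse | Distro::Alpine)
        || ubuntu_maverick_or_later
        || android_23_or_later
    {
        opts.push("--hash-style=gnu");
    }
    if matches!(t.distro, Distro::Debian | Distro::OpenSuse)
        || ubuntu_jaunty_karmic_lucid
        || android_before_23
    {
        opts.push("--hash-style=both");
    }
    opts
}
```

For a distro the snippet does not list (`Distro::Other` here), the function returns no option at all, which leaves the choice to whatever default the linker was built with.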

@tesuji
Contributor

tesuji commented Jun 29, 2020

@tmiasko
Contributor

tmiasko commented Jun 29, 2020

For most targets, rustc invokes the linker through a compiler driver, which already configures --hash-style=gnu when appropriate.

@LifeIsStrange
Author

@tmiasko It would be nice to link the related code but great to hear that! Feel free to close the issue then :)

@mati865
Contributor

mati865 commented Jun 29, 2020

@LifeIsStrange in other words: Rust does not call the linker directly but instead invokes the system C compiler (GCC or Clang). That compiler will use the proper --hash-style (as in the snippet that you pasted above).

@LifeIsStrange
Author

@mati865 oh, thanks for the clarification. I still believe that this is too conservative, to the point of being harmful, as many distros are not listed, e.g. Arch Linux.

@mati865
Contributor

mati865 commented Jun 29, 2020

@LifeIsStrange this is arguably Clang's issue; another issue is Distro.IsOpenSUSE() appearing twice...
GCC is not affected: https://git.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/gcc&id=a9fc6ff951ae117dd8ca01506c331e8c2c843b64#n83

@LifeIsStrange
Author

LifeIsStrange commented Jun 29, 2020

@mati865 I believe that it's easier to change Rust than it is to change Clang, but if this is what the Rust community wants, we can close this thread and wait for upstream instead of overriding its current behavior.
I don't think it would be a bold move, especially since GCC seems to already have it by default.

@mati865
Contributor

mati865 commented Jun 29, 2020

if this is what the Rust community wants, we can close this thread

Well, there are plans to link directly with LLD at some point, and in that mode Rust will have to choose the hash style itself. When using another compiler as the linker, IMO Rust should follow what that compiler does, but let's cc Rust's linkage expert @petrochenkov

wait for upstream instead of overriding its current behavior.

That code in Clang hasn't been updated for years, and I doubt anybody is going to touch that mess soon...

@tmiasko
Contributor

tmiasko commented Jun 29, 2020

@LifeIsStrange could you clarify which OS, distro, linker, and compilation flags you are using for which the GNU hash section is not generated?

@LifeIsStrange
Author

@tmiasko I'm on Manjaro (an Arch Linux derivative). I did not test it, but from looking at the Linux.cpp code in LLVM it seems clear that it is not enabled.

Is there a command to check whether a binary has a .gnu.hash section (and, ideally, that the classic .hash section is absent so that it doesn't needlessly take up space)?

@tmiasko
Contributor

tmiasko commented Jun 29, 2020

On Arch, rustc will use GCC as the compiler driver, which, as mati865 indicated earlier, is configured to use --hash-style=gnu. For example, rustc-macros from the Arch package repository:

$ readelf -S /usr/lib/librustc_macros-8398e2791a3728ac.so | grep -A1 -i hash
  [ 2] .gnu.hash         GNU_HASH         0000000000000298  00000298
         0000000000005df8  0000000000000000   A       3     0     8

@mati865
Contributor

mati865 commented Jun 30, 2020

@tmiasko Clang is using --hash-style=both though:

$ gcc hello.c -o hello && readelf -S hello | grep -A1 -i hash
  [ 4] .gnu.hash         GNU_HASH         0000000000000308  00000308
       000000000000001c  0000000000000000   A       5     0     8
$ clang hello.c -o hello && readelf -S hello | grep -A1 -i hash
  [ 3] .hash             HASH             00000000000002e8  000002e8
       0000000000000030  0000000000000004   A       5     0     8
  [ 4] .gnu.hash         GNU_HASH         0000000000000318  00000318
       000000000000001c  0000000000000000   A       5     0     8

I'll prepare a patch for Clang when I have time, unless I forget.

@LifeIsStrange
Author

LifeIsStrange commented Jun 30, 2020

@tmiasko On Manjaro, with a binary compiled on my system with nightly Rust, I get:

$ readelf -S myBinary | grep -A1 -i hash
  [ 4] .gnu.hash         GNU_HASH         0000000000000340  00000340
       0000000000000050  0000000000000000   A       5     0     8

So, as you said @tmiasko, it seems to work fine for Rust, at least on Arch.

@nagisa
Member

nagisa commented Jul 4, 2020

I can imagine people wanting to both force turn-on and turn-off this functionality. This could be implemented as a -C level rustc flag.

@bjorn3
Member

bjorn3 commented Jul 13, 2023

The official rustc 1.70.0 build for x86_64-unknown-linux-gnu uses DT_GNU_HASH and no DT_HASH for the standard library. User-compiled code will use whatever the default for their distro is, unless they explicitly override it.

I can imagine people wanting to both force turn-on and turn-off this functionality. This could be implemented as a -C level rustc flag.

-Clink-arg=--hash-style=sysv/gnu/both is enough for this, right? I don't see a dedicated option as being warranted.
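As a small illustration of how such a flag could be assembled, here is a hedged Rust sketch. `hash_style_flag` is a hypothetical helper, not part of rustc or Cargo. It assumes that when rustc links through a C compiler driver (GCC or Clang, as discussed earlier in the thread) the linker option has to be wrapped in the driver's `-Wl,` prefix, while the bare form applies when the linker is invoked directly.

```rust
/// Builds a rustc command-line flag that forwards `--hash-style=<style>`
/// to the linker. Returns `None` for styles other than sysv, gnu or both.
///
/// `via_cc_driver` is an assumption of this sketch: set it when rustc links
/// through gcc/clang, so that the option is wrapped in `-Wl,`.
pub fn hash_style_flag(style: &str, via_cc_driver: bool) -> Option<String> {
    if !matches!(style, "sysv" | "gnu" | "both") {
        return None;
    }
    let prefix = if via_cc_driver { "-Wl," } else { "" };
    Some(format!("-Clink-arg={}--hash-style={}", prefix, style))
}
```

For example, `hash_style_flag("gnu", true)` yields `-Clink-arg=-Wl,--hash-style=gnu`, which could then be passed to rustc directly or through the RUSTFLAGS environment variable.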
