Skip to content

Enable -Wassign-enum in the Cirrus clang build (plus two fixes) #10641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

MaxKellermann
Copy link
Contributor

See #10617 (review)

I've submitted the two fixes to PHP-8.1 as well: #10640 - if that PR gets merged, I'll drop these commits here. You decide.

@MaxKellermann
Copy link
Contributor Author

Some code fails with this warning - e.g. enums are (ab)used by the JIT and others as bit fields, which is not correct.
That should be fixed - but I still suggest merging this now, so everybody can see what needs to be fixed.

@iluuu1994
Copy link
Member

That should be fixed - but I still suggest merging this now, so everybody can see what needs to be fixed.

This is the only ASAN build for push, so I don't think it should be failing, especially in the build step.

@MaxKellermann
Copy link
Contributor Author

How would you suggest fixing those bit fields?

@MaxKellermann
Copy link
Contributor Author

Dropped the two fix commits because #10640 was merged to PHP-8.1.

@iluuu1994
Copy link
Member

How would you suggest fixing those bit fields?

I was hoping there was an attribute to specify whether an enum could be bit-wised but it doesn't seem so... Depends on how many errors there are. I think it would be fair to convert those cases to ints, and specifying the enum type in a comment. If there are many such cases (i.e. dozens) we should discuss this first.

@MaxKellermann
Copy link
Contributor Author

I tried to find out whether using enums as bit fields is even allowed in the standard, and apparently, the standard doesn't disallow it. Though I think using real bit fields would be preferable.

While this clang warning is very useful to find obvious mistakes (like the ones I already fixed), it can easily be silenced by adding an explicit cast. This adds some clutter to the code for a warning workaround, but in this case (if bit fields are not desired) I think it's worth it.

@MaxKellermann
Copy link
Contributor Author

I don't think it should be failing

I've added workarounds. Let's see if I've found them all. Waiting for CI to finish.

@MaxKellermann
Copy link
Contributor Author

Now the compiler succeeds, but the linker fails:

/tmp/cirrus-ci-build/ext/intl/intl_convertcpp.cpp:28: undefined reference to `__ubsan_vptr_type_cache'
/usr/bin/ld: /tmp/cirrus-ci-build/ext/intl/intl_convertcpp.cpp:28: undefined reference to `__ubsan_handle_dynamic_type_cache_miss'
/usr/bin/ld: /tmp/cirrus-ci-build/ext/intl/intl_convertcpp.cpp:36: undefined reference to `__ubsan_vptr_type_cache'

I think this is due to patch "set CXXFLAGS as well", and I believe PHP needs to call clang++ instead of clang for linking if a C++ object is being linked. But that's out of this PR's scope, so I'll drop this patch for now. It fixes a mistake, but I'll figure this out later in another PR.

@iluuu1994
Copy link
Member

@MaxKellermann Sounds good, thank you!

@MaxKellermann
Copy link
Contributor Author

That worked. ASAN_DEBUG_NTS build succeeded (only the tests are still running atm).

@MaxKellermann
Copy link
Contributor Author

but I'll figure this out later in another PR.

#10663

@dstogov
Copy link
Member

dstogov commented Feb 22, 2023

C allows using bitmasks in enums. Instead of fixing the problems by typecasting hacks, it's better to include the necessary bits into the enums declarations.

The bigger problem that -Wassign-enum is not available in GCC, and will probably never available (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67314 ). They think - "clang is wrong to implement it for c".

@MaxKellermann
Copy link
Contributor Author

better to include the necessary bits into the enums declarations.

The problem here is the inverted masks (for bit-wise AND). You don't want to add inverted versions of all flags, do you?

The bigger problem that -Wassign-enum is not available in GCC

I don't think that alone is a problem. If a warning option is only available in clang and we decide it's a benefit (because it finds bugs - and indeed this one found a few in PHP), we should take advantage of it.

@iluuu1994
Copy link
Member

iluuu1994 commented Feb 22, 2023

I don't think that alone is a problem. If a warning option is only available in clang and we decide it's a benefit (because it finds bugs - and indeed this one found a few in PHP), we should take advantage of it.

I agree with this. I understand it's annoying to have a failing build after building successfully locally but that's still better than a potential regression.

@dstogov
Copy link
Member

dstogov commented Feb 22, 2023

better to include the necessary bits into the enums declarations.

The problem here is the inverted masks (for bit-wise AND). You don't want to add inverted versions of all flags, do you?

We have to add just ENUM_MAX_RANGE (that is power of 2 minus 1).

The bigger problem that -Wassign-enum is not available in GCC

I don't think that alone is a problem. If a warning option is only available in clang and we decide it's a benefit (because it finds bugs - and indeed this one found a few in PHP), we should take advantage of it.

Finding problems is good, hiding them is not. This might help to find real problem in a couple of places, but adding type casts will hide problems forever.

@MaxKellermann
Copy link
Contributor Author

We have to add just ENUM_MAX_RANGE (that is power of 2 minus 1).

No, that doesn't work. Why would it?

Take this line:

stop &= ~ZEND_JIT_TRACE_HALT;

clang complains about this line, because ~ZEND_JIT_TRACE_HALT is not in the enum's range.

ZEND_JIT_TRACE_HALT is defined here:

ZEND_JIT_TRACE_HALT = 0x40

Therefore, ~ZEND_JIT_TRACE_HALT == ~0x40 == 0xffffffbf == -65. This value is not found anywhere in enum _zend_jit_trace_stop.

Let's say you add some ENUM_MAX_RANGE with value "power of 2 minus 1" - which value exactly is that? I don't know, but certainly it's not 0xffffffbf, therefore the clang warning will persist.

@MaxKellermann
Copy link
Contributor Author

adding type casts will hide problems forever

Which problems will it hide?
The explicit cast tells clang "trust me, this value is fine", and we add those casts where we decide it's fine.

C allows using bitmasks in enums

That is correct, and I had already acknowledged that (prior to researching it, I wasn't sure).

The code was correct already before I added the casts, and this clang warning is indeed a false positive. It is arguable whether adding workarounds for false positives is a good idea. The other type of workaround (using the underlying type as bit field) is worse IMO because it removes information from the source code, making it less readable and more obscure. Anyway, there is no right and now wrong about that decision; it is personal taste.

Not my idea - I created this PR only because @iluuu1994 asked to enable this warning.

@dstogov
Copy link
Member

dstogov commented Feb 22, 2023

Let's say you add some ENUM_MAX_RANGE with value "power of 2 minus 1" - which value exactly is that? I don't know, but certainly it's not 0xffffffbf, therefore the clang warning will persist.

If ZEND_JIT_TRACE_HALT = 0x40 is the higher possible bit, the ENUM_MAX_RANGE should be 0x7f. This should eliminate warning without type casting, of course, if -Wassign-enum in CLANG follows C semantic.

@MaxKellermann
Copy link
Contributor Author

If ZEND_JIT_TRACE_HALT = 0x40 is the higher possible bit, the ENUM_MAX_RANGE should be 0x7f. This should eliminate warning without type casting, of course, if -Wassign-enum in CLANG follows C semantic.

I don't understand. How can adding an enumerator with value 0x7f suddenly allow 0xffffffbf? Adding 0x7f would allow 0x7f and nothing else. ("Allow" means convince clang to not emit a warning.)

This isn't about C semantics. As I already said, the code is correct. Warnings are not about incorrect code, but about suspicious code that may be buggy.

@dstogov
Copy link
Member

dstogov commented Feb 22, 2023

stop &= ~ZEND_JIT_TRACE_HALT; resets a single bit. If stop was in range [0..0x7f] it won't go out of range.

@iluuu1994
Copy link
Member

iluuu1994 commented Feb 22, 2023

We have to add just ENUM_MAX_RANGE (that is power of 2 minus 1).

I think what you meant was "multiplied by 2", or "shift left by 1", and minus 1.

This seems to behave accordingly.

https://godbolt.org/z/qvxsxP4oP

Oops, I messed up the third value. This doesn't actually work. We'd need a case for each possible combination, which we obviously can't once we have more than 2-3 flags.

@MaxKellermann
Copy link
Contributor Author

stop &= ~ZEND_JIT_TRACE_HALT; resets a single bit. If stop was in range [0..0x7f] it won't go out of range.

That's not how this clang warning is implemented. I repeat: clang sees the value 0xffffffbf and figures it's out of range because no such enumerator exists.

@dstogov
Copy link
Member

dstogov commented Feb 27, 2023

stop &= ~ZEND_JIT_TRACE_HALT; resets a single bit. If stop was in range [0..0x7f] it won't go out of range.

That's not how this clang warning is implemented. I repeat: clang sees the value 0xffffffbf and figures it's out of range because no such enumerator exists.

Then, this is a clang bug (or feature), because we don't assign 0xffffffbf to enum.
I don't like this, because of type casts (as I told, they may just hide missed or new problems) and make code less readable.
Probably, a better approach could be introducing macros (or inline fubctions) with "good" names.

@iluuu1994
Copy link
Member

I tried something here: 9b76379

This adds a macro to create bit enums. Basically, it just declares the enum, calculates the mask by |ing all values together, and declares a function that asserts the value is correct and then casts it accordingly.

@dstogov Are you happy with this approach? This way we have compile time safety for non-bit enums and at least runtime safety (in debug mode) for bit-enums.

@MaxKellermann
Copy link
Contributor Author

MaxKellermann commented Feb 27, 2023

Are you happy with this approach?

You didn't ask me, but I'll drop my opinion anyway: I don't like it, because I hate the preprocessor, and this adds just more weird macros.

C has a native way to declare bit fields. For zend_ffi_flags, it could look like this:

struct zend_ffi_flags {
  bool is_const:1;
  bool is_owned:1;
  bool is_persistent:1;
};

This isn't obscure and you can use it without fiddling bits and masks. No hard-coded shifted values like (1 << 2). This:

  cdata->flags = (zend_ffi_flags)(cdata->flags & ~(ZEND_FFI_FLAG_OWNED|ZEND_FFI_FLAG_PERSISTENT));

becomes:

  cdata->flags.is_owned = false;
  cdata->flags.is_persistent = false;

Initialization is also easy; this:

  cdata->flags = ZEND_FFI_FLAG_OWNED;

becomes:

  cdata->flags = { .is_owned = true };

Really, can it be any easier?

To trade some performance for memory overhead, omit the :1.
Nice side effects: you can have bit fields with more than 32 members, and it can be packed at byte borders, and can be smaller than 4 bytes.

This is completely type safe; bad operations lead to compiler errors (not warnings, not optional).

There is no overhead for passing those structs around if they fit in a register; it even results in identical machine code! Let me godbolt this for you: https://godbolt.org/z/a6ncYT1n1

(If this convinces you, I'd be happy to submit a PR.)

@iluuu1994
Copy link
Member

I tried to keep it as close to the original code to avoid breaking code. Granted, that's not relevant for the .c file I picked. If we can adjust the declaration using bitfields does seem preferable. There are some enums I don't think we can change, like php_stream_xport_crypt_method_t. It's also exposed in userland.

Whether a new concept like bit enums are worth it for these few cases, I'm not sure.

@MaxKellermann
Copy link
Contributor Author

There are some enums I don't think we can change, like php_stream_xport_crypt_method_t.

But that's not a bit field, is it? I don't see any bit-wise operations on it.

@MaxKellermann MaxKellermann deleted the clang_wassign_enum branch March 6, 2023 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants