Tackling Undefined Behaviour Casts

Currently the result of certain floating-point casts is undefined (as in, it can cause Undefined Behaviour):

This is an annoying wart on Rust’s current implementation, and we should fix it. Note that, at least on x86_64 Linux, the example f64 as f32 cast just produces inf (which is pretty reasonable IMHO), while the f32 to u8 example seems to produce completely random results (not sure if actual undefs are being made, but that seems believable).

I’m happy with these “nonsense” casts having unspecified behaviour so that we can e.g. inherit whatever the platform decides to do, as long as it doesn’t violate memory safety like the current design can. A solution that doesn’t add overhead seems ideal to me. Having to specify that e.g. 1000.0 as u8 == u8::MAX may be too cumbersome. Note, however, that inheriting platform behaviour interacts in complicated ways with cross-compilation and const-evaluation.

I lack the requisite familiarity with LLVM to know what the best way forward is, though. I’d also be interested to hear if there are use cases for these casts having specified behaviour.


This makes undefs.

Just to be clear, are you referring to the following?

Yes

My understanding of the Rust literature is that creating undefs is undefined behavior (it’s in the list), but that only some of the LLVM uses of undef actually lead to undefined behavior (e.g. floating point division by an undef). Is it possible to articulate which instances of undef actually lead to UB, for example “just the ones that lead LLVM to UB”, or is it more complicated than that?

My first thought on this is to make such casts panic, as is already the case with overflowing arithmetic. I’m not a language designer, so I’m not sure I have a voice here, but as a language user it seems like reasonable and consistent behavior.

I agree that it would make sense for them to panic in debug builds, but it is still necessary to figure out what should happen for builds without overflow checks.

Undefs, huh? Undefs are fun. They tend to propagate. After a few minutes of wrangling…

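#[inline(never)]
pub fn f(ary: &[u8; 5]) -> &[u8] {
    // 1e100 does not fit in a usize, so the cast result is an undef.
    let idx = 1e100f64 as usize;
    // With an undef index, the optimizer may drop the bounds check here.
    &ary[idx..]
}

fn main() {
    // Indexes far outside the 5-element array.
    println!("{}", f(&[1; 5])[0xdeadbeef]);
}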

segfaults on my system (latest nightly) with -O.


You can access platform-specific behavior through LLVM intrinsics; on x86, for example, you can use @llvm.x86.sse.cvttss2si and friends. A bit annoying, but workable.
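For illustration, here is a rough sketch of what reaching for the x86 instruction directly might look like from Rust. It assumes that std::arch's _mm_cvttss_si32 is the Rust-side counterpart of the cvttss2si intrinsic named above; the wrapper name cvttss2si and the Option fallback on other targets are made up for this example. On x86_64, an out-of-range input should come back as the hardware's "integer indefinite" value (i32::MIN) rather than an undef.

```rust
// Hypothetical wrapper (not a std API): converts with the x86 CVTTSS2SI
// instruction via std::arch, truncating toward zero.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)]
fn cvttss2si(x: f32) -> Option<i32> {
    use std::arch::x86_64::{_mm_cvttss_si32, _mm_set_ss};
    // SSE is part of the x86_64 baseline, so the instruction is always available.
    Some(unsafe { _mm_cvttss_si32(_mm_set_ss(x)) })
}

// Assumption for this sketch: other targets simply report "not available".
#[cfg(not(target_arch = "x86_64"))]
fn cvttss2si(_x: f32) -> Option<i32> {
    None
}

fn main() {
    // In-range value: truncated toward zero.
    println!("{:?}", cvttss2si(1.9));
    // Out-of-range value: whatever the platform instruction produces.
    println!("{:?}", cvttss2si(1e10));
}
```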

There are essentially three behaviors Rust can provide: saturate, fail (either Option or panic), and platform-specific. No matter what as does, it’s probably a good idea to make all of these available as standard library functions. I would guess the right default for as is to panic in debug builds, and use platform-specific behavior in release builds. This parallels integer overflow: the performance cost of checking the conversion by default is probably too high.
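To make the first two options concrete, here is a minimal sketch of what saturating and failing f32 to u8 conversions could look like as ordinary functions. The names saturating_f32_to_u8 and checked_f32_to_u8 are hypothetical, and mapping NaN to 0 in the saturating version is an assumption, not something settled in this thread.

```rust
/// Hypothetical saturating cast: out-of-range values clamp to the nearest
/// bound. NaN maps to 0, which is an assumption made for this sketch.
fn saturating_f32_to_u8(x: f32) -> u8 {
    if x.is_nan() || x <= 0.0 {
        0
    } else if x >= 255.0 {
        u8::MAX
    } else {
        // Strictly between 0 and 255 here, so the cast is well defined
        // and truncates toward zero.
        x as u8
    }
}

/// Hypothetical checked cast: returns None when the truncated value does
/// not fit in a u8. NaN fails both comparisons, so it also yields None.
fn checked_f32_to_u8(x: f32) -> Option<u8> {
    if x > -1.0 && x < 256.0 {
        Some(x as u8)
    } else {
        None
    }
}

fn main() {
    println!("{}", saturating_f32_to_u8(1000.0));
    println!("{:?}", checked_f32_to_u8(1000.0));
    println!("{:?}", checked_f32_to_u8(42.7));
}
```

A panicking variant would just be the checked version followed by an unwrap or expect, so both "fail" flavours can share one range check.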


We’ve previously established that as is an unchecked op regardless of build mode (1000u32 as u8 just truncates), so doing anything special in debug builds is almost certainly not going to happen. We do however have plans for “checked cast” variants somewhere in the std lib.
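As a small illustration of the difference for integer casts, the sketch below puts the truncating as next to a checked conversion built on the standard TryFrom trait. Whether TryFrom matches the “checked cast” plans mentioned above is an assumption on my part, and the helper names are made up for the example.

```rust
use std::convert::TryFrom;

/// Plain `as`: unchecked, keeps only the low 8 bits.
fn truncating_u32_to_u8(x: u32) -> u8 {
    x as u8
}

/// Hypothetical checked variant built on TryFrom: None if the value
/// does not fit in a u8.
fn checked_u32_to_u8(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}

fn main() {
    // 1000 = 3 * 256 + 232, so truncation leaves 232.
    println!("{}", truncating_u32_to_u8(1000));
    println!("{:?}", checked_u32_to_u8(1000));
    println!("{:?}", checked_u32_to_u8(200));
}
```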
