-
Notifications
You must be signed in to change notification settings - Fork 7.8k
[RFC] Define proper semantics for range() function #10826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
32b04c4
to
aab3eff
Compare
aab3eff
to
c55fa01
Compare
ext/standard/tests/array/range/range_inputs_float_NAN_values.phpt
Outdated
Show resolved
Hide resolved
if (start_type >= IS_STRING || end_type >= IS_STRING) { | ||
/* If one of the inputs is NOT a string */ | ||
if (UNEXPECTED(start_type + end_type < 2*IS_STRING)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably fine for possible values (long, array, string), but can we make this explicit, like ((1 << start_type) | (1 << end_type)) != MAY_BE_STRING
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but I don't think this would handle the case range('1', '9');
as both would be marked as IS_ARRAY
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why I marked three lines instead of two, I was mainly referring to the if in 2842.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way, this still causes an issue if both inputs are string digits as start_type == IS_ARRAY
and end_type == IS_ARRAY
means that ((1 << start_type) | (1 << end_type))
will never be equal to MAY_BE_STRING
.
I thought about using the MAY_BE_*
constants and do some bit mask checks but found it simpler to just use a new value that is greater than IS_STRING
. I can however tweak it if necessary.
@@ -0,0 +1,97 @@ | |||
--TEST-- | |||
Test range() function digits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test range() function digits | |
Test range() function with string digits |
I would also add test cases with string numbers bigger than '9', i.e. '12'. That should be also clear what to expect from such parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such a test exists in range_inputs_numeric_basic.phpt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've split them up into more files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion. The description of the RFC is correct, however I meant test cases like range("12","A"). In your implementation it returns:
Warning: range(): Argument #1 ($start) must be a string if argument #2 ($end) is a string, argument #2 ($end) converted to 0 in Command line code on line 1
array(13) {
[0]=>
int(12)
[1]=>
int(11)
[2]=>
int(10)
[3]=>
int(9)
[4]=>
int(8)
[5]=>
int(7)
[6]=>
int(6)
[7]=>
int(5)
[8]=>
int(4)
[9]=>
int(3)
[10]=>
int(2)
[11]=>
int(1)
[12]=>
int(0)
}
The logic behind is arguable. I'm not sure exactly how it should behave for such case, but for sure the description is kind misleading. Because both of the arguments are string already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
range()
can only work with single byte strings. Thus the behaviour of trying to interpret the numeric string as an int is the correct one. I will amend the error message to precise "single byte strings"
This is a BC break but currently passing arrays, objects, or resources doesn't throw as it should.
Add way more cases with strings to showcase the current madness.
Have a clear error message for this case, and eliminate a bunch of useless conditions as this is now pre-checked
This changes the values of e.g. range(1, 5, 2.0); from being floats to ints.
The start and end value are not necessarily the issue, it only is when the step parameter does to small increments
Analysis of OSS packages shows this never happens
This reverts commit 75c76d6.
331f11f
to
b9c7382
Compare
RFC: To be written
range()
currently has outright insane behaviour. It performs no type checking whatsoever, does not handle INF or NAN values properly, will type juggle strings to integers in situations which are unneeded or when it doesn't make any sense. And all of this fails silently.This PR amends the behaviour of
range()
to TypeError or ValueError on edge cases (like NAN values) that make no sense whatsoever, and are unlikely to be used in the wild. In other weird cases anE_WARNING
is now emitted with the intended purpose to be promoted to ValueError in PHP 9.This PR does not change the behaviour of generating character arrays, which perform ASCII code point increments, therefore the peculiar behaviour of
range('A', 'z');
remains unchanged with no warnings.