-
Notifications
You must be signed in to change notification settings - Fork 7.8k
Fix GH-12143: Optimize round #12268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GH-12143: Optimize round #12268
Conversation
I would disagree with that. 0.285 is not a valid IEEE-754 double precision floating point number and transformed into You can't satisfy everyone and thus correctly rounding the internal representation seems to be the right choice. |
I understand your opinion. I was also quite worried. I look at implementations in other languages, the behavior is different, such as Ruby is The reason I chose So this is a debate about whether we should change the clearly intended behavior of I don't have any strong opinions, but this PR is the result of fixing bugs while maintaining the current php specifications. |
I'm concerned about complicating the functionality, but I have an idea to create a new third argument to select the mode for this problem we're talking about. Or bitmask. |
I don't think of floating point numbers as "things that have the only correct value," but rather as "values with a range," which include some degree of fluctuation due to errors. Therefore, I thought that if the value in IEEE754 is the same, it should be treated as the same value even if the decimal representation is different. If we follow this idea, the range of values This is my personal opinion, and I have no intention of forcing it without discussion. |
FWIW, the Python documentation notes:
|
I see. I understand the philosophy of Python. |
I would like to ask for a review, please. |
The Python implementation seems to be located around: https://github.com/python/cpython/blob/3.12/Python/bltinmodule.c#L2357 |
To make the purpose of this PR easier to understand, I have added key points to the PR description. |
But that entire premise is flawed. As per:
All numbers from 0.28499999999999995 to 0.28500000000000000 have the same internal representation. Most of them are smaller than 0.285, thus treating them as equal to 0.285 doesn't really make sense. Instead it should be treated as the value in the middle to minimize the error and indeed the Rounding |
Thank you for your detailed speculation. It's very easy to understand. I would like to hear other people's opinions on this matter. This is because there is no clearly defined correct answer to this problem, and this is equivalent to the act of "determining language specifications". If the "premise" are correct, I think my changes completely solve the problem. In other words, the remaining debate is whether the premise is correct. However, I feel that an RFC is necessary in order to revoke something determined by an RFC. |
By the way, the referenced article touches on this issue as follows:
|
I've spent way too much time looking into rounding and how it works and trying to come up with some straight forward solutions. But I agree with Tim here, FP numbers are FP numbers, and I don't understand why This whole code is kinda bonkers, I don't know if there is a reasonable way to extract the fractional part as a 64bit integer. But if yes, this might make the most sense to do and work with integers. |
Thank you. If we were to take Tim's suggestion, I would be reverting the helper changes and fixing some test expectations.
Will it be a problem in a 32-bit environment? (edit) Oops, there are no problem if we use int64_t. (edit2) If we convert it to an integer type and then process it, it may not work as Tim suggested.
If you're talking about my changes, you might be right. |
tmp_value = value / f1; | ||
} | ||
/* This value is beyond our precision, so rounding it is pointless */ | ||
if (fabs(tmp_value) >= 1e15) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, to increase the number of digits that can be processed, you can simply change this to 1e16.
Make this a draft and delete it once #12291 is merged. |
Because that is what people that use the language expect. If they write |
How is this different to a user expecting Also, how is If we want to provide accurate |
Admittedly, I was also concerned about that. This is similar to It seems like the argument is whether or not to consider that the value |
There has been since forever a massive warning about FP numbers in the docs: https://www.php.net/manual/en/language.types.float.php This is a known issue with floating point numbers. And I don't see why If people need arbitrary precision, then they should use the BCMath extension. |
If Admittedly, in the current situation, we are seeking |
before (4 times):
after (4 times):
|
24e7083
to
82ea718
Compare
Travis is slow... Only Travis has not passed the test yet, so I would like to wait for it to turn green if possible. |
feeb2fb
to
9a66d69
Compare
When I consider things like mainframes, there's a lot more to think about... (edit) I may be able to do it with |
21e7898
to
1de4149
Compare
Oh, it worked! |
edce2fd
to
ae63dc0
Compare
ae63dc0
to
b9d4f3c
Compare
I fixed a few things that bothered me, and all the changes were completed. |
ext/standard/math.c
Outdated
#define PHP_ROUND_BASIC_EDGE_CASE() do {\ | ||
if (places > 0) {\ | ||
edge_case = fabs((integral + copysign(0.5, integral)) / exponent);\ | ||
} else {\ | ||
edge_case = fabs((integral + copysign(0.5, integral)) * exponent);\ | ||
}\ | ||
} while (0) | ||
#define PHP_ROUND_ZERO_EDGE_CASE() do {\ | ||
if (places > 0) {\ | ||
edge_case = fabs((integral) / exponent);\ | ||
} else {\ | ||
edge_case = fabs((integral) * exponent);\ | ||
}\ | ||
} while (0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of those sort of macros because they make the code confusing.
integral
and exponent
should be arguments to the macro at minimum, and I'd prefer if edge case was returned.
Also, why not have these as inline functions (possibly marked with zend_always_inline
) as this would be IMHO clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that makes sense. I'll fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed those!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm happy now :)
ext/standard/math.c
Outdated
|
||
/* {{{ php_round_helper | ||
Actually performs the rounding of a value to integer in a certain mode */ | ||
static inline double php_round_helper(double integral, double value, double exponent, int places, int mode) { | ||
static inline double php_round_helper(double integral, const double value, const double exponent, const int places, const int mode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
for non pointer parameters doesn't do anything as far as I know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't even know why I decided to do this... I might have been half asleep...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it!
I'll wait a little longer and if it looks okay, I'll merge it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work
(edit)
As the policy regarding edge case determination for
round()
has been finalized in the next RFC, the contents of this pull request is adopted.https://wiki.php.net/rfc/change_the_edge_case_of_round
This PR completely fixes #12143.
I am very sorry that I ended up changing about 40% of the changes that @TimWolla made, but this was the only way to solve the
round()
problem.I completely removed "pre-round" and improved value comparison in
php_round_helper
.For example, numbers such as
0.285
and1.235
may not be treated as edge cases even if we usemodf()
.This is because they are treated internally by PHP as
0.28499999999999998
and1.2350000000000001
, respectively.This does not mean that
0.28499999999999998
is correct and0.285
is incorrect. Both are3fd23d70a3d70a3d
in IEEE754, only the decimal representation is different.Therefore, the only way to truly determine the edge case is to generate an edge case value and compare it to the passed value.
Regards.
Click here for more information about round.
https://wiki.php.net/rfc/rounding
Main points
The discussion is scattered all over the place and difficult to understand, so I will summarize the main points.
There were two problems with round
round(0.49999999999999994, 0)
, where "adding 0.5 will carry up due to error'' are incorrectly rounded.round(1.70000000000145, 13)
1 was already fixed and I was trying to fix 2.
Learn more about problem 2
"pre-round" performs the following calculations.
Wrongly rounded due to "pre-round".
Why is "pre-round" necessary?
The need for "pre-round" is summarized in the following article, "Analysis of the problems of the previous round() implementation" and "Pre-rounding to the value's precision if possible".
https://wiki.php.net/rfc/rounding
Simply put, "pre-round" was needed to solve the kind of problem where the result of
round(0.285, 2)
is0.28
.About my changes
Removed "pre-round" and used helpers to accurately determine edge cases.
This allows us to correctly determine the use cases covered by "pre-round", while eliminating errors caused by "pre-round".