Fix GH-12143: Optimize round #12268

SakiTakamachi · 2023-09-22T09:20:40Z

(edit)

As the policy regarding edge case determination for round() has been finalized in the following RFC, the contents of this pull request have been adopted.
https://wiki.php.net/rfc/change_the_edge_case_of_round

This PR completely fixes #12143.

I am very sorry that I ended up changing about 40% of the changes that @TimWolla made, but this was the only way to solve the round() problem.

I completely removed "pre-round" and improved value comparison in php_round_helper.
For example, numbers such as 0.285 and 1.235 may not be treated as edge cases even if we use modf().

This is because they are treated internally by PHP as 0.28499999999999998 and 1.2350000000000001, respectively.
This does not mean that 0.28499999999999998 is correct and 0.285 is incorrect. Both are 3fd23d70a3d70a3d in IEEE754, only the decimal representation is different.

Therefore, the only way to truly determine the edge case is to generate an edge case value and compare it to the passed value.

Regards.

Click here for more information about round.

https://wiki.php.net/rfc/rounding

Main points

The discussion is scattered all over the place and difficult to understand, so I will summarize the main points.

There were two problems with round:

1. Values such as round(0.49999999999999994, 0), where "adding 0.5 will carry up due to error", are incorrectly rounded.
2. "pre-round" incorrectly rounds values like round(1.70000000000145, 13).

Problem 1 was already fixed, and I was trying to fix problem 2.

Learn more about problem 2

"pre-round" performs the following calculations.

round(1.70000000000145, 13)

1.70000000000145
17000000000014.5 // digit adjustment
17000000000015 // pre-round
1700000000001.5 // digit adjustment
1700000000002 // round
1.700000000002 // return

Wrongly rounded due to "pre-round".

Why is "pre-round" necessary?

The need for "pre-round" is explained in the sections "Analysis of the problems of the previous round() implementation" and "Pre-rounding to the value's precision if possible" of the following article.

https://wiki.php.net/rfc/rounding

Simply put, "pre-round" was needed to solve the kind of problem where the result of round(0.285, 2) is 0.28.

About my changes

Removed "pre-round" and used helpers to accurately determine edge cases.

This allows us to correctly determine the use cases covered by "pre-round", while eliminating errors caused by "pre-round".

TimWolla · 2023-09-22T15:43:59Z

> This is because they are treated internally by PHP as 0.28499999999999998 and 1.2350000000000001, respectively.
> This does not mean that 0.28499999999999998 is correct and 0.285 is incorrect. Both are 3fd23d70a3d70a3d in IEEE754, only the decimal representation is different.

I would disagree with that. 0.285 is not a valid IEEE-754 double precision floating point number and is transformed into 0.28499999999999998 when parsing the code. Thus the expectation that it would be rounded to 0.29 would be incorrect and not an issue with the rounding functionality. The rounding functionality should only concern itself with numbers that are representable as IEEE-754 double precision floating point numbers, because otherwise we have the problem that users who intentionally want to work with 0.28499999999999998 instead of 0.285 would see incorrect rounding.

You can't satisfy everyone and thus correctly rounding the internal representation seems to be the right choice.

SakiTakamachi · 2023-09-22T15:57:17Z

@TimWolla

I understand your opinion. I was also quite worried.

Looking at implementations in other languages, the behavior differs: Ruby gives 0.29 and ~~Python gives 0.28~~.
(edit: Python rounds to even numbers, so it was an inappropriate comparison target.)

The reason I chose 0.29 is that this is exactly the value used in the round() test.

https://github.com/php/php-src/blob/master/ext/standard/tests/math/round_prerounding.phpt

So this is a debate about whether we should change the clearly intended behavior of round().

I don't have any strong opinions, but this PR is the result of fixing bugs while maintaining the current php specifications.

SakiTakamachi · 2023-09-22T16:13:50Z

I'm concerned about complicating the functionality, but I have an idea to create a new third argument to select the mode for this problem we're talking about.

Or bitmask.

SakiTakamachi · 2023-09-22T17:17:21Z

I don't think of floating point numbers as "things that have the only correct value," but rather as "values with a range," which include some degree of fluctuation due to errors.

Therefore, I thought that if the value in IEEE754 is the same, it should be treated as the same value even if the decimal representation is different.

If we follow this idea, the range of values 3fd23d70a3d70a3d includes 0.285, so this is treated as an edge case.

This is my personal opinion, and I have no intention of forcing it without discussion.

TimWolla · 2023-09-23T14:37:35Z

> (edit: Python rounds to even numbers, so it was an inappropriate comparison target.)

FWIW, the Python documentation notes:

> Note: The behavior of round() for floats can be surprising: for example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This is not a bug: it’s a result of the fact that most decimal fractions can’t be represented exactly as a float. See Floating Point Arithmetic: Issues and Limitations for more information.

2.675 is 2.6749999999999998 in reality, so that explains why Python rounds this to 2.67. Interestingly PHP scales this by multiplying with 100, resulting in 267.5. I wonder what Python does there internally to not lose precision when rounding to a given number of places.

SakiTakamachi · 2023-09-23T14:59:36Z

I see.

I understand the philosophy of Python.

SakiTakamachi · 2023-09-23T15:03:15Z

@Girgias

I would like to ask for a review, please.

Girgias · 2023-09-23T15:03:51Z

> > (edit: Python rounds to even numbers, so it was an inappropriate comparison target.)
>
> FWIW, the Python documentation notes:
>
> > Note: The behavior of round() for floats can be surprising: for example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This is not a bug: it’s a result of the fact that most decimal fractions can’t be represented exactly as a float. See Floating Point Arithmetic: Issues and Limitations for more information.
>
> 2.675 is 2.6749999999999998 in reality, so that explains why Python rounds this to 2.67. Interestingly PHP scales this by multiplying with 100, resulting in 267.5. I wonder what Python does there internally to not lose precision when rounding to a given number of places.

The Python implementation seems to be located around: https://github.com/python/cpython/blob/3.12/Python/bltinmodule.c#L2357

SakiTakamachi · 2023-09-23T16:27:06Z

To make the purpose of this PR easier to understand, I have added key points to the PR description.

TimWolla · 2023-09-23T16:43:42Z

> Simply put, "pre-round" was needed to solve the kind of problem where the result of "round(0.285, 2)" is "0.28".

But that entire premise is flawed. As per:

#include <stdio.h>
#include <math.h>

int
main(void) {
	printf("%.17g\n", 0.28499999999999994);
	printf("%.17g\n", 0.28499999999999995);
	printf("%.17g\n", 0.28499999999999996);
	printf("%.17g\n", 0.28499999999999997);
	printf("%.17g\n", 0.28499999999999998);
	printf("%.17g\n", 0.28499999999999999);
	printf("%.17g\n", 0.28500000000000000);
	printf("%.17g\n", 0.28500000000000001);
}

All numbers from 0.28499999999999995 to 0.28500000000000000 have the same internal representation. Most of them are smaller than 0.285, thus treating them as equal to 0.285 doesn't really make sense. Instead it should be treated as the value in the middle to minimize the error and indeed the %.17g representation is 0.28499999999999998 which is in the middle.

Rounding 0.285 to anything other than 0.28 (or whatever the nearest representation is) would IMO be incorrect.

SakiTakamachi · 2023-09-23T16:51:42Z

@TimWolla

Thank you for your detailed explanation. It's very easy to understand.

I would like to hear other people's opinions on this matter.

This is because there is no clearly defined correct answer to this problem, and this is equivalent to the act of "determining language specifications".

If the "premise" is correct, I think my changes completely solve the problem.

In other words, the remaining debate is whether the premise is correct.

However, I feel that an RFC is necessary in order to revoke something determined by an RFC.

SakiTakamachi · 2023-09-23T16:59:52Z

By the way, the referenced article touches on this issue as follows:

> Of course, one may argue that pre-rounding is not necessary and that this is simply the problem with FP arithmetics. This is true on the one hand, but the introduction of the places parameter made it clear that round() is to operate as if the numbers were stored as decimals. We can't revert that and this seems to me to be the best solutions for FP numbers one can get.

Girgias · 2023-09-23T19:07:03Z

I've spent way too much time looking into rounding and how it works and trying to come up with some straightforward solutions.

But I agree with Tim here, FP numbers are FP numbers, and I don't understand why round() tries to act as if they are rational numbers. They are not. Also, the linked document was written in 2008, FP controls are a C99 standard (although compilers still seem to do whatever they want) which is what php-src now uses, so maybe that's something we should consider again.

This whole code is kinda bonkers, I don't know if there is a reasonable way to extract the fractional part as a 64bit integer. But if yes, this might make the most sense to do and work with integers.

SakiTakamachi · 2023-09-24T01:17:20Z

Thank you.

If we were to take Tim's suggestion, I would be reverting the helper changes and fixing some test expectations.

> This whole code is kinda bonkers, I don't know if there is a reasonable way to extract the fractional part as a 64bit integer. But if yes, this might make the most sense to do and work with integers.

Will it be a problem in a 32-bit environment?

(edit) Oops, there is no problem if we use int64_t.

(edit2)

If we convert it to an integer type and then process it, it may not work as Tim suggested.

var_dump((int) (0.285 * 1000));
// 285

If you're talking about my changes, you might be right.

SakiTakamachi · 2023-09-24T02:25:19Z

ext/standard/math.c

+		tmp_value = value / f1;
+	}
+	/* This value is beyond our precision, so rounding it is pointless */
+	if (fabs(tmp_value) >= 1e15) {


By the way, to increase the number of digits that can be processed, you can simply change this to 1e16.

SakiTakamachi · 2023-09-24T02:55:08Z

I just followed the existing specifications, so I don't have any strong claims. Since the opinions of @TimWolla and @Girgias are in agreement, I have no objection to changing the specifications.

Is it okay if I continue to make changes to the specifications without preparing an RFC?

SakiTakamachi · 2023-09-24T13:03:01Z

Make this a draft and delete it once #12291 is merged.

derickr · 2023-09-25T09:05:42Z

> But I agree with Tim here, FP numbers are FP numbers, and I don't understand why round() tries to act as if they are rational numbers.

Because that is what people that use the language expect.

If they write round(0.285, 2), and they get 0.28, they very much could consider that to be a bug.

Girgias · 2023-09-26T02:37:25Z

> > But I agree with Tim here, FP numbers are FP numbers, and I don't understand why round() tries to act as if they are rational numbers.
>
> Because that is what people that use the language expect.
>
> If they write round(0.285, 2), and they get 0.28, they very much could consider that to be a bug.

How is this different to a user expecting 0.1 + 0.2 === 0.3 to be true, when in reality it is false? As users could also consider this a bug.

Also, how is $f2 = 0.284999999999999698019; echo round($f2, 2), "\n"; rounding to 0.29 not a bug? This is several floating point numbers below the representation chosen for 0.285, and it is exactly representable as a floating point number.

If we want to provide accurate ~~decimal~~rational numbers for users, then let's actually add such a type.
But lying and fudging floating point numbers in ways that are not documented to the point that I now need to question if implementing numerical algorithms will actually give me correct results in PHP is a massive issue.

SakiTakamachi · 2023-09-26T03:15:03Z

> How is this different to a user expecting 0.1 + 0.2 === 0.3 to be true, when in reality it is false? As users could also consider this a bug.

Admittedly, I was also concerned about that.

This is similar to floor(0.99999999999999995) being 1 (although the direction is opposite).

It seems like the argument is whether or not to consider that the value 0.285 does not exist in FP.

Girgias · 2023-09-26T03:34:00Z

> > How is this different to a user expecting 0.1 + 0.2 === 0.3 to be true, when in reality it is false? As users could also consider this a bug.
>
> Admittedly, I was also concerned about that.
>
> This is similar to floor(0.99999999999999995) being 1 (although the direction is opposite).
>
> It seems like the argument is whether or not to consider that the value 0.285 does not exist in FP.

There has been since forever a massive warning about FP numbers in the docs: https://www.php.net/manual/en/language.types.float.php

This is a known issue with floating point numbers. And I don't see why round() should make an exception out of this.

If people need arbitrary precision, then they should use the BCMath extension.

SakiTakamachi · 2023-09-26T03:56:47Z

If BC Math had something like bcround() for example, it might be a good solution to satisfy the user's request to round 0.285 to 0.29.

Admittedly, in the current situation we are asking round() to play the role that BC Math should have, and it is undeniable that this is a little distorted.

SakiTakamachi · 2024-01-30T05:35:01Z

<?php
$start = microtime(true);
$n = 0.28499999999999995;
for ($i = 0; $i < 100000; $i++) {
    round(0.28499999999999995, 10, PHP_ROUND_TOWARD_ZERO);
    $n += 0.00000001;
}
var_dump(microtime(true) - $start);

before (4 times):

float(0.0359339714050293)
float(0.03636503219604492)
float(0.03540921211242676)
float(0.03581380844116211)

after (4 times):

float(0.030965805053710938)
float(0.02510809898376465)
float(0.030919790267944336)
float(0.032182931900024414)

SakiTakamachi · 2024-01-30T13:13:56Z

Travis is slow... Only Travis has not passed the test yet, so I would like to wait for it to turn green if possible.

SakiTakamachi · 2024-01-30T15:18:35Z

When I consider things like mainframes, there's a lot more to think about...

(edit)

I may be able to do it with fesetround(int round) of <fenv.h> etc.

SakiTakamachi · 2024-01-31T00:44:38Z

Oh, it worked!

SakiTakamachi · 2024-01-31T11:18:59Z

I fixed a few things that bothered me, and all the changes were completed.

Girgias · 2024-01-31T13:52:01Z

ext/standard/math.c

+#define PHP_ROUND_BASIC_EDGE_CASE() do {\
+		if (places > 0) {\
+			edge_case = fabs((integral + copysign(0.5, integral)) / exponent);\
+		} else {\
+			edge_case = fabs((integral + copysign(0.5, integral)) * exponent);\
+		}\
+	} while (0)
+#define PHP_ROUND_ZERO_EDGE_CASE() do {\
+		if (places > 0) {\
+			edge_case = fabs((integral) / exponent);\
+		} else {\
+			edge_case = fabs((integral) * exponent);\
+		}\
+	} while (0)


I'm not a fan of that sort of macro because it makes the code confusing.

integral and exponent should be arguments to the macro at minimum, and I'd prefer if edge case was returned.

Also, why not have these as inline functions (possibly marked with zend_always_inline) as this would be IMHO clearer.

SakiTakamachi · 2024-01-31

Thanks, that makes sense. I'll fix it.

SakiTakamachi · 2024-02-01

Fixed those!

Girgias

I think I'm happy now :)

Girgias · 2024-02-01T16:19:34Z

ext/standard/math.c


 /* {{{ php_round_helper
 	   Actually performs the rounding of a value to integer in a certain mode */
-static inline double php_round_helper(double integral, double value, double exponent, int places, int mode) {
+static inline double php_round_helper(double integral, const double value, const double exponent, const int places, const int mode) {


const for non pointer parameters doesn't do anything as far as I know.

SakiTakamachi · 2024-02-01

I don't even know why I decided to do this... I might have been half asleep...

SakiTakamachi · 2024-02-02

I fixed it!

SakiTakamachi · 2024-02-02T13:10:21Z

I'll wait a little longer and if it looks okay, I'll merge it.

bukka

Nice work

github-actions bot added the Extension: standard label Sep 22, 2023

SakiTakamachi marked this pull request as ready for review September 22, 2023 10:15

SakiTakamachi requested a review from bukka as a code owner September 22, 2023 10:15

This was referenced Sep 22, 2023

Fix GH-12143: Improved handling of adjusting result digits #12162

Closed

round(): Validate the rounding mode #12252

Merged

SakiTakamachi mentioned this pull request Sep 23, 2023

round: Bypass the precision logic when rounding to 0 places #12284

Open

SakiTakamachi commented Sep 24, 2023

SakiTakamachi mentioned this pull request Sep 24, 2023

FIX GH-12143: Remove pre-rounding #12291

Closed

SakiTakamachi marked this pull request as draft September 24, 2023 13:03

SakiTakamachi force-pushed the fix/gh-12143-optimize-round branch 4 times, most recently from 24e7083 to 82ea718 Compare January 30, 2024 13:11

SakiTakamachi force-pushed the fix/gh-12143-optimize-round branch 2 times, most recently from feeb2fb to 9a66d69 Compare January 30, 2024 13:33

SakiTakamachi force-pushed the fix/gh-12143-optimize-round branch 2 times, most recently from 21e7898 to 1de4149 Compare January 30, 2024 23:52

SakiTakamachi requested a review from bukka January 31, 2024 00:44

SakiTakamachi force-pushed the fix/gh-12143-optimize-round branch 3 times, most recently from edce2fd to ae63dc0 Compare January 31, 2024 09:40

Changes the CPU rounding mode only during a specific process.

b9d4f3c

SakiTakamachi force-pushed the fix/gh-12143-optimize-round branch from ae63dc0 to b9d4f3c Compare January 31, 2024 09:44

[skip ci] NEWS/UPGRADING

1a36254

Girgias reviewed Jan 31, 2024

use zend_always_inline

e7e4725

Girgias approved these changes Feb 1, 2024

Remove unnecessary "const"

0c64e30

bukka approved these changes Feb 2, 2024

SakiTakamachi closed this in 78970ef Feb 3, 2024

SakiTakamachi deleted the fix/gh-12143-optimize-round branch February 3, 2024 13:24

SakiTakamachi mentioned this pull request Jun 3, 2024

round 123456789012.1245 failed with precision 2 #14451

Closed
