Skip to content

Alternative fix for GH-10168 (alternative to GH-10500) #10524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

nielsdos
Copy link
Member

@nielsdos nielsdos commented Feb 6, 2023

Alternative to GH-10500 as suggested by @dstogov .
I don't know if I did it 100% correctly though...
According to Callgrind, the instructions count for Zend/bench.php is:

  • 3,341,844,801 with zend_copy_to_variable (this approach was my first attempt, see commit log, but this caused 2 test failures in readdir tests. I probably did something wrong here...).
  • If I swap out the zend_copy_to_variable with ZVAL_COPY, I get 3,340,811,227 though (done in last commit).
    This is slightly higher than my first attempt in Fix GH-10168: heap-buffer-overflow at zval_undefined_cv  #10500.

nielsdos and others added 2 commits February 6, 2023 22:00
The problem is that we're using the variable_ptr in the opcode handler
*after* it has already been destroyed. The solution is to create a
specialised version of zend_assign_to_variable which takes in two
destination zval pointers.
@nielsdos nielsdos changed the title Fix 10168 alternative Alternative fix for GH-10168 (alternative to GH-10500) Feb 6, 2023
Copy link
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an interesting discrepancy when using static properties as storage:

class Test
{
    static ?Test $a = null;

    public function __construct() {
	    var_dump(self::$a = $this);
    }

    function __destruct() {
        self::$a = null;
    }
}
new Test();
new Test();
object(Test)#1 (0) {
}
NULL

The second log looks wrong. This works fine with the existing tests.

Using ZVAL_COPY for the result looks correct. That was the previous behavior.

@iluuu1994 iluuu1994 requested a review from dstogov February 6, 2023 23:09
@dstogov
Copy link
Member

dstogov commented Feb 7, 2023

@nielsdos thanks for trying this. I played with variants of this and #10500 allis the morning.
After all, I think #10500 with minimal changes better.
I'll add my suggestions at #10500

@iluuu1994
Copy link
Member

@dstogov Just note that there's another issue that was discovered here (Zend/tests/gh10168_3.phpt) that should probably be ported over to the other PR.

@dstogov
Copy link
Member

dstogov commented Feb 7, 2023

@dstogov Just note that there's another issue that was discovered here (Zend/tests/gh10168_3.phpt) that should probably be ported over to the other PR.

Ugh. This changes everything :(
This case better fits into this PR.
Let me take another look by tomorrow morning.

Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Lets go with this approach.

@iluuu1994 iluuu1994 closed this in 71ddede Feb 8, 2023
@iluuu1994
Copy link
Member

Thank you @nielsdos!

@iluuu1994
Copy link
Member

iluuu1994 commented Feb 11, 2023

I looked into this again. This was actually never a use-after-free. Here's what happened:

https://3v4l.org/EbPmd

class Test
{
    static $instances;

    public function __construct() {
        var_dump(self::$instances[0] = $this);
    }

    function __destruct() {
        unset(self::$instances[0]);
    }
}

new Test();
new Test();
object(Test)#1 (0) {
}
UNKNOWN:0
  1. The first new Test() allocates a new instance
  2. Test is assigned to self::$instances[0]
  3. The second new Test() allocates a new instance
  4. The DIM 0 of the array is fetched, which points into the array
  5. The second Test is assigned to the offset 0
  6. The destructor of the first Test is run, unsets offset 0, now containing an UNDEF zval
  7. The DIM offset is used to set EX_VAR(opline->result.var), copying the UNDEF zval
  8. The next instruction reads the UNDEF, tries to throw an "undefined variable" error, but fails looking up the variable name (buffer overflow)

The same can happen for untyped properties.

https://3v4l.org/9l3PK

class Box {
    public $value;
}

$box = new Box();

class Test
{
    public function __construct() {
        global $box;
        var_dump($box->value = $this);
    }

    function __destruct() {
        global $box;
        unset($box->value);
    }
}

new Test();
new Test();
object(Test)#2 (0) {
}
UNKNOWN:0

The reason I'm documenting this is that for all existing cases, the value is copied into result after the destructor has run. IMO this behavior is pretty unexpected and can lead VARs containing UNDEF values, as in this case. We can fix this by running the destructor after assigning result (this is what #10546 attempts to do) but this can't be done universally (i.e. for write_property) without breaking backwards compatibility. It's also technically a behavioral change.

The alternative approach would be checking for UNDEF values before copying them into result in the VM handlers, and then assigning something else like NULL. The behavior is still dubious IMO but might require less changes. I can try this approach if it is preferred. In that case we should probably revert this PR. As Niels mentioned below, a use-after-free could still happen if the array moves. So this approach is not viable.

Zend/tests/gh10168_3.phpt on the other hand is a true use-after-free. The reason for that is that zend_assign_to_variable returns the DEREFed zval, which might go away if the destructor releases it. This can only be reasonably solved by copying to the result before running the destructor, which is what this PR did.

@dstogov Let me know what you think
/cc @nielsdos

@nielsdos
Copy link
Member Author

@iluuu1994
I'm a bit confused. I believe there actually is a use after free possible in the example and flow you posted (although the test cases did not test for the example I'm about to give) (and it's for the old code because this PR should've fixed that). If the destructor in step 6 causes the array to grow so much that it needs to reallocate memory, then the fetched zval in step 4 can refer to old memory. Then at step 7 that old memory is used.
But I might be wrong.

@iluuu1994
Copy link
Member

@nielsdos Oh. You're right of course. A use-after-free is still possible when the array is reallocated. So I guess the only solution is to postpone the destructor after all...

@nielsdos
Copy link
Member Author

@iluuu1994 I'm just having some random thoughts now, but I might share them as well because it doesn't hurt to throw some ideas I gues... One possible implementation idea might be to add a (intrusive) linked list field to EX(...) where we add the Z_COUNTEDs of the objects-to-be-destructed to. Then the VM could at the end of each opcode handling check if there is something in the linked list, and if there is loop over it to execute the destructors. This could move all the special handling code until after the opcode. But ig this is both a BC break because the destructors are postponed, and an ABI break.

@iluuu1994
Copy link
Member

iluuu1994 commented Feb 11, 2023

@nielsdos The use-after-free happens on the zval slot, not on the object directly (bucket in the array, in this case). Thus I don't think that will work.

Oh, you meant for delaying the constructor, not just freeing the object. Hmm, let me think about that.

@iluuu1994
Copy link
Member

So, I don't think checking for dead objects after every opcode handler is a good idea (pretty sure Dmitry doesn't think so either). We could probably limit it to certain handlers, but determining which ones might not be so easy. Missing one would not be catastrophic (the object would just be cleaned later) but would make the point in which the destructor is called somewhat unstable. We could try this approach to see how it performs and whether it simplifies or complicates the code. IMO while it would be nice for this issue to go away, the cases are quite unlikely to occur in real-life code.

@dstogov
Copy link
Member

dstogov commented Feb 13, 2023

In PHP-5.0 we had a buffer (array) of "garbage" zvals and cleaned them after each VM instruction through zend_clean_garbage(). Of course, this made huge performance penalty.

@dstogov
Copy link
Member

dstogov commented Feb 14, 2023

Looking into this and the following #10546 proposal, I think this approach is wrong.
The patch becomes to complicated, it spreads logic among the code, duplicates code...

I propose to revert this and start developing the fix again. I think it should be targeted to master only.

I think, I found a simpler approach. It's actually based on @nielsdos original idea.
You may see the PoC patch here https://gist.github.com/dstogov/35d3aa1fb5d495805088c2983dbfa55c
It seems, this fixes all the *.phpt tests from #10546 proposal

I'm limited in time now, @iluuu1994 can you please verify my patch and think if it can/should be improved.

@iluuu1994
Copy link
Member

iluuu1994 commented Feb 14, 2023

@dstogov Thank you for looking into this! From a high level it seems like this approach should also solve the issues #10546 does. I will have a detailed look tomorrow and check all code paths again.

I think the following line is wrong:

- if (ZEND_VM_SPEC || UNEXPECTED(RETURN_VALUE_USED(opline))) {
+ if (UNEXPECTED(RETURN_VALUE_USED(opline))) {

The copy should not be done for all specializations unconditionally, but just the ones actually used when the return value is used.

Edit: Although that condition is guarded by if (!ZEND_VM_SPEC || UNEXPECTED(RETURN_VALUE_USED(opline))) { so it wouldn't be reached in that case anyway.

@dstogov
Copy link
Member

dstogov commented Feb 15, 2023

I use ZEND_VM_SPEC to avoid code duplication in unspecialized executor. (I mean zend_assign_to_variable() and zend_assign_to_variable_ex() code).

The code:

	if (!ZEND_VM_SPEC || UNEXPECTED(RETURN_VALUE_USED(opline))) {
		zend_refcounted *garbage = NULL;
		
		value = zend_assign_to_variable_ex(variable_ptr, value, OP2_TYPE, EX_USES_STRICT_TYPES(), &garbage);
		if (ZEND_VM_SPEC || UNEXPECTED(RETURN_VALUE_USED(opline))) {
			ZVAL_COPY(EX_VAR(opline->result.var), value);
		}

if ZEND_VM_SPEC == 1 it's translated as

	if (UNEXPECTED(RETURN_VALUE_USED(opline))) {
		zend_refcounted *garbage = NULL;
		
		value = zend_assign_to_variable_ex(variable_ptr, value, OP2_TYPE, EX_USES_STRICT_TYPES(), &garbage);
		if (1) {
			ZVAL_COPY(EX_VAR(opline->result.var), value);
		}

if ZEND_VM_SPEC == 0 it's translated as

	if (1) {
		zend_refcounted *garbage = NULL;
		
		value = zend_assign_to_variable_ex(variable_ptr, value, OP2_TYPE, EX_USES_STRICT_TYPES(), &garbage);
		if (UNEXPECTED(RETURN_VALUE_USED(opline))) {
			ZVAL_COPY(EX_VAR(opline->result.var), value);
		}

Seems right, but please verify this.

@iluuu1994
Copy link
Member

@dstogov

I think my assumption was correct. This is the generated code for specialized handlers:

if (0 || UNEXPECTED(0)) {
	zend_refcounted *garbage = NULL;

	value = zend_assign_to_variable_ex(variable_ptr, value, IS_CONST, EX_USES_STRICT_TYPES(), &garbage);
	if (1 || UNEXPECTED(0)) {
		ZVAL_COPY(EX_VAR(opline->result.var), value);
	}

In this case, the second if will always run, or will always be eliminated (because the outer if never evaluates to true).

This is the unspecialized handler:

if (1 || UNEXPECTED(RETURN_VALUE_USED(opline))) {
	zend_refcounted *garbage = NULL;

	value = zend_assign_to_variable_ex(variable_ptr, value, opline->op2_type, EX_USES_STRICT_TYPES(), &garbage);
	if (0 || UNEXPECTED(RETURN_VALUE_USED(opline))) {
		ZVAL_COPY(EX_VAR(opline->result.var), value);
	}

In this case, the 0 || is meaningless. The upper 1 makes sense.

This is a minor point, I thought it was a little confusing which is why I mentioned it but it doesn't do any harm.

Do you think given that this targets master we should also fix the write_property case (Zend/tests/gh10168_6.phpt)? Using the same approach (zval **garbage param) would make sense.

Zend/tests/gh10168_8.phpt fails with opcache.jit=1205 but I don't know how to fix that one.

I'll now look over all cases again to make sure we've caught every case. I'll create a new PR afterwards.

@dstogov
Copy link
Member

dstogov commented Feb 20, 2023

Probably we should use && at the second condition. Thanks for checking this.

@iluuu1994
Copy link
Member

I think the inner ZEND_VM_SPEC should be removed completely because otherwise the unspecialized handler won't copy to result when it needs to.

if (1 || UNEXPECTED(RETURN_VALUE_USED(opline))) {
	zend_refcounted *garbage = NULL;

	value = zend_assign_to_variable_ex(variable_ptr, value, opline->op2_type, EX_USES_STRICT_TYPES(), &garbage);
	if (0 && UNEXPECTED(RETURN_VALUE_USED(opline))) {
		ZVAL_COPY(EX_VAR(opline->result.var), value);
	}

Keeping the || works too, it just doesn't do anything.

@dstogov
Copy link
Member

dstogov commented Feb 20, 2023

&& is wrong of course. || seems were right. Note that for specialized VM we generate two variants

SPEC+!RETURN_VALUE_USED

if (0 || UNEXPECTED(0)) {
   // the reset doesn't matter because we always take the "else" part

SPEC+RETRUN_VALUE_USED

if (0 || UNEXPECTED(1)) {
	zend_refcounted *garbage = NULL;

	value = zend_assign_to_variable_ex(variable_ptr, value, opline->op2_type, EX_USES_STRICT_TYPES(), &garbage);
	if (1 || UNEXPECTED(1)) { // we won't come here if return value is not used according to the first condition 
		ZVAL_COPY(EX_VAR(opline->result.var), value);
	}

!SPEC

if (1 || UNEXPECTED(RETURN_VALUE_USED(opline))) { // we never take the "else" part 
	zend_refcounted *garbage = NULL;

	value = zend_assign_to_variable_ex(variable_ptr, value, opline->op2_type, EX_USES_STRICT_TYPES(), &garbage);
	if (0 || UNEXPECTED(RETURN_VALUE_USED(opline))) { // we always check RETURN_VALUE_USED() here
		ZVAL_COPY(EX_VAR(opline->result.var), value);
	}

@dstogov
Copy link
Member

dstogov commented Feb 20, 2023

No problem. You may be still right. It's better to re-check me. :)

@iluuu1994
Copy link
Member

Sorry, I misunderstood your response and so deleted my last comment. I still think the inner || is unnecessary. We have three cases:

  • SPEC+!RETURN_VALUE_USED
    • if (0 || UNEXPECTED(0)) {
    • second if doesn't matter as it's always eliminated
  • SPEC+RETRUN_VALUE_USED
    • if (0 || UNEXPECTED(1)) {
    • if (1 || UNEXPECTED(1)) {
    • 1 || is meaningless, as RETURN_VALUE_USED(opline)) is replaced with 1
  • !SPEC
    • if (1 || UNEXPECTED(RETURN_VALUE_USED(opline))) {
    • if (0 || UNEXPECTED(RETURN_VALUE_USED(opline))) {
    • 0 || doesn't do anything

But it's not harmful so I'm happy just keeping it there 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants