Skip to content

Modifying a copied by-ref iterated array resets the array position #11244

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
survik1 opened this issue May 15, 2023 · 2 comments · Fixed by #11248
Closed

Modifying a copied by-ref iterated array resets the array position #11244

survik1 opened this issue May 15, 2023 · 2 comments · Fixed by #11248

Comments

@survik1
Copy link

survik1 commented May 15, 2023

Description

Hello, we are finally pushing our code to PHP 8 and I have encountered this BC break which seems more like bug than feature to me. This behavior is still present in 8.2 and the change was not documented (unless i missed it ofc).

https://3v4l.org/aNejW

<?php

$_ENV = [
    'ZASIS_UI_BACKGROUNDCOLOR' => '#FFF1EC',
    'ZASIS_UI_INFOPANEL_ENABLE' => '1',
    'ZASIS_UI_INFOPANEL_TEXT' => 'DEBUG IS ON',
    'ZASIS_UI_INFOPANEL_BACKGROUNDCOLOR' => '#F44',
];

$arr = [
    'ui' => [
        'backgroundColor' => 'ZASIS_UI_BACKGROUNDCOLOR',
        'infoPanel' => [
            'enable' => 'ZASIS_UI_INFOPANEL_ENABLE',
            'text' => 'ZASIS_UI_INFOPANEL_TEXT',
            'fontColor' => 'ZASIS_UI_INFOPANEL_FONTCOLOR',
            'backgroundColor' => 'ZASIS_UI_INFOPANEL_BACKGROUNDCOLOR',
        ]
    ]
];

function loadEnv(&$data): void
{
    if (is_array($data)) {
        foreach ($data as $key => &$value) {
            try {
                loadEnv($value);
            } catch (\InvalidArgumentException $e) {
                unset($data[$key]);
            }
        }
    } elseif (array_key_exists($data, $_ENV)) {
        $data = $_ENV[$data];
    } else {
        throw new \InvalidArgumentException();
    }
}
loadEnv($arr);
print_r($arr);

Output in PHP >=8.0:

Array
(
    [ui] => Array
        (
            [backgroundColor] => #FFF1EC
            [infoPanel] => Array
                (
                    [backgroundColor] => #F44
                )

        )

)

Output in PHP >=7.1 <8.0:

Array
(
    [ui] => Array
        (
            [backgroundColor] => #FFF1EC
            [infoPanel] => Array
                (
                    [enable] => 1
                    [text] => DEBUG IS ON
                    [backgroundColor] => #F44
                )

        )

)

PHP Version

PHP 8.2.6

Operating System

No response

@KapitanOczywisty
Copy link

Minimal reproducible example: https://3v4l.org/oU2Wj

$data = range(0,3);

function foo($data): void
{
    foreach ($data as $key => &$value) {
        echo "{$value}\n";
        try {
            if($value === 2) {
                throw new \Exception;
            }
        } catch (\Exception $e) {
            echo "remove {$value}\n";
            unset($data[$key]);
        }
    }
}

foo($data);

Output 8.0+

0
1
2
remove 2
0
1
3

Output before 8.0

0
1
2
remove 2
3

@survik1 It's best to avoid using references whenever possible, they were always big source of problems. Just removing reference from the foreach fixes this particular issue: https://3v4l.org/Otugh

@bwoebi
Copy link
Member

bwoebi commented May 15, 2023

A simpler example:

$data = [0, 1, 2];

foreach ($data as $key => &$value) {
    echo "{$value}\n";
    if ($value === 1) {
        $cow_copy = $data;
        echo "unset $value\n";
        unset($data[$key]);
    }
}

giving

0
1
unset 1
0 // restarting iteration here at the position of the internal array pointer
2

The problem is modifying a by-ref iterated array whose RC > 1.

This causes duplication of the array, assigning a new array to $data. The foreach iterator then thinks it's a new array and starts over...

This is not a new issue, but an issue existing since PHP 7: https://3v4l.org/VoceP. It just happens that since PHP 8 exceptions do a CoW copy (reference counter increase) of the underlying value instead of the reference, exhibiting the side effect of the referenced array needing to be duplicated upon modification, causing this issue. Which should be an implementation detail without visible side effects, except in this case where it apparently surfaces this issue.

@bwoebi bwoebi self-assigned this May 16, 2023
@bwoebi bwoebi changed the title Since PHP 8, throwing Exception inside recursive foreach loop causes reset of internal pointer Modifying a copied by-ref iterated array resets the array position May 16, 2023
bwoebi added a commit to bwoebi/php-src that referenced this issue May 16, 2023
When executing a foreach ($ht as &$ref), foreach calls zend_hash_iterator_pos_ex() on every iteration. If the HashTable contained in the $ht variable is not the tracked HashTable, it will reset the position to the internal array pointer of the array currently in $ht.
This behaviour is generally fine, but undesirable for copy-on-write copies of the iterated HashTable. This may trivially occur when the iterated over HashTable is assigned to some variable, then the iterated over variable modified, leading to array separation, changing the HashTable pointer in the variable. Thus foreach happily restarting iteration.
This behaviour (despite existing since PHP 7.0) is considered a bug, if not only for the behaviour being unexpected to the user, also copy-on-write should not have trivially observable side-effects by mere assignment.

The bugfix consists of duplicating HashTableIterators whenever zend_array_dup() is called (the primitive used on array separation).
When a further access to the HashPosition through the HashTableIterators API happens and the HashTable does not match the tracked one, all the duplicates (which are tracked by single linked list) are searched for the wanted HashTable. If found, the HashTableIterator is replaced by the found copy and all other copies are removed.
This ensures that we always end up tracking the correct HashTable.

Fixes phpGH-11244.

Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
bwoebi added a commit to bwoebi/php-src that referenced this issue May 16, 2023
When executing a foreach ($ht as &$ref), foreach calls zend_hash_iterator_pos_ex() on every iteration. If the HashTable contained in the $ht variable is not the tracked HashTable, it will reset the position to the internal array pointer of the array currently in $ht.
This behaviour is generally fine, but undesirable for copy-on-write copies of the iterated HashTable. This may trivially occur when the iterated over HashTable is assigned to some variable, then the iterated over variable modified, leading to array separation, changing the HashTable pointer in the variable. Thus foreach happily restarting iteration.
This behaviour (despite existing since PHP 7.0) is considered a bug, if not only for the behaviour being unexpected to the user, also copy-on-write should not have trivially observable side-effects by mere assignment.

The bugfix consists of duplicating HashTableIterators whenever zend_array_dup() is called (the primitive used on array separation).
When a further access to the HashPosition through the HashTableIterators API happens and the HashTable does not match the tracked one, all the duplicates (which are tracked by single linked list) are searched for the wanted HashTable. If found, the HashTableIterator is replaced by the found copy and all other copies are removed.
This ensures that we always end up tracking the correct HashTable.

Fixes phpGH-11244.

Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
bwoebi added a commit that referenced this issue Aug 27, 2023
When executing a foreach ($ht as &$ref), foreach calls zend_hash_iterator_pos_ex() on every iteration. If the HashTable contained in the $ht variable is not the tracked HashTable, it will reset the position to the internal array pointer of the array currently in $ht.
This behaviour is generally fine, but undesirable for copy-on-write copies of the iterated HashTable. This may trivially occur when the iterated over HashTable is assigned to some variable, then the iterated over variable modified, leading to array separation, changing the HashTable pointer in the variable. Thus foreach happily restarting iteration.
This behaviour (despite existing since PHP 7.0) is considered a bug, if not only for the behaviour being unexpected to the user, also copy-on-write should not have trivially observable side-effects by mere assignment.

The bugfix consists of duplicating HashTableIterators whenever zend_array_dup() is called (the primitive used on array separation).
When a further access to the HashPosition through the HashTableIterators API happens and the HashTable does not match the tracked one, all the duplicates (which are tracked by single linked list) are searched for the wanted HashTable. If found, the HashTableIterator is replaced by the found copy and all other copies are removed.
This ensures that we always end up tracking the correct HashTable.

Fixes GH-11244.

Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants