Skip to content

RFC: Make unserialize() emit a warning for trailing bytes #9630

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 1, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,8 @@ PHP NEWS

- Standard:
. E_NOTICEs emitted by unserialize() have been promoted to E_WARNING. (timwolla)
. unserialize() now emits a new E_WARNING if the input contains unconsumed
bytes. (timwolla)
. Make array_pad's $length warning less confusing. (nielsdos)
. E_WARNING emitted by strtok in the caase both arguments are not provided when
starting tokenisation. (David Carlier)
Expand Down
5 changes: 4 additions & 1 deletion UPGRADING
Original file line number Diff line number Diff line change
Expand Up @@ -87,8 +87,11 @@ PHP 8.3 UPGRADE NOTES
(Alex Dowad)

- Standard:
. E_NOTICEs emitted by unserialized() have been promoted to E_WARNING.
. E_NOTICEs emitted by unserialize() have been promoted to E_WARNING.
RFC: https://wiki.php.net/rfc/improve_unserialize_error_handling
. unserialize() now emits a new E_WARNING if the input contains unconsumed
bytes.
RFC: https://wiki.php.net/rfc/unserialize_warn_on_trailing_data
. array_pad() is now only limited by the maximum number of elements an array
can have. Before, it was only possible to add at most 1048576 elements at a
time.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ class Test {
public ?object $prop;
}
$s = <<<'STR'
O:4:"Test":2:{s:4:"prop";R:1;s:4:"prop";N;}}
O:4:"Test":2:{s:4:"prop";R:1;s:4:"prop";N;}
STR;
var_dump(unserialize($s));

Expand Down
26 changes: 26 additions & 0 deletions ext/standard/tests/serialize/unserialize_extra_data_001.phpt
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
--TEST--
Test unserialize() with extra data at the end of a valid value
--FILE--
<?php

var_dump(unserialize('i:5;i:6;'));
var_dump(unserialize('N;i:6;'));
var_dump(unserialize('b:1;i:6;'));
var_dump(unserialize('a:1:{s:3:"foo";b:1;}i:6;'));

?>
--EXPECTF--
Warning: unserialize(): Extra data starting at offset 4 of 8 bytes in %s on line %d
int(5)

Warning: unserialize(): Extra data starting at offset 2 of 6 bytes in %s on line %d
NULL

Warning: unserialize(): Extra data starting at offset 4 of 8 bytes in %s on line %d
bool(true)

Warning: unserialize(): Extra data starting at offset 20 of 24 bytes in %s on line %d
array(1) {
["foo"]=>
bool(true)
}
42 changes: 42 additions & 0 deletions ext/standard/tests/serialize/unserialize_extra_data_002.phpt
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
--TEST--
Test unserialize() with extra data at the end of a valid value with nested unserialize
--FILE--
<?php

final class Foo {
public $foo;

public function __unserialize(array $foo)
{
$this->foo = unserialize($foo['bar']);
}

public function __serialize(): array
{
return [
'bar' => serialize($this->foo) . 'garbage',
];
}
}

$f = new Foo;
$f->foo = ['a', 'b', 'c'];

var_dump(unserialize(serialize($f) . 'garbage'));

?>
--EXPECTF--
Warning: unserialize(): Extra data starting at offset 81 of 88 bytes in %s on line %d

Warning: unserialize(): Extra data starting at offset 42 of 49 bytes in %s on line %d
object(Foo)#2 (1) {
["foo"]=>
array(3) {
[0]=>
string(1) "a"
[1]=>
string(1) "b"
[2]=>
string(1) "c"
}
}
42 changes: 42 additions & 0 deletions ext/standard/tests/serialize/unserialize_extra_data_003.phpt
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
--TEST--
Test unserialize() with extra data at the end of a valid value with Serializable
--FILE--
<?php

final class Foo implements Serializable {
public $foo;

public function unserialize(string $foo)
{
$this->foo = unserialize($foo);
}

public function serialize(): string
{
return serialize($this->foo) . 'garbage';
}
}

$f = new Foo;
$f->foo = ['a', 'b', 'c'];

var_dump(unserialize(serialize($f) . 'garbage'));

?>
--EXPECTF--
Deprecated: Foo implements the Serializable interface, which is deprecated. Implement __serialize() and __unserialize() instead (or in addition, if support for old PHP versions is necessary) in %s on line %d

Warning: unserialize(): Extra data starting at offset 42 of 49 bytes in %s on line %d

Warning: unserialize(): Extra data starting at offset 64 of 71 bytes in %s on line %d
object(Foo)#2 (1) {
["foo"]=>
array(3) {
[0]=>
string(1) "a"
[1]=>
string(1) "b"
[2]=>
string(1) "c"
}
}
18 changes: 13 additions & 5 deletions ext/standard/var.c
Original file line number Diff line number Diff line change
Expand Up @@ -1406,11 +1406,19 @@ PHPAPI void php_unserialize_with_options(zval *return_value, const char *buf, co
zval_ptr_dtor(return_value);
}
RETVAL_FALSE;
} else if (BG(unserialize).level > 1) {
ZVAL_COPY(return_value, retval);
} else if (Z_REFCOUNTED_P(return_value)) {
zend_refcounted *ref = Z_COUNTED_P(return_value);
gc_check_possible_root(ref);
} else {
if ((char*)p < buf + buf_len) {
if (!EG(exception)) {
php_error_docref(NULL, E_WARNING, "Extra data starting at offset " ZEND_LONG_FMT " of %zd bytes",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making a note of this since I forgot how this works, and not many people are familiar with how the unserializer works:

If this were to start throwing an exception instead in the next php major version, this would call zval_ptr_dtor to free the allocated values instead of falling through, then RETVAL_FALSE, like the above branch does right now.

The implementation of zval_ptr_dtor calls gc_check_possible_root(), so if that's done, we'll properly collect reference cycles when the gc runs if this throws.

(zend_long)((char*)p - buf), buf_len);
}
}
if (BG(unserialize).level > 1) {
ZVAL_COPY(return_value, retval);
} else if (Z_REFCOUNTED_P(return_value)) {
zend_refcounted *ref = Z_COUNTED_P(return_value);
gc_check_possible_root(ref);
}
}

cleanup:
Expand Down