Skip to content

Lexing memory corruption #10634

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
iluuu1994 opened this issue Feb 20, 2023 · 3 comments · Fixed by #10866
Closed

Lexing memory corruption #10634

iluuu1994 opened this issue Feb 20, 2023 · 3 comments · Fixed by #10866

Comments

@iluuu1994
Copy link
Member

iluuu1994 commented Feb 20, 2023

Description

Most likely caused (or revealed) by f291d37.

The following code:

oss-fuzz Issue 55793

<?yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy&#
heap-buffer-overflow

But I expected this output instead:

PHP Version

master

Operating System

No response

@nielsdos
Copy link
Member

I've had a clue about the <?yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy&# case (and its variants with // or /* instead of #) for a couple of days now, and I just verified my hypotheses. We're not relying on re2c's bounds checking mechanism because re2c:yyfill:check = 0; is set. We just return 0 if we read over the end of the input in YYFILL. But that means if we go over the end in the comment regexes, we don't know that and it's just like the 0 bytes are part of the token.
To demonstrate that this is likely the root cause I created an ugly PoC patch: since a 0 byte already is considered as an end-of-file, we can just block those in the regex.

PoC patch:

diff --git a/Zend/zend_language_scanner.l b/Zend/zend_language_scanner.l
index 7abd91b23a..7f6e4f7b10 100644
--- a/Zend/zend_language_scanner.l
+++ b/Zend/zend_language_scanner.l
@@ -1369,9 +1369,9 @@ TOKENS [;:,.|^&+-/*=%!~$<>?@]
 ANY_CHAR [^]
 NEWLINE ("\r"|"\n"|"\r\n")
 OPTIONAL_WHITESPACE [ \n\r\t]*
-MULTI_LINE_COMMENT "/*"([^*]*"*"+)([^*/][^*]*"*"+)*"/"
-SINGLE_LINE_COMMENT "//".*[\n\r]
-HASH_COMMENT "#"(([^[].*[\n\r])|[\n\r])
+MULTI_LINE_COMMENT "/*"([^*\x00]*"*"+)([^*/][^*]*"*"+)*"/"
+SINGLE_LINE_COMMENT "//"[^\x00]*[\n\r]
+HASH_COMMENT "#"(([^[\x00][^\x00]*[\n\r])|[\n\r])
 WHITESPACE_OR_COMMENTS ({WHITESPACE}|{MULTI_LINE_COMMENT}|{SINGLE_LINE_COMMENT}|{HASH_COMMENT})+
 OPTIONAL_WHITESPACE_OR_COMMENTS ({WHITESPACE}|{MULTI_LINE_COMMENT}|{SINGLE_LINE_COMMENT}|{HASH_COMMENT})*
 

For the other code fragment you posted, it seems to have a different root cause than the lexer problem described above.

@iluuu1994
Copy link
Member Author

iluuu1994 commented Feb 26, 2023

@nielsdos Thank you for having a look at this! I have no clue how re2c works so this would have probably taken me a while to figure out.

You're indeed right that the second issue is a different one. I had a look at that one. Here's what happens (I greatly simplified the test case). This is in fact not a new bug, it also happens in PHP-8.1.

class B { const HW = A::HW . " extended by B"; }

spl_autoload_register(function ($class) {
    class A { const HW = "this is A"; }
    var_dump(B::HW);
});

var_dump(new B());
ASAN output

string(23) "this is A extended by B"
=================================================================
==587089==ERROR: AddressSanitizer: heap-use-after-free on address 0x60c000008998 at pc 0x000001a961e2 bp 0x7ffce2c99f70 sp 0x7ffce2c99f68
READ of size 8 at 0x60c000008998 thread T0
    #0 0x1a961e1 in zend_ast_evaluate_inner /home/ilutov/Developer/php-src/Zend/zend_ast.c:546
    #1 0x1a95d63 in zend_ast_evaluate_ex /home/ilutov/Developer/php-src/Zend/zend_ast.c:522
    #2 0x15c8f62 in zval_update_constant_with_ctx /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:696
    #3 0x15c92ba in zval_update_constant_ex /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:710
    #4 0x17a29c6 in ZEND_FETCH_CLASS_CONSTANT_SPEC_CONST_CONST_HANDLER /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:7243
    #5 0x19de180 in execute_ex /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:57172
    #6 0x15cd115 in zend_call_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:938
    #7 0x15cea26 in zend_call_known_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1032
    #8 0xf557b2 in spl_perform_autoload /home/ilutov/Developer/php-src/ext/spl/php_spl.c:448
    #9 0x15d0943 in zend_lookup_class_ex /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1198
    #10 0x15d3041 in zend_fetch_class_by_name /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1664
    #11 0x17c99dd in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:10454
    #12 0x19dfe1e in execute_ex /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:57468
    #13 0x19f4e41 in zend_execute /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:60935
    #14 0x163aa29 in zend_execute_scripts /home/ilutov/Developer/php-src/Zend/zend.c:1798
    #15 0x13cfd7e in php_execute_script /home/ilutov/Developer/php-src/main/main.c:2482
    #16 0x1d856ab in do_cli /home/ilutov/Developer/php-src/sapi/cli/php_cli.c:964
    #17 0x1d87d13 in main %s:%d
    #18 0x7f73b0a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
    #19 0x7f73b0a4a5c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c8)
    #20 0x603e24 in _start (/home/ilutov/Developer/php-src/sapi/cli/php+0x603e24)

0x60c000008998 is located 24 bytes inside of 128-byte region [0x60c000008980,0x60c000008a00)
freed by thread T0 here:
    #0 0x7f73b16b9388 in __interceptor_free.part.0 (/lib64/libasan.so.8+0xb9388)
    #1 0x1527f17 in tracked_free /home/ilutov/Developer/php-src/Zend/zend_alloc.c:2845
    #2 0x1526179 in _efree_custom /home/ilutov/Developer/php-src/Zend/zend_alloc.c:2480
    #3 0x1526486 in _efree /home/ilutov/Developer/php-src/Zend/zend_alloc.c:2600
    #4 0x1a9e9a9 in zend_ast_ref_destroy /home/ilutov/Developer/php-src/Zend/zend_ast.c:1176
    #5 0x162993b in rc_dtor_func /home/ilutov/Developer/php-src/Zend/zend_variables.c:57
    #6 0x15bc122 in zval_ptr_dtor_nogc /home/ilutov/Developer/php-src/Zend/zend_variables.h:35
    #7 0x15c8f8e in zval_update_constant_with_ctx /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:699
    #8 0x15c92ba in zval_update_constant_ex /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:710
    #9 0x17a29c6 in ZEND_FETCH_CLASS_CONSTANT_SPEC_CONST_CONST_HANDLER /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:7243
    #10 0x19de180 in execute_ex /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:57172
    #11 0x15cd115 in zend_call_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:938
    #12 0x15cea26 in zend_call_known_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1032
    #13 0xf557b2 in spl_perform_autoload /home/ilutov/Developer/php-src/ext/spl/php_spl.c:448
    #14 0x15d0943 in zend_lookup_class_ex /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1198
    #15 0x15d2def in zend_fetch_class /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1620
    #16 0x15b83b1 in zend_get_class_constant_ex /home/ilutov/Developer/php-src/Zend/zend_constants.c:336
    #17 0x1a9a02a in zend_ast_evaluate_inner /home/ilutov/Developer/php-src/Zend/zend_ast.c:855
    #18 0x1a95d63 in zend_ast_evaluate_ex /home/ilutov/Developer/php-src/Zend/zend_ast.c:522
    #19 0x1a9614f in zend_ast_evaluate_inner /home/ilutov/Developer/php-src/Zend/zend_ast.c:544
    #20 0x1a95d63 in zend_ast_evaluate_ex /home/ilutov/Developer/php-src/Zend/zend_ast.c:522
    #21 0x15c8f62 in zval_update_constant_with_ctx /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:696
    #22 0x15c92ba in zval_update_constant_ex /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:710
    #23 0x17a29c6 in ZEND_FETCH_CLASS_CONSTANT_SPEC_CONST_CONST_HANDLER /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:7243
    #24 0x19de180 in execute_ex /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:57172
    #25 0x15cd115 in zend_call_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:938
    #26 0x15cea26 in zend_call_known_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1032
    #27 0xf557b2 in spl_perform_autoload /home/ilutov/Developer/php-src/ext/spl/php_spl.c:448
    #28 0x15d0943 in zend_lookup_class_ex /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1198
    #29 0x15d3041 in zend_fetch_class_by_name /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1664

previously allocated by thread T0 here:
    #0 0x7f73b16ba6af in __interceptor_malloc (/lib64/libasan.so.8+0xba6af)
    #1 0x1527c50 in tracked_malloc /home/ilutov/Developer/php-src/Zend/zend_alloc.c:2826
    #2 0x1525fef in _malloc_custom /home/ilutov/Developer/php-src/Zend/zend_alloc.c:2471
    #3 0x15263bb in _emalloc /home/ilutov/Developer/php-src/Zend/zend_alloc.c:2590
    #4 0x1a9deb3 in zend_ast_copy /home/ilutov/Developer/php-src/Zend/zend_ast.c:1118
    #5 0x15ad394 in zend_const_expr_to_zval /home/ilutov/Developer/php-src/Zend/zend_compile.c:10147
    #6 0x158baf6 in zend_compile_class_const_decl /home/ilutov/Developer/php-src/Zend/zend_compile.c:7718
    #7 0x158bd8d in zend_compile_class_const_group /home/ilutov/Developer/php-src/Zend/zend_compile.c:7733
    #8 0x15ae289 in zend_compile_stmt /home/ilutov/Developer/php-src/Zend/zend_compile.c:10264
    #9 0x157864f in zend_compile_stmt_list /home/ilutov/Developer/php-src/Zend/zend_compile.c:6283
    #10 0x15ae0ef in zend_compile_stmt /home/ilutov/Developer/php-src/Zend/zend_compile.c:10202
    #11 0x158fc15 in zend_compile_class_decl /home/ilutov/Developer/php-src/Zend/zend_compile.c:7999
    #12 0x15adcb1 in zend_compile_top_stmt /home/ilutov/Developer/php-src/Zend/zend_compile.c:10177
    #13 0x15ad996 in zend_compile_top_stmt /home/ilutov/Developer/php-src/Zend/zend_compile.c:10166
    #14 0x14b0859 in zend_compile Zend/zend_language_scanner.l:619
    #15 0x14b277c in compile_string Zend/zend_language_scanner.l:815
    #16 0x173b0fc in zend_include_or_eval /home/ilutov/Developer/php-src/Zend/zend_execute.c:4878
    #17 0x1801a35 in ZEND_INCLUDE_OR_EVAL_SPEC_TMPVAR_HANDLER /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:14730
    #18 0x19e3820 in execute_ex /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:58068
    #19 0x15cd115 in zend_call_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:938
    #20 0x15cea26 in zend_call_known_function /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1032
    #21 0xf557b2 in spl_perform_autoload /home/ilutov/Developer/php-src/ext/spl/php_spl.c:448
    #22 0x15d0943 in zend_lookup_class_ex /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1198
    #23 0x15d3041 in zend_fetch_class_by_name /home/ilutov/Developer/php-src/Zend/zend_execute_API.c:1664
    #24 0x17c99dd in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:10454
    #25 0x19dfe1e in execute_ex /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:57468
    #26 0x19f4e41 in zend_execute /home/ilutov/Developer/php-src/Zend/zend_vm_execute.h:60935
    #27 0x163aa29 in zend_execute_scripts /home/ilutov/Developer/php-src/Zend/zend.c:1798
    #28 0x13cfd7e in php_execute_script /home/ilutov/Developer/php-src/main/main.c:2482
    #29 0x1d856ab in do_cli /home/ilutov/Developer/php-src/sapi/cli/php_cli.c:964

SUMMARY: AddressSanitizer: heap-use-after-free /home/ilutov/Developer/php-src/Zend/zend_ast.c:546 in zend_ast_evaluate_inner
Shadow bytes around the buggy address:
  0x0c187fff90e0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c187fff90f0: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  0x0c187fff9100: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c187fff9110: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c187fff9120: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
=>0x0c187fff9130: fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c187fff9140: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c187fff9150: 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa
  0x0c187fff9160: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c187fff9170: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c187fff9180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==587089==ABORTING

new B() triggers the evaluation of its constants, starting with the binary op AST. That in turn triggers the autoloader for A, which then dumps the B::HW constant. At that point, the outer evaluation of the constants hasn't finished, so it will try loading Bs constants again (because the constant B::HW still contains a constant AST). A is now declared, so it can evaluate the constant and store it in B::HW. When it is done, it frees the constant ASTs and returns back to the first constant evaluation. The outer evaluation continues, trying to access the now freed constant AST.

This can be fixed by reusing the IS_CONSTANT_VISITED mechanism. That code should probably be moved to zval_update_constant_with_ctx (or the equivalent for PHP-8.1). The code will then error and complain about recursive class constant declarations which seems reasonable. I'll create a PR for that tomorrow.

@iluuu1994
Copy link
Member Author

I moved the AST eval bug to a new issue #10709.

nielsdos added a commit to nielsdos/php-src that referenced this issue Mar 16, 2023
We're not relying on re2c's bounds checking mechanism because
re2c:yyfill:check = 0; is set. We just return 0 if we read over the end
of the input in YYFILL. Note that we used to use the "any character"
wildcard in the comment regexes.
But that means if we go over the end in the comment regexes,
we don't know that and it's just like the 0 bytes are part of the token.
Since a 0 byte already is considered as an end-of-file, we can just block
those in the regex.

For the regexes with newlines, I had to not only include \x00 in the
denylist, but also \n and \r because otherwise it would greedily match
those and let the single-line comment run over multiple lines.
nielsdos added a commit to nielsdos/php-src that referenced this issue Mar 16, 2023
We're not relying on re2c's bounds checking mechanism because
re2c:yyfill:check = 0; is set. We just return 0 if we read over the end
of the input in YYFILL. Note that we used to use the "any character"
wildcard in the comment regexes.
But that means if we go over the end in the comment regexes,
we don't know that and it's just like the 0 bytes are part of the token.
Since a 0 byte already is considered as an end-of-file, we can just block
those in the regex.

For the regexes with newlines, I had to not only include \x00 in the
denylist, but also \n and \r because otherwise it would greedily match
those and let the single-line comment run over multiple lines.
nielsdos added a commit to nielsdos/php-src that referenced this issue Mar 17, 2023
We're not relying on re2c's bounds checking mechanism because
re2c:yyfill:check = 0; is set. We just return 0 if we read over the end
of the input in YYFILL. Note that we used to use the "any character"
wildcard in the comment regexes.
But that means if we go over the end in the comment regexes,
we don't know that and it's just like the 0 bytes are part of the token.
Since a 0 byte already is considered as an end-of-file, we can just block
those in the regex.

For the regexes with newlines, I had to not only include \x00 in the
denylist, but also \n and \r because otherwise it would greedily match
those and let the single-line comment run over multiple lines.
nielsdos added a commit that referenced this issue Mar 17, 2023
We're not relying on re2c's bounds checking mechanism because
re2c:yyfill:check = 0; is set. We just return 0 if we read over the end
of the input in YYFILL. Note that we used to use the "any character"
wildcard in the comment regexes.
But that means if we go over the end in the comment regexes,
we don't know that and it's just like the 0 bytes are part of the token.
Since a 0 byte already is considered as an end-of-file, we can just block
those in the regex.

For the regexes with newlines, I had to not only include \x00 in the
denylist, but also \n and \r because otherwise it would greedily match
those and let the single-line comment run over multiple lines.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants