Skip to content

Commit bf8a9a1

Browse files
author
Father Chrysostomos
committed
[perl #128951] Fix ASan error with @{\327
By \327 I mean character number 327 in octal. Without memory tools like ASan, it produces garbled output. The added test fails like this: # Failed test 18 - @ { \327 \n - used to garble output (or fail asan) [perl \#128951] at ./test.pl line 1058 # got "Unrecognized character \\xD7; marked by <-- HERE after \x{a0}\x{f6}@3\x{a8}\x{7f}\000\000@{<-- HERE near column -1 at - line 1." # expected "Unrecognized character \\xD7; marked by <-- HERE after @{<-- HERE near column 3 at - line 1." Dave Mitchell’s explanation from the RT ticket: > The src code contains the bytes: > > @ { \327 \n > > after seeing "@{" the lexer calls scan_ident(), which sees the \327 as an > ident, then calls S_skipspace_flags() to skip the spaces following the > ident. This moves the current cursor position to the \n, and since that's > a line boundary, its updates PL_linestart and PL_bufptr to point to \n > too. > > When it finds that the next char isn't a '}', it does this: > > /* Didn't find the closing } at the point we expected, so restore > state such that the next thing to process is the opening { and */ > s = SvPVX(PL_linestr) + bracket; /* let the parser handle it */ > > i.e. it moves s back to the "{\317" then continues. > > However, PL_linestart doesn't get reset, so later when the parser > encounters the \327 and tries to croak with "Unrecognized character %s ...", > when it prints out the section of src code in error, since s < PL_linestr, > negative string lengths and ASAN errors ensue. This commit fixes it by passing the LEX_NO_INCLINE flag (added by 2179133), which specifies that we are not trying to read past the newline but simply peek ahead. In that case lex_read_space does not reset PL_linestart. But that does cause problems with code like: ${; #line 3 } because we end up jumping ahead via skipspace without updating the line number. So we need to do a skipspace_flags(..., LEX_NO_INCLINE) first (i.e., peek ahead), and then when we know we don’t need to go back again we can skipspace(...) for real.
1 parent 7d897bd commit bf8a9a1

File tree

2 files changed

+21
-6
lines changed

2 files changed

+21
-6
lines changed

t/op/lex.t

+9-2
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ use warnings;
77

88
BEGIN { chdir 't' if -d 't'; require './test.pl'; }
99

10-
plan(tests => 27);
10+
plan(tests => 28);
1111

1212
{
1313
no warnings 'deprecated';
@@ -129,7 +129,7 @@ fresh_perl_is(
129129
'* <null> ident'
130130
);
131131
SKIP: {
132-
skip "Different output on EBCDIC (presumably)", 2 if $::IS_EBCDIC;
132+
skip "Different output on EBCDIC (presumably)", 3 if $::IS_EBCDIC;
133133
fresh_perl_is(
134134
qq'"ab}"ax;&\0z\x8Ao}\x82x;', <<gibberish,
135135
Bareword found where operator expected at - line 1, near ""ab}"ax"
@@ -150,6 +150,13 @@ gibberish
150150
{ stderr => 1 },
151151
'gibberish containing &{+z} - used to crash [perl #123753]'
152152
);
153+
fresh_perl_is(
154+
"\@{\327\n", <<\gibberisi,
155+
Unrecognized character \xD7; marked by <-- HERE after @{<-- HERE near column 3 at - line 1.
156+
gibberisi
157+
{ stderr => 1 },
158+
'@ { \327 \n - used to garble output (or fail asan) [perl #128951]'
159+
);
153160
}
154161

155162
fresh_perl_is(

toke.c

+12-4
Original file line numberDiff line numberDiff line change
@@ -9074,6 +9074,8 @@ S_scan_ident(pTHX_ char *s, char *dest, STRLEN destlen, I32 ck_uni)
90749074
else if (ck_uni && bracket == -1)
90759075
check_uni();
90769076
if (bracket != -1) {
9077+
bool skip;
9078+
char *s2;
90779079
/* If we were processing {...} notation then... */
90789080
if (isIDFIRST_lazy_if(d,is_utf8)) {
90799081
/* if it starts as a valid identifier, assume that it is one.
@@ -9122,13 +9124,19 @@ S_scan_ident(pTHX_ char *s, char *dest, STRLEN destlen, I32 ck_uni)
91229124

91239125
if ( !tmp_copline )
91249126
tmp_copline = CopLINE(PL_curcop);
9125-
if (s < PL_bufend && isSPACE(*s)) {
9126-
s = skipspace(s);
9127-
}
9127+
if ((skip = s < PL_bufend && isSPACE(*s)))
9128+
/* Avoid incrementing line numbers or resetting PL_linestart,
9129+
in case we have to back up. */
9130+
s2 = skipspace_flags(s, LEX_NO_INCLINE);
9131+
else
9132+
s2 = s;
91289133

91299134
/* Expect to find a closing } after consuming any trailing whitespace.
91309135
*/
9131-
if (*s == '}') {
9136+
if (*s2 == '}') {
9137+
/* Now increment line numbers if applicable. */
9138+
if (skip)
9139+
s = skipspace(s);
91329140
s++;
91339141
if (PL_lex_state == LEX_INTERPNORMAL && !PL_lex_brackets) {
91349142
PL_lex_state = LEX_INTERPEND;

0 commit comments

Comments
 (0)