Incorrect identification of \K (not actually) inside lookahead #18123

Closed
thoughtstream opened this issue Sep 12, 2020 · 1 comment · Fixed by #18133
@thoughtstream
Contributor

Module:

Description

During regex compilation, Perl v5.32 appears to misidentify a \K that occurs
after a nested set of lookaheads as being inside a lookahead, and therefore
incorrectly terminates compilation with:

\K not permitted in lookahead/lookbehind in regex

This is, for example, causing problems for the Regexp::Debugger module,
which injects small lookaheads into regexes to assist with its reporting of
their matching process.

The reported misbehaviour does not occur in Perl v5.30.0 and earlier
(presumably because they didn't prohibit \K inside lookaheads).

Steps to Reproduce

use v5.32.0;
qr/(?=(?=x)x)\K/;

Expected behavior

This regex should compile correctly. The \K is not inside a lookahead.
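
To check how a particular perl treats these patterns without aborting the whole script, the compilation can be deferred to run time and trapped. The following is only a sketch: the `compiles` helper is a hypothetical name introduced here, not something from the report or from any module, and the second pattern is added purely as a contrast case in which the \K really is inside a lookahead.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original report): returns 1 if the
# pattern string compiles on the running perl, 0 otherwise.
sub compiles {
    my ($pattern) = @_;
    # Interpolating a variable defers regex compilation to run time,
    # so a compile error becomes an exception that eval can trap.
    my $re = eval { qr/$pattern/ };
    return defined $re ? 1 : 0;
}

# First pattern: the one from the report (\K follows the nested lookaheads).
# Second pattern: contrast case with \K genuinely inside a lookahead.
for my $pattern ('(?=(?=x)x)\K', '(?=x\K)') {
    printf "perl %vd: %-14s %s\n", $^V, $pattern,
        compiles($pattern) ? 'compiles' : 'rejected';
}
```

Running this under several perlbrew-installed versions should show where the two patterns start being treated the same way.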

Perl configuration

Summary of my perl5 (revision 5 version 32 subversion 0) configuration:
   
  Platform:
    osname=darwin
    osvers=18.7.0
    archname=darwin-2level
    uname='darwin daneel.local 18.7.0 darwin kernel version 18.7.0: mon apr 27 20:09:39 pdt 2020; root:xnu-4903.278.35~1release_x86_64 x86_64 '
    config_args='-de -Dprefix=/Users/damian/perl5/perlbrew/perls/perl-5.32.0 -Aeval:scriptdir=/Users/damian/perl5/perlbrew/perls/perl-5.32.0/bin'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.14 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -DPERL_USE_SAFE_PUTENV'
    optimize='-O3'
    cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.14 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.8)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -mmacosx-version-min=10.14 -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/11.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib /usr/lib
    libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=10.14 -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under darwin
  Compiled at Jun 27 2020 22:49:58
  %ENV:
    PERL5LIB="./lib:/Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/perl5/darwin-thread-multi-2level/:/Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/perl5/darwin-2level/:/Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/perl5:/Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib"
    PERL6LIB="./lib6:/Users/damian/lib/perl6"
    PERLBREW_CSHRC_VERSION="0.76"
    PERLBREW_HOME="/Users/damian/.perlbrew"
    PERLBREW_MANPATH="/Users/damian/perl5/perlbrew/perls/perl-5.32.0/man"
    PERLBREW_PATH="/Users/damian/perl5/perlbrew/bin:/Users/damian/perl5/perlbrew/perls/perl-5.32.0/bin"
    PERLBREW_PERL="perl-5.32.0"
    PERLBREW_ROOT="/Users/damian/perl5/perlbrew"
    PERLBREW_VERSION="0.76"
    PERL_MB_OPT="--install_base /Users/damian/perl5/perlbrew/perls/perl-5.32.0/"
    PERL_MM_OPT="INSTALL_BASE=/Users/damian/perl5/perlbrew/perls/perl-5.32.0/"
  @INC:
    ./lib
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/perl5/darwin-thread-multi-2level/
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/perl5/darwin-2level/
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/perl5/darwin-2level
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/perl5
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/5.32.0/darwin-2level
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/5.32.0
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/site_perl/5.32.0/darwin-2level
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/site_perl/5.32.0
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/5.32.0/darwin-2level
    /Users/damian/perl5/perlbrew/perls/perl-5.32.0/lib/5.32.0

@jkeenan
Contributor

jkeenan commented Sep 12, 2020

Assuming this is a bug, bisection points to the following commit:

105c827d9a0f19a772c7b179e2997d842a095460 is the first bad commit
commit 105c827d9a0f19a772c7b179e2997d842a095460
Author: Tony Cook <[email protected]>
Date:   Thu Sep 28 14:40:24 2017 +1000
Commit:     Tony Cook <[email protected]>
CommitDate: Mon Aug 19 15:30:46 2019 +1000

    (perl #124256) disallow \K in lookahead and lookbehind
    
    \K can cause infinite loops in matching in these, and we're not sure
    how it really should behave, so forbid it.

@tonycoz Can you take a look?

Thank you very much.
Jim Keenan
