RT #134172: commit 027471cf breaks t/op/sprintf2.t on FreeBSD-11
-
Basics
- Id
- 134172
- Status
- pending release
-
Dates
- Created:
- 2019-06-04 00:14:02
- Author:
- Hugo van der Sanden
- Last Updated:
- 2019-07-31 14:48:34
- Closed:
-
Custom Fields
- Fixed In:
- Operating System:
- HPUX netbsd openbsd freebsd
- PatchStatus:
- Perl Version:
- 5.31.1
- Severity:
- low
- Type:
- core
-
Links
- DependedOnBy:
- DependsOn:
- MemberOf:
- Members:
- ReferredToBy:
- RefersTo:
-
People
- Owner:
- Nobody in particular <>
- Requestors:
- James E Keenan <jkee...@pob...>
- Cc:
- AdminCC:
# James E Keenan <jkee...@pob...>
Mon, 03 Jun 2019 17:14:02 -0700
The following smoke test indicates that a recent change to blead is
causing failures in t/op/sprintf2.t on FreeBSD-11.
#####
http://perl5.test-smoke.org/report/88863
#####
In each of 36 different configurations in the smoke test run, all tests
pass individually, but the file has a whole is graded FAIL.
#####
$ cd t; ./perl harness -v op/sprintf2.t;cd -
ok 1 - the sprintf "%.<number>g" optimization
ok 2 - the sprintf "%.<number>f" optimization
ok 3 - width calculation under utf8 upgrade, length=1
...
ok 1698 - special-case %.0f on -4503599627370503
ok 1699 - croak for very large numeric format results
Dubious, test returned 9 (wstat 2304, 0x900)
All 1699 subtests passed
(less 30 skipped subtests: 1669 okay)
Test Summary Report
-------------------
op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
Non-zero exit status: 9
Files=1, Tests=1699, 0 wallclock secs ( 0.20 usr 0.01 sys + 0.16 cusr
0.11 csys = 0.48 CPU)
Result: FAIL
#####
Bisecting with the following command ...
#####
perl Porting/bisect.pl --start=e3c1dc81bd --end=9f58603ced --target
t/op/sprintf2.t
#####
... gave this result:
#####
...
good - zero exit from ./perl -Ilib t/op/sprintf2.t
Runner returned 0 for end revision at Porting/bisect.pl line 233.
That took 114 seconds.
#####
... which is not useful. I hypothesized that something is tickling
.t/TEST or ./t perl harness with respect to the failing test (or vice
versa).
To test this hypothesis, I reviewed 'git log' and noticed two commits
which might have touched sprintf functionality. I built perl at each of
them, saw that one PASS and one FAIL; confirmed the FAIL.
#####
commit 027471cf1095f75f273df40310e4647fe1e8a9df
Author: Tony Cook <[email protected]>
AuthorDate: Wed Mar 20 16:47:49 2019 +1100
Commit: Tony Cook <[email protected]>
CommitDate: Mon Jun 3 15:48:34 2019 +1000
(perl #133913) limit numeric format results to INT_MAX
The return value of v?snprintf() is int, and we pay attention to that
return value, so limit the expected size of numeric formats to INT_MAX.
#####
Thank you very much.
Jim Keenan
#####
Summary of my perl5 (revision 5 version 31 subversion 1) configuration:
Commit id: 027471cf1095f75f273df40310e4647fe1e8a9df
Platform:
osname=freebsd
osvers=11.2-stable
archname=amd64-freebsd-thread-multi
uname='freebsd perlmonger.nycbug.org 11.2-stable freebsd
11.2-stable #0 r339445: sat oct 20 00:08:11 utc 2018
[email protected]:usrobjusrsrcsysgeneric amd64 '
config_args='-des -Dusedevel -Duseithreads -Doptimize=-O2 -pipe
-fstack-protector -fno-strict-aliasing'
hint=recommended
useposix=true
d_sigaction=define
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='cc'
ccflags ='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H
-fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include
-D_FORTIFY_SOURCE=2'
optimize='-O2 -pipe -fstack-protector -fno-strict-aliasing'
cppflags='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H
-fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='4.2.1 Compatible FreeBSD Clang 6.0.1
(tags/RELEASE_601/final 335540)'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags ='-pthread -Wl,-E -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/lib /usr/local/lib /usr/lib/clang/6.0.1/lib /usr/lib
libs=-lpthread -lgdbm -ldl -lm -lcrypt -lutil
perllibs=-lpthread -ldl -lm -lcrypt -lutil
libc=
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags=' '
cccdlflags='-DPIC -fPIC'
lddlflags='-shared -L/usr/local/lib -fstack-protector-strong'
Characteristics of this binary (from libperl):
Compile-time options:
HAS_TIMES
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_IMPLICIT_CONTEXT
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_USE_DEVEL
USE_64_BIT_ALL
USE_64_BIT_INT
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
USE_REENTRANT_API
Built under freebsd
Compiled at Jun 4 2019 00:08:26
%ENV:
PERL2DIR="/home/jkeenan/gitwork/perl2"
PERL_WORKDIR="/home/jkeenan/gitwork/perl"
@INC:
lib
/usr/local/lib/perl5/site_perl/5.31.1/amd64-freebsd-thread-multi
/usr/local/lib/perl5/site_perl/5.31.1
/usr/local/lib/perl5/5.31.1/amd64-freebsd-thread-multi
/usr/local/lib/perl5/5.31.1
# James E Keenan <jkee...@cpa...>
Mon, 03 Jun 2019 17:25:47 -0700
On Tue, 04 Jun 2019 00:14:02 GMT, [email protected] wrote:
> The following smoke test indicates that a recent change to blead is
> causing failures in t/op/sprintf2.t on FreeBSD-11.
>
> #####
> http://perl5.test-smoke.org/report/88863
> #####
>
> In each of 36 different configurations in the smoke test run, all tests
> pass individually, but the file has a whole is graded FAIL.
>
> #####
> $ cd t; ./perl harness -v op/sprintf2.t;cd -
>
> ok 1 - the sprintf "%.<number>g" optimization
> ok 2 - the sprintf "%.<number>f" optimization
> ok 3 - width calculation under utf8 upgrade, length=1
> ...
> ok 1698 - special-case %.0f on -4503599627370503
> ok 1699 - croak for very large numeric format results
> Dubious, test returned 9 (wstat 2304, 0x900)
> All 1699 subtests passed
> (less 30 skipped subtests: 1669 okay)
>
> Test Summary Report
> -------------------
> op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
> Non-zero exit status: 9
> Files=1, Tests=1699, 0 wallclock secs ( 0.20 usr 0.01 sys + 0.16 cusr
> 0.11 csys = 0.48 CPU)
> Result: FAIL
> #####
>
> Bisecting with the following command ...
>
> #####
> perl Porting/bisect.pl --start=e3c1dc81bd --end=9f58603ced --target
> t/op/sprintf2.t
> #####
>
> ... gave this result:
>
> #####
> ...
> good - zero exit from ./perl -Ilib t/op/sprintf2.t
> Runner returned 0 for end revision at Porting/bisect.pl line 233.
> That took 114 seconds.
> #####
>
> ... which is not useful. I hypothesized that something is tickling
> .t/TEST or ./t perl harness with respect to the failing test (or vice
> versa).
>
> To test this hypothesis, I reviewed 'git log' and noticed two commits
> which might have touched sprintf functionality. I built perl at each of
> them, saw that one PASS and one FAIL; confirmed the FAIL.
>
> #####
> commit 027471cf1095f75f273df40310e4647fe1e8a9df
> Author: Tony Cook <[email protected]>
> AuthorDate: Wed Mar 20 16:47:49 2019 +1100
> Commit: Tony Cook <[email protected]>
> CommitDate: Mon Jun 3 15:48:34 2019 +1000
>
> (perl #133913) limit numeric format results to INT_MAX
>
> The return value of v?snprintf() is int, and we pay attention to that
> return value, so limit the expected size of numeric formats to INT_MAX.
> #####
>
> Thank you very much.
> Jim Keenan
>
The problem lies in this unit test which was added to t/op/sprintf2.t in the commit in question:
#####
# large uvsize needed so the large width is parsed properly
# large sizesize needed so the STRLEN check doesn't
if ($Config{intsize} == 4 && $Config{uvsize} > 4 && $Config{sizesize} > 4) {
eval { my $x = sprintf("%7000000000E", 0) };
like($@, qr/^Numeric format result too large at /,
"croak for very large numeric format results");
}
#####
If I comment out this test, I get a PASS.
So something's amiss with that test in relation to the harness.
Thank you very much.
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Mon, 03 Jun 2019 18:15:35 -0700
On Mon, 03 Jun 2019 17:14:02 -0700, [email protected] wrote:
> The following smoke test indicates that a recent change to blead is
> causing failures in t/op/sprintf2.t on FreeBSD-11.
[...]
> op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
> Non-zero exit status: 9
It's this non-zero exit status that's making it a fail. That's reporting that it terminated with a SEGV. It's a bit strange that a change is causing a segv only after that test has run successfully, but definitely a good reason for calling it failure.
I can't yet reproduce it here, but you should be getting a core dump. If you can generate one (ideally on a debugging perl) and get a stack trace out of gdb it should be possible to diagnose the problem.
Hugo
# The RT System itself <>
Mon, 03 Jun 2019 18:15:35 -0700
# James E Keenan <jkee...@cpa...>
Mon, 03 Jun 2019 19:22:56 -0700
On Tue, 04 Jun 2019 01:15:35 GMT, hv wrote:
> On Mon, 03 Jun 2019 17:14:02 -0700, [email protected] wrote:
> > The following smoke test indicates that a recent change to blead is
> > causing failures in t/op/sprintf2.t on FreeBSD-11.
> [...]
> > op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
> > Non-zero exit status: 9
>
> It's this non-zero exit status that's making it a fail. That's
> reporting that it terminated with a SEGV. It's a bit strange that a
> change is causing a segv only after that test has run successfully,
> but definitely a good reason for calling it failure.
>
> I can't yet reproduce it here, but you should be getting a core dump.
> If you can generate one (ideally on a debugging perl) and get a stack
> trace out of gdb it should be possible to diagnose the problem.
>
> Hugo
I haven't used gdb in years, so I'm fumbling around.
Does this help?
#####
[t] $ gdb ./perl perl.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `./perl -I.. -MTestInit op/sprintf2.t'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libthr.so.3...Reading symbols from /usr/lib/debug//lib/libthr.so.3.debug...done.
done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /usr/lib/libdl.so.1...Reading symbols from /usr/lib/debug//usr/lib/libdl.so.1.debug...done.
done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /lib/libm.so.5...Reading symbols from /usr/lib/debug//lib/libm.so.5.debug...done.
done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libcrypt.so.5...Reading symbols from /usr/lib/debug//lib/libcrypt.so.5.debug...done.
done.
Loaded symbols for /lib/libcrypt.so.5
Reading symbols from /lib/libutil.so.9...Reading symbols from /usr/lib/debug//lib/libutil.so.9.debug...done.
done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
Loaded symbols for /libexec/ld-elf.so.1
#0 S_my_exit_jump (my_perl=0x801e22000) at perl.c:5261
5261 if (cxstack_ix >= 0) {
(gdb) bt
#0 S_my_exit_jump (my_perl=0x801e22000) at perl.c:5261
#1 0x0000000000458107 in Perl_my_failure_exit (my_perl=0x801e22000) at perl.c:5249
#2 0x00000000005987e2 in Perl_die_unwind (my_perl=0x801e22000, msv=0x801f56838) at pp_ctl.c:1797
#3 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000, pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe2a0)
at util.c:1711
#4 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#5 0x00000000006128bb in PerlIOUnix_refcnt_inc (fd=0) at perlio.c:2309
#6 0x0000000000613298 in PerlIOUnix_open (my_perl=0x801e22000, self=0x81dd30, layers=0x801e1b8d8, n=0,
mode=0x81e4c0 "r", fd=0, imode=0, perm=0, f=0x801eac038, narg=0, args=0x0) at perlio.c:2607
#7 0x000000000061519c in PerlIOBuf_open (my_perl=0x801e22000, self=0x81da90, layers=0x801e1b8d8, n=1,
mode=<value optimized out>, fd=1, imode=0, perm=0, f=0x0, narg=0, args=0x0) at perlio.c:3897
#8 0x0000000000611238 in PerlIO_openn (my_perl=0x801e22000, layers=<value optimized out>,
mode=<value optimized out>, fd=0, imode=<value optimized out>, perm=<value optimized out>, f=0x0, narg=0,
args=0x0) at perlio.c:1550
#9 0x000000000061010e in PerlIO_stdstreams (my_perl=0x801e22000) at perlio.c:4928
#10 0x0000000000616ecd in Perl_PerlIO_stderr (my_perl=0x801e22000) at perlio.c:4884
#11 0x00000000004f4463 in Perl_write_to_stderr (my_perl=0x801e22000, msv=0x801f56838) at util.c:1516
#12 0x00000000005987da in Perl_die_unwind (my_perl=0x801e22000, msv=0x801f56838) at pp_ctl.c:1796
#13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000, pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680)
at util.c:1711
#14 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#15 0x000000000044c57e in Perl_sys_term () at perl.c:146
#16 0x0000000000421472 in main (argc=<value optimized out>, argv=<value optimized out>, env=0x7fffffffe770)
---Type <return> to continue, or q <return> to quit---
at perlmain.c:155
#####
--
James E Keenan ([email protected])
# James E Keenan <jkee...@cpa...>
Mon, 03 Jun 2019 19:35:25 -0700
On Tue, 04 Jun 2019 02:22:56 GMT, jkeenan wrote:
> #3 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
> pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe2a0)
> at util.c:1711
That non-printing character is octal:
357 277 275
> #4 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized
> out>) at util.c:1745
...
> #12 0x00000000005987da in Perl_die_unwind (my_perl=0x801e22000,
> msv=0x801f56838) at pp_ctl.c:1796
> #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
> pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680)
> at util.c:1711
Same here.
--
James E Keenan ([email protected])
# karl williamson <publ...@khw...>
Mon, 03 Jun 2019 20:06:27 -0700
On 6/3/19 8:35 PM, James E Keenan via RT wrote:
> On Tue, 04 Jun 2019 02:22:56 GMT, jkeenan wrote:
>> #3 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
>> pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe2a0)
>> at util.c:1711
>
> That non-printing character is octal:
>
> 357 277 275
which is REPLACEMENT CHARACTER
>
>> #4 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized
>> out>) at util.c:1745
> ...
>> #12 0x00000000005987da in Perl_die_unwind (my_perl=0x801e22000,
>> msv=0x801f56838) at pp_ctl.c:1796
>> #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
>> pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680)
>> at util.c:1711
>
> Same here.
>
# Hugo van der Sanden <...@cry...>
Mon, 03 Jun 2019 21:15:59 -0700
On Mon, 03 Jun 2019 19:22:56 -0700, jkeenan wrote:
> On Tue, 04 Jun 2019 01:15:35 GMT, hv wrote:
> > On Mon, 03 Jun 2019 17:14:02 -0700, [email protected] wrote:
> > > The following smoke test indicates that a recent change to blead is
> > > causing failures in t/op/sprintf2.t on FreeBSD-11.
> > [...]
> > > op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
> > > Non-zero exit status: 9
> >
> > It's this non-zero exit status that's making it a fail. That's
> > reporting that it terminated with a SEGV.
>
> I haven't used gdb in years, so I'm fumbling around.
>
> Does this help?
Well, it is exactly what I asked for.
> #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000, pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680) at util.c:1711
> #14 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized
> out>) at util.c:1745
> #15 0x000000000044c57e in Perl_sys_term () at perl.c:146
> #16 0x0000000000421472 in main (argc=<value optimized out>,
> argv=<value optimized out>, env=0x7fffffffe770)
> at perlmain.c:155
We're in sys_term, so most of the interpreter has already been wiped out, but we're then calling croak(), which relies on the interpreter's infrastructure.
My best guess is that we're hitting 'Perl_croak_nocontext("panic: MUTEX_UNLOCK (%d) [%s:%d]" ...)' in one of the MUTEX_UNLOCK calls, and some wild writes have corrupted both one of our mutexes and perhaps this panic string.
It seems like it'd be hard for it to get this far with that level of corruption though.
I'll try harder to reproduce it: I'd need some interactive gdb work to diagnose it.
Hugo
# Hugo van der Sanden <...@cry...>
Mon, 03 Jun 2019 21:27:17 -0700
On Mon, 03 Jun 2019 21:15:59 -0700, hv wrote:
> On Mon, 03 Jun 2019 19:22:56 -0700, jkeenan wrote:
> > On Tue, 04 Jun 2019 01:15:35 GMT, hv wrote:
> > > On Mon, 03 Jun 2019 17:14:02 -0700, [email protected] wrote:
> > > > The following smoke test indicates that a recent change to blead
> > > > is
> > > > causing failures in t/op/sprintf2.t on FreeBSD-11.
> > > [...]
> > > > op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
> > > > Non-zero exit status: 9
> > >
> > > It's this non-zero exit status that's making it a fail. That's
> > > reporting that it terminated with a SEGV.
> >
> > I haven't used gdb in years, so I'm fumbling around.
> >
> > Does this help?
>
> Well, it is exactly what I asked for.
>
> > #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
> > pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680) at util.c:1711
> > #14 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized
> > out> ) at util.c:1745
> > #15 0x000000000044c57e in Perl_sys_term () at perl.c:146
> > #16 0x0000000000421472 in main (argc=<value optimized out>,
> > argv=<value optimized out>, env=0x7fffffffe770)
> > at perlmain.c:155
>
> We're in sys_term, so most of the interpreter has already been wiped
> out, but we're then calling croak(), which relies on the interpreter's
> infrastructure.
>
> My best guess is that we're hitting 'Perl_croak_nocontext("panic:
> MUTEX_UNLOCK (%d) [%s:%d]" ...)' in one of the MUTEX_UNLOCK calls, and
> some wild writes have corrupted both one of our mutexes and perhaps
> this panic string.
Oops, s/MUTEX_UNLOCK/MUTEX_DESTROY/g
>
> It seems like it'd be hard for it to get this far with that level of
> corruption though.
>
> I'll try harder to reproduce it: I'd need some interactive gdb work to
> diagnose it.
>
> Hugo
# James E Keenan <jkee...@cpa...>
Tue, 04 Jun 2019 17:37:31 -0700
On Tue, 04 Jun 2019 04:15:59 GMT, hv wrote:
> On Mon, 03 Jun 2019 19:22:56 -0700, jkeenan wrote:
> > On Tue, 04 Jun 2019 01:15:35 GMT, hv wrote:
> > > On Mon, 03 Jun 2019 17:14:02 -0700, [email protected] wrote:
> > > > The following smoke test indicates that a recent change to blead
> > > > is
> > > > causing failures in t/op/sprintf2.t on FreeBSD-11.
> > > [...]
> > > > op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
> > > > Non-zero exit status: 9
> > >
> > > It's this non-zero exit status that's making it a fail. That's
> > > reporting that it terminated with a SEGV.
> >
> > I haven't used gdb in years, so I'm fumbling around.
> >
> > Does this help?
>
> Well, it is exactly what I asked for.
>
> > #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
> > pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680) at util.c:1711
> > #14 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized
> > out> ) at util.c:1745
> > #15 0x000000000044c57e in Perl_sys_term () at perl.c:146
> > #16 0x0000000000421472 in main (argc=<value optimized out>,
> > argv=<value optimized out>, env=0x7fffffffe770)
> > at perlmain.c:155
>
> We're in sys_term, so most of the interpreter has already been wiped
> out, but we're then calling croak(), which relies on the interpreter's
> infrastructure.
>
> My best guess is that we're hitting 'Perl_croak_nocontext("panic:
> MUTEX_UNLOCK (%d) [%s:%d]" ...)' in one of the MUTEX_UNLOCK calls, and
> some wild writes have corrupted both one of our mutexes and perhaps
> this panic string.
>
> It seems like it'd be hard for it to get this far with that level of
> corruption though.
>
> I'll try harder to reproduce it: I'd need some interactive gdb work to
> diagnose it.
>
> Hugo
This is now showing up on other OSes. See:
http://perl5.test-smoke.org/submatrix?test=../t/op/sprintf2.t&pversion=5.31.1
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Wed, 05 Jun 2019 03:39:48 -0700
On Mon, 03 Jun 2019 21:15:59 -0700, hv wrote:
> On Mon, 03 Jun 2019 19:22:56 -0700, jkeenan wrote:
> > #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
> > pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680) at util.c:1711
> > #14 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value optimized
> > out> ) at util.c:1745
> > #15 0x000000000044c57e in Perl_sys_term () at perl.c:146
> > #16 0x0000000000421472 in main (argc=<value optimized out>,
> > argv=<value optimized out>, env=0x7fffffffe770)
> > at perlmain.c:155
>
> We're in sys_term, so most of the interpreter has already been wiped
> out, but we're then calling croak(), which relies on the interpreter's
> infrastructure.
[...]
> I'll try harder to reproduce it: I'd need some interactive gdb work to
> diagnose it.
I still cannot reproduce here. Please could you try getting another stacktrace with the attached patch applied, to narrow down which bit of Perl_sys_term() we're in?
To help with subsequent tests, it would also be useful to narrow down the testcase. Please check if any of these reproduce the SEGV:
./perl -e 'my $x = sprintf("%7000000000E", 0)'
./perl -e 'eval { my $x = sprintf("%7000000000E", 0) }; print "ok\n"'
the attached sprintf2-short.t, with all but the new test removed
Cheers,
Hugo
From 03ad1fae97d9b34922e0e00f387cebe466ca13ef Mon Sep 17 00:00:00 2001
From: Hugo van der Sanden <[email protected]>
Date: Wed, 5 Jun 2019 11:29:12 +0100
Subject: [PATCH] DO NOT COMMIT: expand PERL_SYS_TERM_BODY
Expand the unixish body for Perl_sys_term() for more useful stacktrace.
---
perl.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/perl.c b/perl.c
index 69e3efcd70..acb7c4f2a2 100644
--- a/perl.c
+++ b/perl.c
@@ -143,7 +143,14 @@ Perl_sys_term(void)
{
dVAR;
if (!PL_veto_cleanup) {
- PERL_SYS_TERM_BODY();
+ HINTS_REFCNT_TERM; /* PL_hints_mutex */
+ KEYWORD_PLUGIN_MUTEX_TERM; /* PL_keyword_plugin_mutex */
+ OP_CHECK_MUTEX_TERM; /* PL_check_mutex */
+ OP_REFCNT_TERM; /* PL_op_mutex */
+ PERLIO_TERM; /* PerlIO_teardown() */
+ MALLOC_TERM; /* PL_malloc_mutex */
+ LOCALE_TERM; /* PL_locale_mutex */
+ USER_PROP_MUTEX_TERM; /* PL_user_prop_mutex */
}
}
--
2.17.1
#!./perl -w
# Tests for sprintf that do not fit the format of sprintf.t.
BEGIN {
chdir 't' if -d 't';
require './test.pl';
require './charset_tools.pl';
set_up_inc('../lib');
}
use strict;
use Config;
# large uvsize needed so the large width is parsed properly
# large sizesize needed so the STRLEN check doesn't
if ($Config{intsize} == 4 && $Config{uvsize} > 4 && $Config{sizesize} > 4) {
eval { my $x = sprintf("%7000000000E", 0) };
like($@, qr/^Numeric format result too large at /,
"croak for very large numeric format results");
}
done_testing();
# James E Keenan <jkee...@cpa...>
Wed, 05 Jun 2019 05:39:11 -0700
On Wed, 05 Jun 2019 10:39:48 GMT, hv wrote:
> On Mon, 03 Jun 2019 21:15:59 -0700, hv wrote:
> > On Mon, 03 Jun 2019 19:22:56 -0700, jkeenan wrote:
> > > #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
> > > pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680) at util.c:1711
> > > #14 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value
> > > optimized
> > > out> ) at util.c:1745
> > > #15 0x000000000044c57e in Perl_sys_term () at perl.c:146
> > > #16 0x0000000000421472 in main (argc=<value optimized out>,
> > > argv=<value optimized out>, env=0x7fffffffe770)
> > > at perlmain.c:155
> >
> > We're in sys_term, so most of the interpreter has already been wiped
> > out, but we're then calling croak(), which relies on the
> > interpreter's
> > infrastructure.
> [...]
> > I'll try harder to reproduce it: I'd need some interactive gdb work
> > to
> > diagnose it.
>
> I still cannot reproduce here. Please could you try getting another
> stacktrace with the attached patch applied, to narrow down which bit
> of Perl_sys_term() we're in?
>
> To help with subsequent tests, it would also be useful to narrow down
> the testcase. Please check if any of these reproduce the SEGV:
> ./perl -e 'my $x = sprintf("%7000000000E", 0)'
> ./perl -e 'eval { my $x = sprintf("%7000000000E", 0) }; print
> "ok\n"'
> the attached sprintf2-short.t, with all but the new test removed
>
(The previously reported backtrace was from a version of t/op/sprintf2.t with all tests but the final one commented out.)
See attachment, hugo-results-20190605.txt.
--
James E Keenan ([email protected])
Work on RT 134172 sprintf2.t on FreeBSD
06/05 07:53 am
1. Confirmed that I still have intact the DEBUGGING perl I built on nycbug host. Noted that at this point, while I'm in local branch rt-134172-sprintf2, that branch is identical to blead. Confirmed that t/op/sprintf2.t FAIL with segfault.
2. Copied Hugo's sprintf2-short.t into t/op. Ran it thru harness. Got FAIL.
3. Tried Hugo's two one-liners with my DEBUGGING build.
a. Hugo's first one-liner. (Note that this generates a warning.)
$ rm ./perl.core; ./perl -e 'my $x = sprintf("%7000000000E", 0)'; ls -ltr . | tail -1
Numeric format result too large at -e line 1.
Segmentation fault (core dumped)
-rw------- 1 jkeenan jkeenan 4816896 Jun 5 12:08 perl.core
b. Hugo's second one-liner. (Note that this does not generate a warning.)
$ rm ./perl.core; ./perl -e 'eval { my $x = sprintf("%7000000000E", 0) }; print "ok\n"'; ls -ltr . | tail -1
Segmentation fault (core dumped)
-rw------- 1 jkeenan jkeenan 4816896 Jun 5 12:09 perl.core
4. Applied Hugo's patch to branch; rebuilt perl with customary FreeBSD configuration switches plus '-DDEBUGGING'.
$ git clean -dfx
$ git am < ~/learn/perl/p5p/rt-134172-sprintf2/0001-DO-NOT-COMMIT-expand-PERL_SYS_TERM_BODY.patch
$ debugging_configure && make test_prep
$ $ ls ./perl.core t/perl.core
ls: ./perl.core: No such file or directory
ls: t/perl.core: No such file or directory
a. Re-run the test file and inspect the core dump.
#####
$ cd t;./perl harness -v op/sprintf2-short.t; cd -
ok 1 - croak for very large numeric format results
All 1 subtests passed
Test Summary Report
-------------------
op/sprintf2-short.t (Wstat: 139 Tests: 1 Failed: 0)
Non-zero wait status: 139
Files=1, Tests=1, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.02 cusr 0.02 csys = 0.06 CPU)
Result: FAIL
/usr/home/jkeenan/gitwork/perl
$ gdb perl t/perl.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `./perl -I.. -MTestInit op/sprintf2-short.t'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libthr.so.3...Reading symbols from /usr/lib/debug//lib/libthr.so.3.debug...done.
done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /usr/lib/libdl.so.1...Reading symbols from /usr/lib/debug//usr/lib/libdl.so.1.debug...done.
done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /lib/libm.so.5...Reading symbols from /usr/lib/debug//lib/libm.so.5.debug...done.
done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libcrypt.so.5...Reading symbols from /usr/lib/debug//lib/libcrypt.so.5.debug...done.
done.
Loaded symbols for /lib/libcrypt.so.5
Reading symbols from /lib/libutil.so.9...Reading symbols from /usr/lib/debug//lib/libutil.so.9.debug...done.
done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
1550 CX_LEAVE_SCOPE(cx);
(gdb) bt
#0 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
#1 0x0000000000454979 in S_my_exit_jump (my_perl=0x801e22000) at perl.c:5269
#2 0x00000000004581a7 in Perl_my_failure_exit (my_perl=0x801e22000) at perl.c:5256
#3 0x0000000000598882 in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1797
#4 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b", args=0x7fffffffe290)
at util.c:1711
#5 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#6 0x000000000061295b in PerlIOUnix_refcnt_inc (fd=0) at perlio.c:2309
#7 0x0000000000613338 in PerlIOUnix_open (my_perl=0x801e22000, self=0x81ddd0, layers=0x801e1b8d8, n=0,
mode=0x81e560 "r", fd=0, imode=0, perm=0, f=0x801eaba38, narg=0, args=0x0) at perlio.c:2607
#8 0x000000000061523c in PerlIOBuf_open (my_perl=0x801e22000, self=0x81db30, layers=0x801e1b8d8, n=1,
mode=<value optimized out>, fd=1, imode=0, perm=0, f=0x0, narg=0, args=0x0) at perlio.c:3897
#9 0x00000000006112d8 in PerlIO_openn (my_perl=0x801e22000, layers=<value optimized out>,
mode=<value optimized out>, fd=0, imode=<value optimized out>, perm=<value optimized out>, f=0x0, narg=0,
args=0x0) at perlio.c:1550
#10 0x00000000006101ae in PerlIO_stdstreams (my_perl=0x801e22000) at perlio.c:4928
#11 0x0000000000616f6d in Perl_PerlIO_stderr (my_perl=0x801e22000) at perlio.c:4884
#12 0x00000000004f4503 in Perl_write_to_stderr (my_perl=0x801e22000, msv=0x801e1f048) at util.c:1516
#13 0x000000000059887a in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1796
#14 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b", args=0x7fffffffe670)
at util.c:1711
#15 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#16 0x000000000044c596 in Perl_sys_term () at perl.c:152
#17 0x0000000000421472 in main (argc=<value optimized out>, argv=<value optimized out>, env=0x7fffffffe758)
at perlmain.c:155
$ rm t/perl.core
#####
b. Run Hugo's first one-liner and inspect the core dump.
#####
$ rm ./perl.core; ./perl -e 'my $x = sprintf("%7000000000E", 0)'; ls -ltr . | tail -1
rm: ./perl.core: No such file or directory
Numeric format result too large at -e line 1.
Segmentation fault (core dumped)
-rw------- 1 jkeenan jkeenan 4816896 Jun 5 12:26 perl.core
[perl] $ gdb perl ./perl.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `./perl -e my $x = sprintf("%7000000000E", 0)'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libthr.so.3...Reading symbols from /usr/lib/debug//lib/libthr.so.3.debug...done.
done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /usr/lib/libdl.so.1...Reading symbols from /usr/lib/debug//usr/lib/libdl.so.1.debug...done.
done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /lib/libm.so.5...Reading symbols from /usr/lib/debug//lib/libm.so.5.debug...done.
done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libcrypt.so.5...Reading symbols from /usr/lib/debug//lib/libcrypt.so.5.debug...done.
done.
Loaded symbols for /lib/libcrypt.so.5
Reading symbols from /lib/libutil.so.9...Reading symbols from /usr/lib/debug//lib/libutil.so.9.debug...done.
done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
1550 CX_LEAVE_SCOPE(cx);
(gdb) bt
#0 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
#1 0x0000000000454979 in S_my_exit_jump (my_perl=0x801e22000) at perl.c:5269
#2 0x00000000004581a7 in Perl_my_failure_exit (my_perl=0x801e22000) at perl.c:5256
#3 0x0000000000598882 in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1797
#4 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b", args=0x7fffffffe310)
at util.c:1711
#5 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#6 0x000000000061295b in PerlIOUnix_refcnt_inc (fd=0) at perlio.c:2309
#7 0x0000000000613338 in PerlIOUnix_open (my_perl=0x801e22000, self=0x81ddd0, layers=0x801e1b8d8, n=0,
mode=0x81e560 "r", fd=0, imode=0, perm=0, f=0x801ea9038, narg=0, args=0x0) at perlio.c:2607
#8 0x000000000061523c in PerlIOBuf_open (my_perl=0x801e22000, self=0x81db30, layers=0x801e1b8d8, n=1,
mode=<value optimized out>, fd=1, imode=0, perm=0, f=0x0, narg=0, args=0x0) at perlio.c:3897
#9 0x00000000006112d8 in PerlIO_openn (my_perl=0x801e22000, layers=<value optimized out>,
mode=<value optimized out>, fd=0, imode=<value optimized out>, perm=<value optimized out>, f=0x0, narg=0,
args=0x0) at perlio.c:1550
#10 0x00000000006101ae in PerlIO_stdstreams (my_perl=0x801e22000) at perlio.c:4928
#11 0x0000000000616f6d in Perl_PerlIO_stderr (my_perl=0x801e22000) at perlio.c:4884
#12 0x00000000004f4503 in Perl_write_to_stderr (my_perl=0x801e22000, msv=0x801e1f048) at util.c:1516
#13 0x000000000059887a in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1796
#14 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b", args=0x7fffffffe6f0)
at util.c:1711
#15 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#16 0x000000000044c596 in Perl_sys_term () at perl.c:152
---Type <return> to continue, or q <return> to quit---
#17 0x0000000000421472 in main (argc=<value optimized out>, argv=<value optimized out>, env=0x7fffffffe7d8)
at perlmain.c:155
#####
c. Run Hugo's second one-liner and inspect the core-dump.
#####
$ rm ./perl.core; ./perl -e 'eval { my $x = sprintf("%7000000000E", 0) }; print "ok\n"'; ls -ltr . | tail -1
ok
Segmentation fault (core dumped)
-rw------- 1 jkeenan jkeenan 4816896 Jun 5 12:27 perl.core
$ gdb perl ./perl.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `./perl -e eval { my $x = sprintf("%7000000000E", 0) }; print "ok\n"'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libthr.so.3...Reading symbols from /usr/lib/debug//lib/libthr.so.3.debug...done.
done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /usr/lib/libdl.so.1...Reading symbols from /usr/lib/debug//usr/lib/libdl.so.1.debug...done.
done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /lib/libm.so.5...Reading symbols from /usr/lib/debug//lib/libm.so.5.debug...done.
done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libcrypt.so.5...Reading symbols from /usr/lib/debug//lib/libcrypt.so.5.debug...done.
done.
Loaded symbols for /lib/libcrypt.so.5
Reading symbols from /lib/libutil.so.9...Reading symbols from /usr/lib/debug//lib/libutil.so.9.debug...done.
done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
1550 CX_LEAVE_SCOPE(cx);
(gdb) bt
#0 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
#1 0x0000000000454979 in S_my_exit_jump (my_perl=0x801e22000) at perl.c:5269
#2 0x00000000004581a7 in Perl_my_failure_exit (my_perl=0x801e22000) at perl.c:5256
#3 0x0000000000598882 in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1797
#4 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b", args=0x7fffffffe300)
at util.c:1711
#5 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#6 0x000000000061295b in PerlIOUnix_refcnt_inc (fd=0) at perlio.c:2309
#7 0x0000000000613338 in PerlIOUnix_open (my_perl=0x801e22000, self=0x81ddd0, layers=0x801e1b898, n=0,
mode=0x81e560 "r", fd=0, imode=0, perm=0, f=0x801ea9038, narg=0, args=0x0) at perlio.c:2607
#8 0x000000000061523c in PerlIOBuf_open (my_perl=0x801e22000, self=0x81db30, layers=0x801e1b898, n=1,
mode=<value optimized out>, fd=1, imode=0, perm=0, f=0x0, narg=0, args=0x0) at perlio.c:3897
#9 0x00000000006112d8 in PerlIO_openn (my_perl=0x801e22000, layers=<value optimized out>,
mode=<value optimized out>, fd=0, imode=<value optimized out>, perm=<value optimized out>, f=0x0, narg=0,
args=0x0) at perlio.c:1550
#10 0x00000000006101ae in PerlIO_stdstreams (my_perl=0x801e22000) at perlio.c:4928
#11 0x0000000000616f6d in Perl_PerlIO_stderr (my_perl=0x801e22000) at perlio.c:4884
#12 0x00000000004f4503 in Perl_write_to_stderr (my_perl=0x801e22000, msv=0x801e1f048) at util.c:1516
#13 0x000000000059887a in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1796
#14 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b", args=0x7fffffffe6e0)
at util.c:1711
#15 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#16 0x000000000044c596 in Perl_sys_term () at perl.c:152
#17 0x0000000000421472 in main (argc=<value optimized out>, argv=<value optimized out>, env=0x7fffffffe7c0)
at perlmain.c:155
#####
# James E Keenan <jkee...@cpa...>
Wed, 05 Jun 2019 05:45:02 -0700
On Wed, 05 Jun 2019 00:37:31 GMT, jkeenan wrote:
> On Tue, 04 Jun 2019 04:15:59 GMT, hv wrote:
> > On Mon, 03 Jun 2019 19:22:56 -0700, jkeenan wrote:
> > > On Tue, 04 Jun 2019 01:15:35 GMT, hv wrote:
> > > > On Mon, 03 Jun 2019 17:14:02 -0700, [email protected] wrote:
> > > > > The following smoke test indicates that a recent change to
> > > > > blead
> > > > > is
> > > > > causing failures in t/op/sprintf2.t on FreeBSD-11.
> > > > [...]
> > > > > op/sprintf2.t (Wstat: 2304 Tests: 1699 Failed: 0)
> > > > > Non-zero exit status: 9
> > > >
> > > > It's this non-zero exit status that's making it a fail. That's
> > > > reporting that it terminated with a SEGV.
> > >
> > > I haven't used gdb in years, so I'm fumbling around.
> > >
> > > Does this help?
> >
> > Well, it is exactly what I asked for.
> >
> > > #13 0x00000000004f4788 in Perl_vcroak (my_perl=0x801e22000,
> > > pat=0x801f56838 "\030 �\001\b", args=0x7fffffffe680) at util.c:1711
> > > #14 0x00000000004f1b27 in Perl_croak_nocontext (pat=<value
> > > optimized
> > > out> ) at util.c:1745
> > > #15 0x000000000044c57e in Perl_sys_term () at perl.c:146
> > > #16 0x0000000000421472 in main (argc=<value optimized out>,
> > > argv=<value optimized out>, env=0x7fffffffe770)
> > > at perlmain.c:155
> >
> > We're in sys_term, so most of the interpreter has already been wiped
> > out, but we're then calling croak(), which relies on the
> > interpreter's
> > infrastructure.
> >
> > My best guess is that we're hitting 'Perl_croak_nocontext("panic:
> > MUTEX_UNLOCK (%d) [%s:%d]" ...)' in one of the MUTEX_UNLOCK calls,
> > and
> > some wild writes have corrupted both one of our mutexes and perhaps
> > this panic string.
> >
> > It seems like it'd be hard for it to get this far with that level of
> > corruption though.
> >
> > I'll try harder to reproduce it: I'd need some interactive gdb work
> > to
> > diagnose it.
> >
> > Hugo
>
> This is now showing up on other OSes. See:
> http://perl5.test-
> smoke.org/submatrix?test=../t/op/sprintf2.t&pversion=5.31.1
But, to further aggravate us, the FAIL is showing up on our FreeBSD-11 smokers but not our FreeBSD-12 or -13 smoker.
FreeBSD-12.0-RELEASE-p4: http://perl5.test-smoke.org/report/89000 PASS
FreeBSD-13.0-CURRENT : http://perl5.test-smoke.org/report/88939 PASS
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Wed, 05 Jun 2019 06:22:36 -0700
On Wed, 05 Jun 2019 05:39:11 -0700, jkeenan wrote:
> (The previously reported backtrace was from a version of
> t/op/sprintf2.t with all tests but the final one commented out.)
Ah ok; since the first one-liner is enough to reproduce the crash, I suggest sticking with that for further testing:
./perl -e 'my $x = sprintf("%7000000000E", 0)'
[...]
> #16 0x000000000044c596 in Perl_sys_term () at perl.c:152
> #17 0x0000000000421472 in main (argc=<value optimized out>, argv=<value optimized out>, env=0x7fffffffe758)
> at perlmain.c:155
That looks like it is in the line:
LOCALE_TERM; /* PL_locale_mutex */
.. and since this is one of the mutex destroy calls that happens _after_ we've torn down the perlio system it's no surprise that attempting to croak() goes badly.
From here I'll need some help: I'm not sure how to refer to the location of this mutex for a threaded build in gdb - I was expecting it to be something like 'my_vars->Glocale_mutex', but at least on my local build that's not recognised.
If someone can tell us that, the next step will be to run the test in gdb with a hardware watchpoint on the mutex. That'll look something like this:
shell% gdb --args ./perl -e 'my $x = sprintf("%7000000000E", 0)'
(gdb) # first let the interpreter initialise
(gdb) tbreak perlmain.c:126
(gdb) run
Temporary breakpoint 1, main (argc=<optimized out>, argv=<optimized out>,
env=<optimized out>) at perlmain.c:126
126 if (!perl_parse(my_perl, xs_init, argc, argv, (char **)NULL))
(gdb) watch PL_locale_mutex # replace with appropriate expression to refer to this mutex
(gdb) run
Hopefully that would then stop somewhere other than in the MUTEX_DESTROY, and a stack trace will show us where and how the corruption is occurring.
Hugo
# James E Keenan <jkee...@cpa...>
Wed, 05 Jun 2019 06:32:12 -0700
On Wed, 05 Jun 2019 13:22:36 GMT, hv wrote:
> On Wed, 05 Jun 2019 05:39:11 -0700, jkeenan wrote:
> > (The previously reported backtrace was from a version of
> > t/op/sprintf2.t with all tests but the final one commented out.)
>
> Ah ok; since the first one-liner is enough to reproduce the crash, I
> suggest sticking with that for further testing:
> ./perl -e 'my $x = sprintf("%7000000000E", 0)'
>
> [...]
> > #16 0x000000000044c596 in Perl_sys_term () at perl.c:152
> > #17 0x0000000000421472 in main (argc=<value optimized out>,
> > argv=<value optimized out>, env=0x7fffffffe758)
> > at perlmain.c:155
>
> That looks like it is in the line:
> LOCALE_TERM; /* PL_locale_mutex */
> .. and since this is one of the mutex destroy calls that happens
> _after_ we've torn down the perlio system it's no surprise that
> attempting to croak() goes badly.
>
> From here I'll need some help: I'm not sure how to refer to the
> location of this mutex for a threaded build in gdb - I was expecting
> it to be something like 'my_vars->Glocale_mutex', but at least on my
> local build that's not recognised.
>
> If someone can tell us that, the next step will be to run the test in
> gdb with a hardware watchpoint on the mutex. That'll look something
> like this:
>
> shell% gdb --args ./perl -e 'my $x = sprintf("%7000000000E", 0)'
> (gdb) # first let the interpreter initialise
> (gdb) tbreak perlmain.c:126
> (gdb) run
> Temporary breakpoint 1, main (argc=<optimized out>, argv=<optimized
> out>,
> env=<optimized out>) at perlmain.c:126
> 126 if (!perl_parse(my_perl, xs_init, argc, argv, (char
> **)NULL))
> (gdb) watch PL_locale_mutex # replace with appropriate expression to
> refer to this mutex
> (gdb) run
>
> Hopefully that would then stop somewhere other than in the
> MUTEX_DESTROY, and a stack trace will show us where and how the
> corruption is occurring.
>
> Hugo
I'm not sure whether you wanted *me* to run the above commands prior to hearing from someone about mutexes, but, for what it's worth, here's what I got:
#####
[perl] $ gdb --args ./perl -e 'my $x = sprintf("%7000000000E", 0)'
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb)
(gdb) tbreak perlmain.c:126
Breakpoint 1 at 0x4213d6: file perlmain.c, line 126.
(gdb) run
Starting program: /usr/home/jkeenan/gitwork/perl/perl -e my\ \$x\ =\ sprintf\(\"%7000000000E\",\ 0\)
main (argc=3, argv=0x7fffffffe770, env=0x7fffffffe790) at perlmain.c:126
126 if (!perl_parse(my_perl, xs_init, argc, argv, (char **)NULL))
(gdb) watch PL_locale_mutex
Watchpoint 2: PL_locale_mutex
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/home/jkeenan/gitwork/perl/perl -e my\ \$x\ =\ sprintf\(\"%7000000000E\",\ 0\)
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Old value = 0x0
New value = 0x801e1c1e0
0x0000000800c6caec in __pthread_mutex_init (mutex=0xa45b28, mutex_attr=<value optimized out>) at pthread_md.h:94
94 {
Current language: auto; currently minimal
(gdb) bt
#0 0x0000000800c6caec in __pthread_mutex_init (mutex=0xa45b28, mutex_attr=<value optimized out>)
at pthread_md.h:94
#1 0x000000000044c702 in perl_alloc () at perl.c:97
#2 0x00000000004213a9 in main (argc=3, argv=0x7fffffffe770, env=0x7fffffffe790) at perlmain.c:119
(gdb)
#####
Thank you very much.
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Wed, 05 Jun 2019 07:22:10 -0700
On Wed, 05 Jun 2019 06:32:12 -0700, jkeenan wrote:
> I'm not sure whether you wanted *me* to run the above commands prior
> to hearing from someone about mutexes
I had not expected it would work, but since it did let's go with it. :) (For me that just gives 'No symbol "PL_locale_mutex" in current context'.)
> here's what I got:
>
> #####
> [perl] $ gdb --args ./perl -e 'my $x = sprintf("%7000000000E", 0)'
[...]
> Watchpoint 2: PL_locale_mutex
>
> Old value = 0x0
> New value = 0x801e1c1e0
> 0x0000000800c6caec in __pthread_mutex_init (mutex=0xa45b28,
> mutex_attr=<value optimized out>) at pthread_md.h:94
That's slightly odd, the intent of the tbreak was to get past the initialize (which the stack trace shows is invoked from perlmain.c:119). Never mind, please get to this point again, then continue with:
(gdb) cont
.. and let's see where it stops next, and get another stack trace.
Hugo
# James E Keenan <jkee...@cpa...>
Wed, 05 Jun 2019 07:54:26 -0700
On Wed, 05 Jun 2019 14:22:10 GMT, hv wrote:
> On Wed, 05 Jun 2019 06:32:12 -0700, jkeenan wrote:
> > I'm not sure whether you wanted *me* to run the above commands prior
> > to hearing from someone about mutexes
>
> I had not expected it would work, but since it did let's go with it.
> :) (For me that just gives 'No symbol "PL_locale_mutex" in current
> context'.)
>
> > here's what I got:
> >
> > #####
> > [perl] $ gdb --args ./perl -e 'my $x = sprintf("%7000000000E", 0)'
> [...]
> > Watchpoint 2: PL_locale_mutex
> >
> > Old value = 0x0
> > New value = 0x801e1c1e0
> > 0x0000000800c6caec in __pthread_mutex_init (mutex=0xa45b28,
> > mutex_attr=<value optimized out>) at pthread_md.h:94
>
> That's slightly odd, the intent of the tbreak was to get past the
> initialize (which the stack trace shows is invoked from
> perlmain.c:119). Never mind, please get to this point again, then
> continue with:
>
> (gdb) cont
>
> .. and let's see where it stops next, and get another stack trace.
>
> Hugo
After 'cont', the program hung for > 1 minute, so I Ctrl-C-ed it.
#####
...
Watchpoint 2: PL_locale_mutex
Watchpoint 2: PL_locale_mutex
Old value = 0x0
New value = 0x801e1c1e0
0x0000000800c6caec in __pthread_mutex_init (mutex=0xa45b28,
---Type <return> to continue, or q <return> to quit---
mutex_attr=<value optimized out>) at pthread_md.h:94
94 {
Current language: auto; currently minimal
(gdb) cont
Continuing.
^C
Program received signal SIGINT, Interrupt.
__vdso_gettc (th=0x7ffffffff270, tc=0x7fffffffe53c) at cpufunc.h:326
326 __asm __volatile("lfence" : : : "memory");
(gdb) bt
#0 __vdso_gettc (th=0x7ffffffff270, tc=0x7fffffffe53c) at cpufunc.h:326
#1 0x000000080183e729 in binuptime (bt=0x7fffffffe580, tk=0x7ffffffff1b0,
abs=1) at /usr/src/lib/libc/sys/__vdso_gettimeofday.c:43
#2 0x000000080183e818 in __vdso_clock_gettime (clock_id=13, ts=0x7fffffffe5e8)
at /usr/src/lib/libc/sys/__vdso_gettimeofday.c:149
#3 0x0000000801827a31 in __clock_gettime (clock_id=13, ts=0x7fffffffe5e8)
at /usr/src/lib/libc/sys/clock_gettime.c:46
#4 0x00000008018271da in time (t=0x801e225a8)
at /usr/src/lib/libc/gen/time.c:45
#5 0x000000000044fd6f in perl_parse (my_perl=0x801e22000,
xsinit=0x421490 <xs_init>, argc=<value optimized out>, argv=0x306,
env=<value optimized out>) at perl.c:1822
#6 0x00000000004213ea in main (argc=<value optimized out>,
argv=<value optimized out>, env=0x7fffffffe7a0) at perlmain.c:126
#####
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Wed, 05 Jun 2019 08:26:36 -0700
On Wed, 05 Jun 2019 07:54:26 -0700, jkeenan wrote:
> After 'cont', the program hung for > 1 minute, so I Ctrl-C-ed it.
Watchpoints can make things rather slow; you'll see from the stack trace that we're still in perl_parse(), so we haven't got as far as running the program yet. My guess is that it should complete in 10-15 minutes, but if possible please allow at least up to an hour before giving up on it.
Hugo
# James E Keenan <jkee...@cpa...>
Wed, 05 Jun 2019 08:40:23 -0700
On Wed, 05 Jun 2019 15:26:36 GMT, hv wrote:
> On Wed, 05 Jun 2019 07:54:26 -0700, jkeenan wrote:
> > After 'cont', the program hung for > 1 minute, so I Ctrl-C-ed it.
>
> Watchpoints can make things rather slow; you'll see from the stack
> trace that we're still in perl_parse(), so we haven't got as far as
> running the program yet. My guess is that it should complete in 10-15
> minutes, but if possible please allow at least up to an hour before
> giving up on it.
>
Did not know that. Have to go AFK. Will resume tonight. Thanks.
--
James E Keenan ([email protected])
# karl williamson <publ...@khw...>
Wed, 05 Jun 2019 10:45:50 -0700
On 6/5/19 9:40 AM, James E Keenan via RT wrote:
> On Wed, 05 Jun 2019 15:26:36 GMT, hv wrote:
>> On Wed, 05 Jun 2019 07:54:26 -0700, jkeenan wrote:
>>> After 'cont', the program hung for > 1 minute, so I Ctrl-C-ed it.
>>
>> Watchpoints can make things rather slow; you'll see from the stack
>> trace that we're still in perl_parse(), so we haven't got as far as
>> running the program yet. My guess is that it should complete in 10-15
>> minutes, but if possible please allow at least up to an hour before
>> giving up on it.
>>
>
> Did not know that. Have to go AFK. Will resume tonight. Thanks.
>
I have access to a 11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri Jul
21 14:27:25 UTC 2017
When I run ./perl -e 'my $x = sprintf("%7000000000E", 0)'
I get
Integer overflow in format string for sprintf at -e line 1.
# James E Keenan <jkee...@cpa...>
Thu, 06 Jun 2019 04:37:28 -0700
On Wed, 05 Jun 2019 15:26:36 GMT, hv wrote:
> On Wed, 05 Jun 2019 07:54:26 -0700, jkeenan wrote:
> > After 'cont', the program hung for > 1 minute, so I Ctrl-C-ed it.
>
> Watchpoints can make things rather slow; you'll see from the stack
> trace that we're still in perl_parse(), so we haven't got as far as
> running the program yet. My guess is that it should complete in 10-15
> minutes, but if possible please allow at least up to an hour before
> giving up on it.
>
> Hugo
I let it run overnight. No difference in results.
#####
Continuing.
^C [killed at 07:35 next day]
Program received signal SIGINT, Interrupt.
__vdso_gettc (th=<value optimized out>, tc=0x7fffffffe10c)
at /usr/src/lib/libc/x86/sys/__vdso_gettc.c:325
325 }
bt
#0 __vdso_gettc (th=<value optimized out>, tc=0x7fffffffe10c)
at /usr/src/lib/libc/x86/sys/__vdso_gettc.c:325
#1 0x000000080183e729 in binuptime (bt=0x7fffffffe150, tk=0x7ffffffff1b0, abs=1)
at /usr/src/lib/libc/sys/__vdso_gettimeofday.c:43
#2 0x000000080183e818 in __vdso_clock_gettime (clock_id=13, ts=0x7fffffffe1b8)
at /usr/src/lib/libc/sys/__vdso_gettimeofday.c:149
#3 0x0000000801827a31 in __clock_gettime (clock_id=13, ts=0x7fffffffe1b8)
at /usr/src/lib/libc/sys/clock_gettime.c:46
#4 0x00000008018271da in time (t=0x801e225a8) at /usr/src/lib/libc/gen/time.c:45
#5 0x000000000044fd6f in perl_parse (my_perl=0x801e22000, xsinit=0x421490 <xs_init>,
argc=<value optimized out>, argv=0x306, env=<value optimized out>) at perl.c:1822
#6 0x00000000004213ea in main (argc=<value optimized out>, argv=<value optimized out>,
env=0x7fffffffe370) at perlmain.c:126
#####
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Thu, 06 Jun 2019 07:17:46 -0700
On Thu, 06 Jun 2019 04:37:28 -0700, jkeenan wrote:
> I let it run overnight. No difference in results.
Oh, that's sad, it's clearly hanging in that time() call for some reason.
Let's try a different approach, closer to the likely source of the problem. Line 12950 of sv.c should be "float_need = 1 /* possible unary minus */", the point we start handling the large number in the sprintf pattern. Check that this is the right line, and thn try the following:
shell% gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
(gdb) break sv.c:12950
(gdb) run
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=0xd43260, sv=0xd72e78,
pat=0xd76368 "%7000000000E", patlen=12, args=0x0, svargs=0xd48b18,
sv_count=1, maybe_tainted=0x7fffffffd2ff, flags=0) at sv.c:12950
12950 float_need = 1 /* possible unary minus */
and then:
(gdb) disp PL_locale_mutex
(gdb) disp *PL_locale_mutex
(gdb) next
.. and repeat 'next' about 15 times until the Perl_croak() call. If the displayed values haven't changed by this point, try 'cont' to see if they do change by the time we hit the SEGV.
Hugo
# James E Keenan <jkee...@cpa...>
Thu, 06 Jun 2019 08:12:34 -0700
On Thu, 06 Jun 2019 14:17:46 GMT, hv wrote:
> On Thu, 06 Jun 2019 04:37:28 -0700, jkeenan wrote:
> > I let it run overnight. No difference in results.
>
> Oh, that's sad, it's clearly hanging in that time() call for some
> reason.
>
> Let's try a different approach, closer to the likely source of the
> problem. Line 12950 of sv.c should be "float_need = 1 /* possible
> unary minus */", the point we start handling the large number in the
> sprintf pattern. Check that this is the right line, and thn try the
> following:
>
> shell% gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> (gdb) break sv.c:12950
> (gdb) run
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=0xd43260, sv=0xd72e78,
> pat=0xd76368 "%7000000000E", patlen=12, args=0x0,
> svargs=0xd48b18,
> sv_count=1, maybe_tainted=0x7fffffffd2ff, flags=0) at sv.c:12950
> 12950 float_need = 1 /* possible unary minus */
>
> and then:
> (gdb) disp PL_locale_mutex
> (gdb) disp *PL_locale_mutex
> (gdb) next
> .. and repeat 'next' about 15 times until the Perl_croak() call. If
> the displayed values haven't changed by this point, try 'cont' to see
> if they do change by the time we hit the SEGV.
>
> Hugo
#####
$ ./perl -Ilib -V:config_args
config_args='-des -Dusedevel -Duseithreads -Doptimize=-O2 -pipe -fstack-protector -fno-strict-aliasing -DDEBUGGING';
[perl] $ gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
GNU gdb 6.1.1 [FreeBSD]
[snip]
(gdb) break sv.c:12950
Breakpoint 1 at 0x557e4e: file sv.c, line 12950.
(gdb) run
Starting program: /usr/home/jkeenan/gitwork/perl/perl -we my\ \$x\ =\ sprintf\(\"%7000000000E\",\ 0\)
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value optimized out>,
pat=<value optimized out>, patlen=<value optimized out>, args=0x0, svargs=0x801e16528,
sv_count=1, maybe_tainted=0x801c02200, flags=0) at sv.c:12967
12967 if (!lc_numeric_set) {
(gdb) disp PL_locale_mutex
1: PL_locale_mutex = 0x801e1c1e0
(gdb) disp *PL_locale_mutex
2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings = 0x801e1c1e8, m_rb_lnk = 0,
m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0, m_yieldloops = 0, m_ps = 0,
m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0, tqe_prev = 0x0}, m_rb_prev = 0x0}
(gdb) next
13092 Perl_croak(aTHX_ "Numeric format result too large");
2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings = 0x801e1c1e8, m_rb_lnk = 0,
m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0, m_yieldloops = 0, m_ps = 0,
m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0, tqe_prev = 0x0}, m_rb_prev = 0x0}
1: PL_locale_mutex = 0x801e1c1e0
(gdb) next
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value optimized out>,
pat=<value optimized out>, patlen=<value optimized out>, args=0x0, svargs=0x801e16530,
sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:12967
12967 if (!lc_numeric_set) {
2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings = 0x801e1c1e8, m_rb_lnk = 0,
m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0, m_yieldloops = 0, m_ps = 0,
m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0, tqe_prev = 0x0}, m_rb_prev = 0x0}
1: PL_locale_mutex = 0x801e1c1e0
(gdb) next
12972 STORE_LC_NUMERIC_SET_TO_NEEDED();
2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings = 0x801e1c1e8, m_rb_lnk = 0,
m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0, m_yieldloops = 0, m_ps = 0,
m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0, tqe_prev = 0x0}, m_rb_prev = 0x0}
1: PL_locale_mutex = 0x801e1c1e0
(gdb) next
13092 Perl_croak(aTHX_ "Numeric format result too large");
2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings = 0x801e1c1e8, m_rb_lnk = 0,
m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0, m_yieldloops = 0, m_ps = 0,
m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0, tqe_prev = 0x0}, m_rb_prev = 0x0}
1: PL_locale_mutex = 0x801e1c1e0
(gdb) next
Numeric format result too large at -e line 1.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
1550 CX_LEAVE_SCOPE(cx);
2: *PL_locale_mutex = Error accessing memory address 0x2: Bad address.
Disabling display 2 to avoid infinite recursion.
#####
--
James E Keenan ([email protected])
# James E Keenan <jkee...@cpa...>
Thu, 06 Jun 2019 08:16:29 -0700
On Thu, 06 Jun 2019 15:12:34 GMT, jkeenan wrote:
> On Thu, 06 Jun 2019 14:17:46 GMT, hv wrote:
> > On Thu, 06 Jun 2019 04:37:28 -0700, jkeenan wrote:
> > > I let it run overnight. No difference in results.
> >
> > Oh, that's sad, it's clearly hanging in that time() call for some
> > reason.
> >
> > Let's try a different approach, closer to the likely source of the
> > problem. Line 12950 of sv.c should be "float_need = 1 /* possible
> > unary minus */", the point we start handling the large number in the
> > sprintf pattern. Check that this is the right line, and thn try the
> > following:
> >
> > shell% gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> > (gdb) break sv.c:12950
> > (gdb) run
> > Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=0xd43260, sv=0xd72e78,
> > pat=0xd76368 "%7000000000E", patlen=12, args=0x0,
> > svargs=0xd48b18,
> > sv_count=1, maybe_tainted=0x7fffffffd2ff, flags=0) at sv.c:12950
> > 12950 float_need = 1 /* possible unary minus */
> >
> > and then:
> > (gdb) disp PL_locale_mutex
> > (gdb) disp *PL_locale_mutex
> > (gdb) next
> > .. and repeat 'next' about 15 times until the Perl_croak() call. If
> > the displayed values haven't changed by this point, try 'cont' to see
> > if they do change by the time we hit the SEGV.
> >
> > Hugo
>
> #####
> $ ./perl -Ilib -V:config_args
> config_args='-des -Dusedevel -Duseithreads -Doptimize=-O2 -pipe
> -fstack-protector -fno-strict-aliasing -DDEBUGGING';
> [perl] $ gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> GNU gdb 6.1.1 [FreeBSD]
> [snip]
> (gdb) break sv.c:12950
> Breakpoint 1 at 0x557e4e: file sv.c, line 12950.
> (gdb) run
> Starting program: /usr/home/jkeenan/gitwork/perl/perl -we my\ \$x\ =\
> sprintf\(\"%7000000000E\",\ 0\)
>
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>,
> sv=<value optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16528,
> sv_count=1, maybe_tainted=0x801c02200, flags=0) at sv.c:12967
> 12967 if (!lc_numeric_set) {
> (gdb) disp PL_locale_mutex
> 1: PL_locale_mutex = 0x801e1c1e0
> (gdb) disp *PL_locale_mutex
> 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings
> = 0x801e1c1e8, m_rb_lnk = 0,
> m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0,
> m_yieldloops = 0, m_ps = 0,
> m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0,
> tqe_prev = 0x0}, m_rb_prev = 0x0}
> (gdb) next
> 13092 Perl_croak(aTHX_ "Numeric format result too
> large");
> 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings
> = 0x801e1c1e8, m_rb_lnk = 0,
> m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0,
> m_yieldloops = 0, m_ps = 0,
> m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0,
> tqe_prev = 0x0}, m_rb_prev = 0x0}
> 1: PL_locale_mutex = 0x801e1c1e0
> (gdb) next
>
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>,
> sv=<value optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16530,
> sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:12967
> 12967 if (!lc_numeric_set) {
> 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings
> = 0x801e1c1e8, m_rb_lnk = 0,
> m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0,
> m_yieldloops = 0, m_ps = 0,
> m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0,
> tqe_prev = 0x0}, m_rb_prev = 0x0}
> 1: PL_locale_mutex = 0x801e1c1e0
> (gdb) next
> 12972 STORE_LC_NUMERIC_SET_TO_NEEDED();
> 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings
> = 0x801e1c1e8, m_rb_lnk = 0,
> m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0,
> m_yieldloops = 0, m_ps = 0,
> m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0,
> tqe_prev = 0x0}, m_rb_prev = 0x0}
> 1: PL_locale_mutex = 0x801e1c1e0
> (gdb) next
> 13092 Perl_croak(aTHX_ "Numeric format result too
> large");
> 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings
> = 0x801e1c1e8, m_rb_lnk = 0,
> m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops = 0,
> m_yieldloops = 0, m_ps = 0,
> m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0,
> tqe_prev = 0x0}, m_rb_prev = 0x0}
> 1: PL_locale_mutex = 0x801e1c1e0
> (gdb) next
> Numeric format result too large at -e line 1.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at
> pp_ctl.c:1550
> 1550 CX_LEAVE_SCOPE(cx);
> 2: *PL_locale_mutex = Error accessing memory address 0x2: Bad address.
> Disabling display 2 to avoid infinite recursion.
> #####
Followed by:
#####
(gdb) bt
#0 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at pp_ctl.c:1550
#1 0x0000000000454979 in S_my_exit_jump (my_perl=0x801e22000) at perl.c:5269
#2 0x00000000004581a7 in Perl_my_failure_exit (my_perl=0x801e22000) at perl.c:5256
#3 0x0000000000598882 in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1797
#4 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b",
args=0x7fffffffe2e0) at util.c:1711
#5 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#6 0x000000000061295b in PerlIOUnix_refcnt_inc (fd=0) at perlio.c:2309
#7 0x0000000000613338 in PerlIOUnix_open (my_perl=0x801e22000, self=0x81ddd0, layers=0x801e1b8d8,
n=0, mode=0x81e560 "r", fd=0, imode=0, perm=0, f=0x801ea9038, narg=0, args=0x0) at perlio.c:2607
#8 0x000000000061523c in PerlIOBuf_open (my_perl=0x801e22000, self=0x81db30, layers=0x801e1b8d8,
n=1, mode=<value optimized out>, fd=1, imode=0, perm=0, f=0x0, narg=0, args=0x0) at perlio.c:3897
#9 0x00000000006112d8 in PerlIO_openn (my_perl=0x801e22000, layers=<value optimized out>,
mode=<value optimized out>, fd=0, imode=<value optimized out>, perm=<value optimized out>, f=0x0,
narg=0, args=0x0) at perlio.c:1550
#10 0x00000000006101ae in PerlIO_stdstreams (my_perl=0x801e22000) at perlio.c:4928
#11 0x0000000000616f6d in Perl_PerlIO_stderr (my_perl=0x801e22000) at perlio.c:4884
#12 0x00000000004f4503 in Perl_write_to_stderr (my_perl=0x801e22000, msv=0x801e1f048) at util.c:1516
#13 0x000000000059887a in Perl_die_unwind (my_perl=0x801e22000, msv=0x801e1f048) at pp_ctl.c:1796
#14 0x00000000004f4828 in Perl_vcroak (my_perl=0x801e22000, pat=0x801e1f048 "\030 �\001\b",
args=0x7fffffffe6c0) at util.c:1711
#15 0x00000000004f1bc7 in Perl_croak_nocontext (pat=<value optimized out>) at util.c:1745
#16 0x000000000044c596 in Perl_sys_term () at perl.c:152
#17 0x0000000000421472 in main (argc=<value optimized out>, argv=<value optimized out>,
env=0x7fffffffe7a0) at perlmain.c:155
#####
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Thu, 06 Jun 2019 08:47:02 -0700
On Thu, 06 Jun 2019 08:16:29 -0700, jkeenan wrote:
> > (gdb) next
> > 13092 Perl_croak(aTHX_ "Numeric format result too
> > large");
> > 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0, m_ceilings
> > = 0x801e1c1e8, m_rb_lnk = 0,
> > m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops =
> > 0,
> > m_yieldloops = 0, m_ps = 0,
> > m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0,
> > tqe_prev = 0x0}, m_rb_prev = 0x0}
> > 1: PL_locale_mutex = 0x801e1c1e0
Ok, so at this point it still looks valid ...
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1) at
> > pp_ctl.c:1550
> > 1550 CX_LEAVE_SCOPE(cx);
> > 2: *PL_locale_mutex = Error accessing memory address 0x2: Bad
> > address.
> > Disabling display 2 to avoid infinite recursion.
.. but by the time we get here it has been corrupted to 0x2.
I think it's worth another go at the 'watch', but if that fails we may have to step through with 'next' and 'step' until the displayed value changes. If we set the watchpoint later, it may work better:
shell% gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
(gdb) break sv.c:13092
(gdb) run
.. break at sv.c:13092 (the croak call)
(gdb) watch PL_locale_mutex
(gdb) cont
As before, it should not take more than a few minutes after the 'cont'.
Hugo
# James E Keenan <jkee...@cpa...>
Thu, 06 Jun 2019 08:54:29 -0700
On Thu, 06 Jun 2019 15:47:02 GMT, hv wrote:
> On Thu, 06 Jun 2019 08:16:29 -0700, jkeenan wrote:
> > > (gdb) next
> > > 13092 Perl_croak(aTHX_ "Numeric format result too
> > > large");
> > > 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0,
> > > m_ceilings
> > > = 0x801e1c1e8, m_rb_lnk = 0,
> > > m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops =
> > > 0,
> > > m_yieldloops = 0, m_ps = 0,
> > > m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next = 0x0,
> > > tqe_prev = 0x0}, m_rb_prev = 0x0}
> > > 1: PL_locale_mutex = 0x801e1c1e0
>
> Ok, so at this point it still looks valid ...
>
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-1)
> > > at
> > > pp_ctl.c:1550
> > > 1550 CX_LEAVE_SCOPE(cx);
> > > 2: *PL_locale_mutex = Error accessing memory address 0x2: Bad
> > > address.
> > > Disabling display 2 to avoid infinite recursion.
>
> .. but by the time we get here it has been corrupted to 0x2.
>
> I think it's worth another go at the 'watch', but if that fails we may
> have to step through with 'next' and 'step' until the displayed value
> changes. If we set the watchpoint later, it may work better:
>
> shell% gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> (gdb) break sv.c:13092
> (gdb) run
> .. break at sv.c:13092 (the croak call)
> (gdb) watch PL_locale_mutex
> (gdb) cont
>
> As before, it should not take more than a few minutes after the
> 'cont'.
>
> Hugo
'cont' returned within 1 second.
#####
$ gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb) break sv.c:13092
Breakpoint 1 at 0x55b2f0: file sv.c, line 13092.
(gdb) run
Starting program: /usr/home/jkeenan/gitwork/perl/perl -we my\ \$x\ =\ sprintf\(\"%7000000000E\",\ 0\)
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value optimized out>,
pat=<value optimized out>, patlen=<value optimized out>, args=0x0, svargs=0x801e16528,
sv_count=1, maybe_tainted=0x801c02200, flags=0) at sv.c:13092
13092 Perl_croak(aTHX_ "Numeric format result too large");
(gdb) watch PL_locale_mutex
Watchpoint 2: PL_locale_mutex
(gdb) cont
Continuing.
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value optimized out>,
pat=<value optimized out>, patlen=<value optimized out>, args=0x0, svargs=0x801e16530,
sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
13092 Perl_croak(aTHX_ "Numeric format result too large");
(gdb) bt
#0 Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value optimized out>,
pat=<value optimized out>, patlen=<value optimized out>, args=0x0, svargs=0x801e16530,
sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
#1 0x0000000000553045 in Perl_sv_vsetpvfn (my_perl=0x801e22000, sv=0x801e98610,
pat=0x801e20128 "%7000000000E", patlen=<value optimized out>, args=<value optimized out>,
svargs=0x801e16530, sv_count=1, maybe_tainted=0x7fffffffe577) at sv.c:10984
#2 0x00000000005c4f31 in Perl_do_sprintf (my_perl=0x801e22000, sv=0x801e98610,
len=<value optimized out>, sarg=<value optimized out>) at doop.c:734
#3 0x0000000000579ac0 in Perl_pp_sprintf (my_perl=0x801e22000) at pp.c:3559
#4 0x00000000004f0340 in Perl_runops_debug (my_perl=0x801e22000) at dump.c:2537
#5 0x0000000000453464 in S_run_body (my_perl=0x801e22000, oldscope=1) at inline.h:65
#6 0x00000000004532d9 in perl_run (my_perl=0x801e22000) at perl.c:2646
#7 0x00000000004213fa in main (argc=<value optimized out>, argv=<value optimized out>,
env=0x7fffffffe7a0) at perlmain.c:127
#####
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Thu, 06 Jun 2019 09:02:45 -0700
On Thu, 06 Jun 2019 08:54:29 -0700, jkeenan wrote:
> 'cont' returned within 1 second.
[...]
> (gdb) cont
> Continuing.
>
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>,
> sv=<value optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16530,
> sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
> 13092 Perl_croak(aTHX_ "Numeric format result too
> large");
Sorry, that has not progressed; please repeat the 'cont' until it says something about changing the watched value - it should look something like:
Hardware watchpoint 1: PL_locale_mutex
Old value = (pthread_mutex_t) 0x[some value]
New value = (pthread_mutex_t) 0x2
Hugo
# James E Keenan <jkee...@cpa...>
Thu, 06 Jun 2019 10:30:58 -0700
On Thu, 06 Jun 2019 15:54:29 GMT, jkeenan wrote:
> On Thu, 06 Jun 2019 15:47:02 GMT, hv wrote:
> > On Thu, 06 Jun 2019 08:16:29 -0700, jkeenan wrote:
> > > > (gdb) next
> > > > 13092 Perl_croak(aTHX_ "Numeric format result
> > > > too
> > > > large");
> > > > 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0,
> > > > m_ceilings
> > > > = 0x801e1c1e8, m_rb_lnk = 0,
> > > > m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops
> > > > =
> > > > 0,
> > > > m_yieldloops = 0, m_ps = 0,
> > > > m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next =
> > > > 0x0,
> > > > tqe_prev = 0x0}, m_rb_prev = 0x0}
> > > > 1: PL_locale_mutex = 0x801e1c1e0
> >
> > Ok, so at this point it still looks valid ...
> >
> > > > Program received signal SIGSEGV, Segmentation fault.
> > > > 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-
> > > > 1)
> > > > at
> > > > pp_ctl.c:1550
> > > > 1550 CX_LEAVE_SCOPE(cx);
> > > > 2: *PL_locale_mutex = Error accessing memory address 0x2: Bad
> > > > address.
> > > > Disabling display 2 to avoid infinite recursion.
> >
> > .. but by the time we get here it has been corrupted to 0x2.
> >
> > I think it's worth another go at the 'watch', but if that fails we
> > may
> > have to step through with 'next' and 'step' until the displayed value
> > changes. If we set the watchpoint later, it may work better:
> >
> > shell% gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> > (gdb) break sv.c:13092
> > (gdb) run
> > .. break at sv.c:13092 (the croak call)
> > (gdb) watch PL_locale_mutex
> > (gdb) cont
> >
> > As before, it should not take more than a few minutes after the
> > 'cont'.
> >
> > Hugo
>
> 'cont' returned within 1 second.
>
> #####
> $ gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "amd64-marcel-freebsd"...
> (gdb) break sv.c:13092
> Breakpoint 1 at 0x55b2f0: file sv.c, line 13092.
> (gdb) run
> Starting program: /usr/home/jkeenan/gitwork/perl/perl -we my\ \$x\ =\
> sprintf\(\"%7000000000E\",\ 0\)
>
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>,
> sv=<value optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16528,
> sv_count=1, maybe_tainted=0x801c02200, flags=0) at sv.c:13092
> 13092 Perl_croak(aTHX_ "Numeric format result too
> large");
> (gdb) watch PL_locale_mutex
> Watchpoint 2: PL_locale_mutex
> (gdb) cont
> Continuing.
>
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>,
> sv=<value optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16530,
> sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
> 13092 Perl_croak(aTHX_ "Numeric format result too
> large");
> (gdb) bt
> #0 Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value
> optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16530,
> sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
> #1 0x0000000000553045 in Perl_sv_vsetpvfn (my_perl=0x801e22000,
> sv=0x801e98610,
> pat=0x801e20128 "%7000000000E", patlen=<value optimized out>,
> args=<value optimized out>,
> svargs=0x801e16530, sv_count=1, maybe_tainted=0x7fffffffe577) at
> sv.c:10984
> #2 0x00000000005c4f31 in Perl_do_sprintf (my_perl=0x801e22000,
> sv=0x801e98610,
> len=<value optimized out>, sarg=<value optimized out>) at
> doop.c:734
> #3 0x0000000000579ac0 in Perl_pp_sprintf (my_perl=0x801e22000) at
> pp.c:3559
> #4 0x00000000004f0340 in Perl_runops_debug (my_perl=0x801e22000) at
> dump.c:2537
> #5 0x0000000000453464 in S_run_body (my_perl=0x801e22000, oldscope=1)
> at inline.h:65
> #6 0x00000000004532d9 in perl_run (my_perl=0x801e22000) at
> perl.c:2646
> #7 0x00000000004213fa in main (argc=<value optimized out>,
> argv=<value optimized out>,
> env=0x7fffffffe7a0) at perlmain.c:127
> #####
#####
$ gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
[snip]
(gdb) break sv.c:13092
Breakpoint 1 at 0x55b2f0: file sv.c, line 13092.
(gdb) run
Starting program: /usr/home/jkeenan/gitwork/perl/perl -we my\ \$x\ =\ sprintf\(\"%7000000000E\",\ 0\)
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value optimized out>,
pat=<value optimized out>, patlen=<value optimized out>, args=0x0, svargs=0x801e16528,
sv_count=1, maybe_tainted=0x801c02200, flags=0) at sv.c:13092
13092 Perl_croak(aTHX_ "Numeric format result too large");
(gdb) watch PL_locale_mutex
Watchpoint 2: PL_locale_mutex
(gdb) cont
Continuing.
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value optimized out>,
pat=<value optimized out>, patlen=<value optimized out>, args=0x0, svargs=0x801e16530,
sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
13092 Perl_croak(aTHX_ "Numeric format result too large");
(gdb) cont
Continuing.
Numeric format result too large at -e line 1.
Watchpoint 2: PL_locale_mutex
Old value = 0x801e1c1e0
New value = 0x2
_pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
95 return (TCB_GET64(tcb_thread));
Current language: auto; currently minimal
(gdb) bt
#0 _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
#1 0x000000000044c554 in Perl_sys_term () at perl.c:152
#2 0x0000000000421472 in main (argc=<value optimized out>, argv=<value optimized out>,
env=0x7fffffffe7a0) at perlmain.c:155
#####
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Thu, 06 Jun 2019 17:00:13 -0700
On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
> Watchpoint 2: PL_locale_mutex
>
> Old value = 0x801e1c1e0
> New value = 0x2
> _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> 95 return (TCB_GET64(tcb_thread));
> Current language: auto; currently minimal
> (gdb) bt
> #0 _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> #1 0x000000000044c554 in Perl_sys_term () at perl.c:152
> #2 0x0000000000421472 in main (argc=<value optimized out>,
> argv=<value optimized out>,
> env=0x7fffffffe7a0) at perlmain.c:155
Ok, we're getting somewhere - somewhere weird, but somewhere.
Please could you post the extract from pthread_md.h around line 95, presumably a #define or inline function for _pthread_mutex_destroy? If you're also able to find the definition for TCB_GET64 somewhere in the include files (most likely under /usr/include) that would help too.
Hugo
# James E Keenan <jkee...@cpa...>
Thu, 06 Jun 2019 17:25:06 -0700
On Fri, 07 Jun 2019 00:00:13 GMT, hv wrote:
> On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
> > Watchpoint 2: PL_locale_mutex
> >
> > Old value = 0x801e1c1e0
> > New value = 0x2
> > _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> > 95 return (TCB_GET64(tcb_thread));
> > Current language: auto; currently minimal
> > (gdb) bt
> > #0 _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> > #1 0x000000000044c554 in Perl_sys_term () at perl.c:152
> > #2 0x0000000000421472 in main (argc=<value optimized out>,
> > argv=<value optimized out>,
> > env=0x7fffffffe7a0) at perlmain.c:155
>
> Ok, we're getting somewhere - somewhere weird, but somewhere.
>
> Please could you post the extract from pthread_md.h around line 95,
> presumably a #define or inline function for _pthread_mutex_destroy?
# /usr/src/lib/libthr/arch/amd64/include/pthread_md.h
#####
92 static __inline struct pthread *
93 _get_curthread(void)
94 {
95 return (TCB_GET64(tcb_thread));
96 }
#####
> If you're also able to find the definition for TCB_GET64 somewhere in the
> include files (most likely under /usr/include) that would help too.
>
# /usr/src/lib/libthr/arch/amd64/include/pthread_md.h
#####
65 /*
66 * Evaluates to the value of the per-tcb variable name.
67 */
68 #define TCB_GET64(name) ({ \
69 __tcb_type(name) __result; \
70 \
71 u_long __i; \
72 __asm __volatile("movq %%fs:%1, %0" \
73 : "=r" (__i) \
74 : "m" (*(volatile u_long *)(__tcb_offset(name)))); \
75 __result = (__tcb_type(name))__i; \
76 \
77 __result; \
78 })
#####
--
James E Keenan ([email protected])
# James E Keenan <jkee...@cpa...>
Sat, 08 Jun 2019 15:11:16 -0700
On Fri, 07 Jun 2019 00:25:06 GMT, jkeenan wrote:
> On Fri, 07 Jun 2019 00:00:13 GMT, hv wrote:
> > On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
> > > Watchpoint 2: PL_locale_mutex
> > >
> > > Old value = 0x801e1c1e0
> > > New value = 0x2
> > > _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> > > 95 return (TCB_GET64(tcb_thread));
> > > Current language: auto; currently minimal
> > > (gdb) bt
> > > #0 _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> > > #1 0x000000000044c554 in Perl_sys_term () at perl.c:152
> > > #2 0x0000000000421472 in main (argc=<value optimized out>,
> > > argv=<value optimized out>,
> > > env=0x7fffffffe7a0) at perlmain.c:155
> >
> > Ok, we're getting somewhere - somewhere weird, but somewhere.
> >
> > Please could you post the extract from pthread_md.h around line 95,
> > presumably a #define or inline function for _pthread_mutex_destroy?
>
> # /usr/src/lib/libthr/arch/amd64/include/pthread_md.h
> #####
> 92 static __inline struct pthread *
> 93 _get_curthread(void)
> 94 {
> 95 return (TCB_GET64(tcb_thread));
> 96 }
> #####
>
> > If you're also able to find the definition for TCB_GET64 somewhere in the
> > include files (most likely under /usr/include) that would help too.
> >
>
> # /usr/src/lib/libthr/arch/amd64/include/pthread_md.h
> #####
> 65 /*
> 66 * Evaluates to the value of the per-tcb variable name.
> 67 */
> 68 #define TCB_GET64(name) ({ \
> 69 __tcb_type(name) __result; \
> 70 \
> 71 u_long __i; \
> 72 __asm __volatile("movq %%fs:%1, %0" \
> 73 : "=r" (__i) \
> 74 : "m" (*(volatile u_long *)(__tcb_offset(name)))); \
> 75 __result = (__tcb_type(name))__i; \
> 76 \
> 77 __result; \
> 78 })
> #####
hv and I are continuing to investigate this, but any help from someone with access to any of the affected OSes is welcome.
Thank you very much.
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Mon, 10 Jun 2019 04:53:37 -0700
On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
> Watchpoint 2: PL_locale_mutex
>
> Old value = 0x801e1c1e0
> New value = 0x2
> _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> 95 return (TCB_GET64(tcb_thread));
Ok, I can make sense of this bit now: the pthread_mutex_destroy code explicitly overwrites the pointer to signal that it has been destroyed:
/usr/src/lib/libthr/thread/thr_private.h has:
#define THR_MUTEX_DESTROYED ((struct pthread_mutex *)2)
/usr/src/lib/libthr/thread/thr_mutex.c has:
*mutex = THR_MUTEX_DESTROYED;
in multiple cases.
So this isn't corruption but intentional. The core of the problem, though, is that we're seeing an error returned from the mutex_destroy, which is what is triggering our attempt to croak. And the error turns out to be EBUSY, indicating that we're trying to destroy a mutex that is still locked, and with the debug option -DLv it becomes trivial to see that:
$ ./perl -Ilib -DLv -we 'my $x = sprintf("%2E", 0)'
sv.c: 12972: locking lc_numeric; depth=1
sv.c: 13422: unlocking lc_numeric; depth=0
EXECUTING...
$ ./perl -Ilib -DLv -we 'my $x = sprintf("%7000000000E", 0)'
sv.c: 12972: locking lc_numeric; depth=1
EXECUTING...
sv.c: 12972: avoided lc_numeric_lock; new depth=2
Numeric format result too large at -e line 1.
Segmentation fault (core dumped)
$
We're invoking STORE_LC_NUMERIC_SET_TO_NEEDED, which is documented with:
On threaded perls not operating with thread-safe functionality, this macro uses
a mutex to force a critical section. Therefore the matching RESTORE should be
close by, and guaranteed to be called.
.. however this is wrapping precisely the call with the new error check, so when that croaks the RESTORE_LC_NUMERIC is missed, and that causes us to attempt global destruction with the mutex still held.
Because there are 2 nested calls to the STORE macro, I'm not sure how correct the obvious fix is; there also seems to be some cleverness with pragmas to try and enforce balancing of STORE/RESTORE. I was able to get it both to compile and run the test case without error with the attached, but I think it's Karl's baby so I hope he can advise.
Note also: a) there are other opportunities to hit fatal errors in sprintf, I assume they'll all need the same handling; b) the 'mutex_destroy or croak' pattern makes no sense after PerlIO_teardown(), but I'm not sure what we can do instead - the bug uncovered here certainly suggests we'd have been unwise simply to ignore errors on mutex_destroy.
Hugo
From b6c19f3e4b408c9f460d394c6b6fa8309f591a5a Mon Sep 17 00:00:00 2001
From: Hugo van der Sanden <[email protected]>
Date: Mon, 10 Jun 2019 11:44:25 +0000
Subject: [PATCH] Hack to RESTORE_LC_NUMERIC before croak
---
sv.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/sv.c b/sv.c
index 8bc0af0c16..1710434620 100644
--- a/sv.c
+++ b/sv.c
@@ -13089,7 +13089,19 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
/* snprintf() returns an int, and we use that return value,
so die horribly if the expected size is too large for int
*/
+#ifdef USE_LOCALE_NUMERIC
+ if (lc_numeric_set) {
+ /* must not croak() with this held */
+ RESTORE_LC_NUMERIC();
+ }
+#endif
Perl_croak(aTHX_ "Numeric format result too large");
+#ifdef USE_LOCALE_NUMERIC
+ if (lc_numeric_set) {
+ /* balance the diagnostics */
+ STORE_LC_NUMERIC_SET_TO_NEEDED();
+ }
+#endif
}
if (PL_efloatsize <= float_need) {
--
2.19.1
# James E Keenan <jkee...@cpa...>
Mon, 10 Jun 2019 05:24:19 -0700
On Mon, 10 Jun 2019 11:53:37 GMT, hv wrote:
> On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
> > Watchpoint 2: PL_locale_mutex
> >
> > Old value = 0x801e1c1e0
> > New value = 0x2
> > _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> > 95 return (TCB_GET64(tcb_thread));
>
> Ok, I can make sense of this bit now: the pthread_mutex_destroy code
> explicitly overwrites the pointer to signal that it has been
> destroyed:
>
> /usr/src/lib/libthr/thread/thr_private.h has:
> #define THR_MUTEX_DESTROYED ((struct pthread_mutex *)2)
> /usr/src/lib/libthr/thread/thr_mutex.c has:
> *mutex = THR_MUTEX_DESTROYED;
> in multiple cases.
>
> So this isn't corruption but intentional. The core of the problem,
> though, is that we're seeing an error returned from the mutex_destroy,
> which is what is triggering our attempt to croak. And the error turns
> out to be EBUSY, indicating that we're trying to destroy a mutex that
> is still locked, and with the debug option -DLv it becomes trivial to
> see that:
>
> $ ./perl -Ilib -DLv -we 'my $x = sprintf("%2E", 0)'
> sv.c: 12972: locking lc_numeric; depth=1
> sv.c: 13422: unlocking lc_numeric; depth=0
>
> EXECUTING...
>
> $ ./perl -Ilib -DLv -we 'my $x = sprintf("%7000000000E", 0)'
> sv.c: 12972: locking lc_numeric; depth=1
>
> EXECUTING...
>
> sv.c: 12972: avoided lc_numeric_lock; new depth=2
> Numeric format result too large at -e line 1.
> Segmentation fault (core dumped)
> $
>
> We're invoking STORE_LC_NUMERIC_SET_TO_NEEDED, which is documented
> with:
> On threaded perls not operating with thread-safe functionality, this
> macro uses
> a mutex to force a critical section. Therefore the matching RESTORE
> should be
> close by, and guaranteed to be called.
> .. however this is wrapping precisely the call with the new error
> check, so when that croaks the RESTORE_LC_NUMERIC is missed, and that
> causes us to attempt global destruction with the mutex still held.
>
> Because there are 2 nested calls to the STORE macro, I'm not sure how
> correct the obvious fix is; there also seems to be some cleverness
> with pragmas to try and enforce balancing of STORE/RESTORE. I was able
> to get it both to compile and run the test case without error with the
> attached, but I think it's Karl's baby so I hope he can advise.
>
> Note also: a) there are other opportunities to hit fatal errors in
> sprintf, I assume they'll all need the same handling; b) the
> 'mutex_destroy or croak' pattern makes no sense after
> PerlIO_teardown(), but I'm not sure what we can do instead - the bug
> uncovered here certainly suggests we'd have been unwise simply to
> ignore errors on mutex_destroy.
>
> Hugo
Thanks. I realize your patch is preliminary, but to get some data from other platforms, I've placed it in this branch:
smoke-me/jkeenan/hv/134172-hack
Thank you very much.
--
James E Keenan ([email protected])
# Karl Williamson <...@ind...>
Mon, 10 Jun 2019 05:46:31 -0700
I’m traveling but cursory examining the patch looks good
Sent from my iPhone
> On Jun 10, 2019, at 7:24 AM, James E Keenan via RT <[email protected]> wrote:
>
>> On Mon, 10 Jun 2019 11:53:37 GMT, hv wrote:
>>> On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
>>> Watchpoint 2: PL_locale_mutex
>>>
>>> Old value = 0x801e1c1e0
>>> New value = 0x2
>>> _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
>>> 95 return (TCB_GET64(tcb_thread));
>>
>> Ok, I can make sense of this bit now: the pthread_mutex_destroy code
>> explicitly overwrites the pointer to signal that it has been
>> destroyed:
>>
>> /usr/src/lib/libthr/thread/thr_private.h has:
>> #define THR_MUTEX_DESTROYED ((struct pthread_mutex *)2)
>> /usr/src/lib/libthr/thread/thr_mutex.c has:
>> *mutex = THR_MUTEX_DESTROYED;
>> in multiple cases.
>>
>> So this isn't corruption but intentional. The core of the problem,
>> though, is that we're seeing an error returned from the mutex_destroy,
>> which is what is triggering our attempt to croak. And the error turns
>> out to be EBUSY, indicating that we're trying to destroy a mutex that
>> is still locked, and with the debug option -DLv it becomes trivial to
>> see that:
>>
>> $ ./perl -Ilib -DLv -we 'my $x = sprintf("%2E", 0)'
>> sv.c: 12972: locking lc_numeric; depth=1
>> sv.c: 13422: unlocking lc_numeric; depth=0
>>
>> EXECUTING...
>>
>> $ ./perl -Ilib -DLv -we 'my $x = sprintf("%7000000000E", 0)'
>> sv.c: 12972: locking lc_numeric; depth=1
>>
>> EXECUTING...
>>
>> sv.c: 12972: avoided lc_numeric_lock; new depth=2
>> Numeric format result too large at -e line 1.
>> Segmentation fault (core dumped)
>> $
>>
>> We're invoking STORE_LC_NUMERIC_SET_TO_NEEDED, which is documented
>> with:
>> On threaded perls not operating with thread-safe functionality, this
>> macro uses
>> a mutex to force a critical section. Therefore the matching RESTORE
>> should be
>> close by, and guaranteed to be called.
>> .. however this is wrapping precisely the call with the new error
>> check, so when that croaks the RESTORE_LC_NUMERIC is missed, and that
>> causes us to attempt global destruction with the mutex still held.
>>
>> Because there are 2 nested calls to the STORE macro, I'm not sure how
>> correct the obvious fix is; there also seems to be some cleverness
>> with pragmas to try and enforce balancing of STORE/RESTORE. I was able
>> to get it both to compile and run the test case without error with the
>> attached, but I think it's Karl's baby so I hope he can advise.
>>
>> Note also: a) there are other opportunities to hit fatal errors in
>> sprintf, I assume they'll all need the same handling; b) the
>> 'mutex_destroy or croak' pattern makes no sense after
>> PerlIO_teardown(), but I'm not sure what we can do instead - the bug
>> uncovered here certainly suggests we'd have been unwise simply to
>> ignore errors on mutex_destroy.
>>
>> Hugo
>
> Thanks. I realize your patch is preliminary, but to get some data from other platforms, I've placed it in this branch:
>
> smoke-me/jkeenan/hv/134172-hack
>
> Thank you very much.
>
> --
> James E Keenan ([email protected])
>
> ---
> via perlbug: queue: perl5 status: open
> https://rt.perl.org/Ticket/Display.html?id=134172
# James E Keenan <jkee...@cpa...>
Tue, 11 Jun 2019 05:59:02 -0700
On Mon, 10 Jun 2019 12:24:19 GMT, jkeenan wrote:
> On Mon, 10 Jun 2019 11:53:37 GMT, hv wrote:
> > On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
> > > Watchpoint 2: PL_locale_mutex
> > >
> > > Old value = 0x801e1c1e0
> > > New value = 0x2
> > > _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> > > 95 return (TCB_GET64(tcb_thread));
> >
> > Ok, I can make sense of this bit now: the pthread_mutex_destroy code
> > explicitly overwrites the pointer to signal that it has been
> > destroyed:
> >
> > /usr/src/lib/libthr/thread/thr_private.h has:
> > #define THR_MUTEX_DESTROYED ((struct pthread_mutex *)2)
> > /usr/src/lib/libthr/thread/thr_mutex.c has:
> > *mutex = THR_MUTEX_DESTROYED;
> > in multiple cases.
> >
> > So this isn't corruption but intentional. The core of the problem,
> > though, is that we're seeing an error returned from the
> > mutex_destroy,
> > which is what is triggering our attempt to croak. And the error turns
> > out to be EBUSY, indicating that we're trying to destroy a mutex that
> > is still locked, and with the debug option -DLv it becomes trivial to
> > see that:
> >
> > $ ./perl -Ilib -DLv -we 'my $x = sprintf("%2E", 0)'
> > sv.c: 12972: locking lc_numeric; depth=1
> > sv.c: 13422: unlocking lc_numeric; depth=0
> >
> > EXECUTING...
> >
> > $ ./perl -Ilib -DLv -we 'my $x = sprintf("%7000000000E", 0)'
> > sv.c: 12972: locking lc_numeric; depth=1
> >
> > EXECUTING...
> >
> > sv.c: 12972: avoided lc_numeric_lock; new depth=2
> > Numeric format result too large at -e line 1.
> > Segmentation fault (core dumped)
> > $
> >
> > We're invoking STORE_LC_NUMERIC_SET_TO_NEEDED, which is documented
> > with:
> > On threaded perls not operating with thread-safe functionality,
> > this
> > macro uses
> > a mutex to force a critical section. Therefore the matching
> > RESTORE
> > should be
> > close by, and guaranteed to be called.
> > .. however this is wrapping precisely the call with the new error
> > check, so when that croaks the RESTORE_LC_NUMERIC is missed, and that
> > causes us to attempt global destruction with the mutex still held.
> >
> > Because there are 2 nested calls to the STORE macro, I'm not sure how
> > correct the obvious fix is; there also seems to be some cleverness
> > with pragmas to try and enforce balancing of STORE/RESTORE. I was
> > able
> > to get it both to compile and run the test case without error with
> > the
> > attached, but I think it's Karl's baby so I hope he can advise.
> >
> > Note also: a) there are other opportunities to hit fatal errors in
> > sprintf, I assume they'll all need the same handling; b) the
> > 'mutex_destroy or croak' pattern makes no sense after
> > PerlIO_teardown(), but I'm not sure what we can do instead - the bug
> > uncovered here certainly suggests we'd have been unwise simply to
> > ignore errors on mutex_destroy.
> >
> > Hugo
>
> Thanks. I realize your patch is preliminary, but to get some data
> from other platforms, I've placed it in this branch:
>
> smoke-me/jkeenan/hv/134172-hack
>
> Thank you very much.
Smoke results are looking good. I intend to merge this branch into blead within 24 hours unless we get bad results or someone objects.
TonyC, does this look okay to you?
Hugo, would you like to suggest a better commit message?
Thank you very much.
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Tue, 11 Jun 2019 06:30:21 -0700
On Tue, 11 Jun 2019 05:59:02 -0700, jkeenan wrote:
> Smoke results are looking good. I intend to merge this branch into
> blead within 24 hours unless we get bad results or someone objects.
>
> TonyC, does this look okay to you?
>
> Hugo, would you like to suggest a better commit message?
I don't think this should be merged at is, it was intended only as a proof of concept.
In particular, I strongly dislike the additional 5 lines (in a NOTREACHED area) simply to satisfy the macro's expectations of balance: if we keep the current general structure, the solution here will also need to be wrapped around any other part of the code that can bypass the unlock, and I anticipate there will be quite a few such areas.
I've looked some more today, and I think a larger change would be merited, but I'd want some input from DaveM and Karl.
I feel the problems here are essentially a clash between the work from bc37e90eb: "Perl_sv_vcatpvfn_flags: set locale at most once" and from e9bc6d6b: "Add thread-safe locale handling". Because of the mutex requirements of the latter, attempting to retain the former will need us to identify every possible route out of sv_vcatpvfn - not only the explicit croaks, but also eg any attempt to read an SV that might be tied or overloaded.
I don't think that's a practical thing to do; as such, I'm not convinced there's a practical route to getting correct mutex-protected locale changes on the platforms that need it while retaining the "do this only once" performance benefits.
What I'm not sure of is whether _some_ of the "only once" benefits can be retained, eg to discover if switching is necessary - I suspect that subsidiary code (such as tied and overloaded arguments, again) might make that unsafe even in an unthreaded application.
I guess another possibility would be to put an unlock request on a save stack, but that feels like it's going even further in the wrong direction - you want to be shortening the time you hold a mutex rather than lengthening it.
Hugo
# James E Keenan <jkee...@cpa...>
Fri, 14 Jun 2019 07:09:41 -0700
On Thu, 06 Jun 2019 15:54:29 GMT, jkeenan wrote:
> On Thu, 06 Jun 2019 15:47:02 GMT, hv wrote:
> > On Thu, 06 Jun 2019 08:16:29 -0700, jkeenan wrote:
> > > > (gdb) next
> > > > 13092 Perl_croak(aTHX_ "Numeric format result
> > > > too
> > > > large");
> > > > 2: *PL_locale_mutex = {m_lock = {m_owner = 0, m_flags = 0,
> > > > m_ceilings
> > > > = 0x801e1c1e8, m_rb_lnk = 0,
> > > > m_spare = 0x801e1c1f8}, m_flags = 1, m_count = 0, m_spinloops
> > > > =
> > > > 0,
> > > > m_yieldloops = 0, m_ps = 0,
> > > > m_qe = {tqe_next = 0x0, tqe_prev = 0x0}, m_pqe = {tqe_next =
> > > > 0x0,
> > > > tqe_prev = 0x0}, m_rb_prev = 0x0}
> > > > 1: PL_locale_mutex = 0x801e1c1e0
> >
> > Ok, so at this point it still looks valid ...
> >
> > > > Program received signal SIGSEGV, Segmentation fault.
> > > > 0x0000000000597ad9 in Perl_dounwind (my_perl=0x801e22000, cxix=-
> > > > 1)
> > > > at
> > > > pp_ctl.c:1550
> > > > 1550 CX_LEAVE_SCOPE(cx);
> > > > 2: *PL_locale_mutex = Error accessing memory address 0x2: Bad
> > > > address.
> > > > Disabling display 2 to avoid infinite recursion.
> >
> > .. but by the time we get here it has been corrupted to 0x2.
> >
> > I think it's worth another go at the 'watch', but if that fails we
> > may
> > have to step through with 'next' and 'step' until the displayed value
> > changes. If we set the watchpoint later, it may work better:
> >
> > shell% gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> > (gdb) break sv.c:13092
> > (gdb) run
> > .. break at sv.c:13092 (the croak call)
> > (gdb) watch PL_locale_mutex
> > (gdb) cont
> >
> > As before, it should not take more than a few minutes after the
> > 'cont'.
> >
> > Hugo
>
> 'cont' returned within 1 second.
>
> #####
> $ gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "amd64-marcel-freebsd"...
> (gdb) break sv.c:13092
> Breakpoint 1 at 0x55b2f0: file sv.c, line 13092.
> (gdb) run
> Starting program: /usr/home/jkeenan/gitwork/perl/perl -we my\ \$x\ =\
> sprintf\(\"%7000000000E\",\ 0\)
>
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>,
> sv=<value optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16528,
> sv_count=1, maybe_tainted=0x801c02200, flags=0) at sv.c:13092
> 13092 Perl_croak(aTHX_ "Numeric format result too
> large");
> (gdb) watch PL_locale_mutex
> Watchpoint 2: PL_locale_mutex
> (gdb) cont
> Continuing.
>
> Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>,
> sv=<value optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16530,
> sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
> 13092 Perl_croak(aTHX_ "Numeric format result too
> large");
> (gdb) bt
> #0 Perl_sv_vcatpvfn_flags (my_perl=<value optimized out>, sv=<value
> optimized out>,
> pat=<value optimized out>, patlen=<value optimized out>, args=0x0,
> svargs=0x801e16530,
> sv_count=1, maybe_tainted=0x452591, flags=0) at sv.c:13092
> #1 0x0000000000553045 in Perl_sv_vsetpvfn (my_perl=0x801e22000,
> sv=0x801e98610,
> pat=0x801e20128 "%7000000000E", patlen=<value optimized out>,
> args=<value optimized out>,
> svargs=0x801e16530, sv_count=1, maybe_tainted=0x7fffffffe577) at
> sv.c:10984
> #2 0x00000000005c4f31 in Perl_do_sprintf (my_perl=0x801e22000,
> sv=0x801e98610,
> len=<value optimized out>, sarg=<value optimized out>) at
> doop.c:734
> #3 0x0000000000579ac0 in Perl_pp_sprintf (my_perl=0x801e22000) at
> pp.c:3559
> #4 0x00000000004f0340 in Perl_runops_debug (my_perl=0x801e22000) at
> dump.c:2537
> #5 0x0000000000453464 in S_run_body (my_perl=0x801e22000, oldscope=1)
> at inline.h:65
> #6 0x00000000004532d9 in perl_run (my_perl=0x801e22000) at
> perl.c:2646
> #7 0x00000000004213fa in main (argc=<value optimized out>,
> argv=<value optimized out>,
> env=0x7fffffffe7a0) at perlmain.c:127
> #####
We're getting the same test failure on threaded builds on OpenBSD-6.4 and 6.5.
On 6.5, this is not showing up in the regular smoke-testing reports, but if you go to the "log" version of Carlos's reports, you can see it. See, e.g., http://perl5.test-smoke.org/logfile/89510, where t/op/sprintf2.t is graded '?????' on threaded builds.
I build a threaded perl on blead on an OpenBSD-6.4 VM. Some results for t/op/sprintf2.t and for Hugo's one-liner.
#####
$ uname -mrs
OpenBSD 6.4 amd64
$ ./perl -Ilib -V:config_args
config_args='-des -Dusedevel -Duseithreads';
$ ./perl -Ilib -v | head -2 | tail -1
This is perl 5, version 31, subversion 1 (v5.31.1 (v5.31.0-159-g84ab3b6908)) built for OpenBSD.amd64-openbsd-thread-multi
$ cd t;./perl harness -v op/sprintf2.t;cd -
[snip]
ok 1699 - croak for very large numeric format results
pthread_mutex_destroy on mutex with waiters!
All 1699 subtests passed
(less 30 skipped subtests: 1669 okay)
Test Summary Report
-------------------
op/sprintf2.t (Wstat: 138 Tests: 1699 Failed: 0)
Non-zero wait status: 138
Files=1, Tests=1699, 1 wallclock secs ( 0.22 usr 0.15 sys + 0.11 cusr 0.53 csys = 1.01 CPU)
Result: FAIL
#####
Note how on this platform an error message is printed:
pthread_mutex_destroy on mutex with waiters!
#####
$ uname -mrs
OpenBSD 6.4 amd64
$ ./perl -Ilib -v | head -2 | tail -1
This is perl 5, version 31, subversion 1 (v5.31.1 (v5.31.0-159-g84ab3b6908)) built for OpenBSD.amd64-openbsd-thread-multi
$ ./perl -Ilib -V:config_args
config_args='-des -Dusedevel -Duseithreads -DDEBUGGING';
$ gdb --args ./perl -we 'my $x = sprintf("%7000000000E", 0)'
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-unknown-openbsd6.4"...
(gdb) break sv.c:13092
Breakpoint 1 at 0x15aa82: file sv.c, line 13092.
(gdb) run
Starting program: /home/jkeenan/gitwork/perl/perl -we my\ \$x\ =\ sprintf\(\"%7000000000E\",\ 0\)
Breakpoint 1 at 0x18a24b55aa82: file sv.c, line 13092.
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=Variable "my_perl" is not available.
) at sv.c:13095
13095 Perl_croak(aTHX_ "Numeric format result too large");
(gdb) watch PL_locale_mutex
Watchpoint 2: PL_locale_mutex
(gdb) cont
Continuing.
Breakpoint 1, Perl_sv_vcatpvfn_flags (my_perl=Variable "my_perl" is not available.
) at sv.c:13095
13095 Perl_croak(aTHX_ "Numeric format result too large");
(gdb) cont
Continuing.
Numeric format result too large at -e line 1.
Watchpoint 2: PL_locale_mutex
Old value = 0x18a52d7d7160
New value = 0x0
0x000018a50c654f06 in _libc_pthread_mutex_destroy (mutexp=0x18a24ba98c40)
at /usr/src/lib/libc/thread/rthread_mutex.c:89
89 /usr/src/lib/libc/thread/rthread_mutex.c: No such file or directory.
in /usr/src/lib/libc/thread/rthread_mutex.c
Current language: auto; currently minimal
(gdb) bt
#0 0x000018a50c654f06 in _libc_pthread_mutex_destroy (mutexp=0x18a24ba98c40)
at /usr/src/lib/libc/thread/rthread_mutex.c:89
#1 0x000018a24b430af5 in Perl_sys_term () at perl.c:146
#2 0x000018a24b4015ee in main (argc=Variable "argc" is not available.
) at perlmain.c:155
#####
Thank you very much.
--
James E Keenan ([email protected])
# Dave Mitchell <dave...@iab...>
Thu, 20 Jun 2019 03:23:43 -0700
On Tue, Jun 11, 2019 at 06:30:21AM -0700, Hugo van der Sanden via RT wrote:
> I've looked some more today, and I think a larger change would be
> merited, but I'd want some input from DaveM and Karl.
>
> I feel the problems here are essentially a clash between the work from
> bc37e90eb: "Perl_sv_vcatpvfn_flags: set locale at most once" and from
> e9bc6d6b: "Add thread-safe locale handling". Because of the mutex
> requirements of the latter, attempting to retain the former will need us
> to identify every possible route out of sv_vcatpvfn - not only the
> explicit croaks, but also eg any attempt to read an SV that might be
> tied or overloaded.
>
> I don't think that's a practical thing to do; as such, I'm not convinced
> there's a practical route to getting correct mutex-protected locale
> changes on the platforms that need it while retaining the "do this only
> once" performance benefits.
I have no objection to making it potentially switch locales for each
call to snprintf.
> What I'm not sure of is whether _some_ of the "only once" benefits can
> be retained, eg to discover if switching is necessary - I suspect that
> subsidiary code (such as tied and overloaded arguments, again) might
> make that unsafe even in an unthreaded application.
From my commit message in bc37e90eb, it appears that checking whether
we're within the lexical scope of 'use locale' is the expensive bit. That
can't change during the course of executing the function, so it looks like
a good candidate for checking only once. I guess it's the IN_LC() bit of
STORE_LC_NUMERIC_SET_TO_NEEDED.
--
Spock (or Data) is fired from his high-ranking position for not being able
to understand the most basic nuances of about one in three sentences that
anyone says to him.
-- Things That Never Happen in "Star Trek" #19
# Hugo van der Sanden <...@cry...>
Mon, 22 Jul 2019 08:47:39 -0700
On Tue, 11 Jun 2019 06:30:21 -0700, hv wrote:
> I feel the problems here are essentially a clash between the work from
> bc37e90eb: "Perl_sv_vcatpvfn_flags: set locale at most once" and from
> e9bc6d6b: "Add thread-safe locale handling". Because of the mutex
> requirements of the latter, attempting to retain the former will need
> us to identify every possible route out of sv_vcatpvfn - not only the
> explicit croaks, but also eg any attempt to read an SV that might be
> tied or overloaded.
>
> I don't think that's a practical thing to do; as such, I'm not
> convinced there's a practical route to getting correct mutex-protected
> locale changes on the platforms that need it while retaining the "do
> this only once" performance benefits.
Attached is an attempt to fix this by defining a new more tightly scoped
WITH_LC_NUMERIC_SET_TO_NEEDED macro, and using this around each point
(that I was able to identify) that needs the locale switching.
As Dave mentioned, the expensive part is the hints check of IN_LC(LC_NUMERIC);
I cache a copy of that for sv_vcatpvfn_flags (and format_hexfp) to use;
however I note that the same check is done multiple times within the
underlying STORE_LC_NUMERIC_SET_TO_NEEDED macro, so it may be worth
making variants of WITH_LC_x and STORE_LC_x that accept a cached boolean
for the hints check.
I'd appreciate if someone could look through for any location I might
have missed that needs the new macro, and we should probably give this
some smoking.
Hugo
From 8ced26e934d146385433fe888a9111f611f76d3d Mon Sep 17 00:00:00 2001
From: Hugo van der Sanden <[email protected]>
Date: Mon, 22 Jul 2019 16:29:13 +0100
Subject: [PATCH] [perl #134172] restrict scope of locale changes during
sprintf
In some environments we must hold a mutex for the duration of a temporary
locale change, so we must ensure that mutex is released appropriately.
This means intervening code must not croak, or otherwise bypass the
unlock.
In sv_vcatpvfn_flags(), that requirement was violated when attempting
to avoid multiple temporary locale changes by collapsing them into
a single one. This partially undoes that to fix the problem, while
still attempting to retain some of the benefits by caching the
expensive hints check.
---
perl.h | 32 +++++++++++++++++
sv.c | 111 +++++++++++++++++++++++++++++----------------------------
2 files changed, 88 insertions(+), 55 deletions(-)
diff --git a/perl.h b/perl.h
index 37b2637ca1..3be4a53756 100644
--- a/perl.h
+++ b/perl.h
@@ -6426,6 +6426,29 @@ expression, but with an empty argument list, like this:
...
}
+=for apidoc Amn|void|WITH_LC_NUMERIC_SET_TO_NEEDED
+
+This macro invokes the supplied statement or block within the context
+of a L</STORE_LC_NUMERIC_SET_TO_NEEDED> .. L</RESTORE_LC_NUMERIC> pair
+if required, so eg:
+
+ WITH_LC_NUMERIC_SET_TO_NEEDED(
+ SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis)
+ );
+
+is equivalent to:
+
+ {
+#ifdef USE_LOCALE_NUMERIC
+ DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
+ STORE_LC_NUMERIC_SET_TO_NEEDED();
+#endif
+ SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis);
+#ifdef USE_LOCALE_NUMERIC
+ RESTORE_LC_NUMERIC();
+#endif
+ }
+
=cut
*/
@@ -6548,6 +6571,14 @@ expression, but with an empty argument list, like this:
__FILE__, __LINE__, PL_numeric_standard)); \
} STMT_END
+# define WITH_LC_NUMERIC_SET_TO_NEEDED(block) \
+ STMT_START { \
+ DECLARATION_FOR_LC_NUMERIC_MANIPULATION; \
+ STORE_LC_NUMERIC_SET_TO_NEEDED(); \
+ block; \
+ RESTORE_LC_NUMERIC(); \
+ } STMT_END;
+
#else /* !USE_LOCALE_NUMERIC */
# define SET_NUMERIC_STANDARD()
@@ -6560,6 +6591,7 @@ expression, but with an empty argument list, like this:
# define RESTORE_LC_NUMERIC()
# define LOCK_LC_NUMERIC_STANDARD()
# define UNLOCK_LC_NUMERIC_STANDARD()
+# define WITH_LC_NUMERIC_SET_TO_NEEDED(block) block;
#endif /* !USE_LOCALE_NUMERIC */
diff --git a/sv.c b/sv.c
index 4315fe9b64..ef2c71126c 100644
--- a/sv.c
+++ b/sv.c
@@ -11562,7 +11562,9 @@ S_hextract(pTHX_ const NV nv, int* exponent, bool *subnormal,
* The rest of the args have the same meaning as the local vars of the
* same name within Perl_sv_vcatpvfn_flags().
*
- * It assumes the caller has already done STORE_LC_NUMERIC_SET_TO_NEEDED();
+ * The caller's determination of IN_LC(LC_NUMERIC), passed as in_lc_numeric,
+ * is used to ensure we do the right thing when we need to access the locale's
+ * numeric radix.
*
* It requires the caller to make buf large enough.
*/
@@ -11571,7 +11573,7 @@ static STRLEN
S_format_hexfp(pTHX_ char * const buf, const STRLEN bufsize, const char c,
const NV nv, const vcatpvfn_long_double_t fv,
bool has_precis, STRLEN precis, STRLEN width,
- bool alt, char plus, bool left, bool fill)
+ bool alt, char plus, bool left, bool fill, bool in_lc_numeric)
{
/* Hexadecimal floating point. */
char* p = buf;
@@ -11778,17 +11780,19 @@ S_format_hexfp(pTHX_ char * const buf, const STRLEN bufsize, const char c,
if (hexradix) {
#ifndef USE_LOCALE_NUMERIC
- *p++ = '.';
+ *p++ = '.';
#else
- if (IN_LC(LC_NUMERIC)) {
- STRLEN n;
+ if (in_lc_numeric) {
+ STRLEN n;
+ WITH_LC_NUMERIC_SET_TO_NEEDED({
const char* r = SvPV(PL_numeric_radix_sv, n);
Copy(r, p, n, char);
- p += n;
- }
- else {
- *p++ = '.';
- }
+ });
+ p += n;
+ }
+ else {
+ *p++ = '.';
+ }
#endif
}
@@ -11894,9 +11898,10 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
char ebuf[IV_DIG * 4 + NV_DIG + 32];
bool no_redundant_warning = FALSE; /* did we use any explicit format parameter index? */
#ifdef USE_LOCALE_NUMERIC
- DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
- bool lc_numeric_set = FALSE; /* called STORE_LC_NUMERIC_SET_TO_NEEDED? */
+ bool have_in_lc_numeric = FALSE;
#endif
+ /* we never change this unless USE_LOCALE_NUMERIC */
+ bool in_lc_numeric = FALSE;
PERL_ARGS_ASSERT_SV_VCATPVFN_FLAGS;
PERL_UNUSED_ARG(maybe_tainted);
@@ -12967,33 +12972,31 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
* below, or implicitly, via an snprintf() variant.
* Note also things like ps_AF.utf8 which has
* "\N{ARABIC DECIMAL SEPARATOR} as a radix point */
- if (!lc_numeric_set) {
- /* only set once and reuse in-locale value on subsequent
- * iterations.
- * XXX what happens if we die in an eval?
- */
- STORE_LC_NUMERIC_SET_TO_NEEDED();
- lc_numeric_set = TRUE;
+ if (! have_in_lc_numeric) {
+ in_lc_numeric = IN_LC(LC_NUMERIC);
+ have_in_lc_numeric = TRUE;
}
- if (IN_LC(LC_NUMERIC)) {
- /* this can't wrap unless PL_numeric_radix_sv is a string
- * consuming virtually all the 32-bit or 64-bit address
- * space
- */
- float_need += (SvCUR(PL_numeric_radix_sv) - 1);
-
- /* floating-point formats only get utf8 if the radix point
- * is utf8. All other characters in the string are < 128
- * and so can be safely appended to both a non-utf8 and utf8
- * string as-is.
- * Note that this will convert the output to utf8 even if
- * the radix point didn't get output.
- */
- if (SvUTF8(PL_numeric_radix_sv) && !has_utf8) {
- sv_utf8_upgrade(sv);
- has_utf8 = TRUE;
- }
+ if (in_lc_numeric) {
+ WITH_LC_NUMERIC_SET_TO_NEEDED({
+ /* this can't wrap unless PL_numeric_radix_sv is a string
+ * consuming virtually all the 32-bit or 64-bit address
+ * space
+ */
+ float_need += (SvCUR(PL_numeric_radix_sv) - 1);
+
+ /* floating-point formats only get utf8 if the radix point
+ * is utf8. All other characters in the string are < 128
+ * and so can be safely appended to both a non-utf8 and utf8
+ * string as-is.
+ * Note that this will convert the output to utf8 even if
+ * the radix point didn't get output.
+ */
+ if (SvUTF8(PL_numeric_radix_sv) && !has_utf8) {
+ sv_utf8_upgrade(sv);
+ has_utf8 = TRUE;
+ }
+ });
}
#endif
@@ -13068,7 +13071,9 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
&& !fill
&& intsize != 'q'
) {
- SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis);
+ WITH_LC_NUMERIC_SET_TO_NEEDED(
+ SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis)
+ );
elen = strlen(ebuf);
eptr = ebuf;
goto float_concat;
@@ -13113,7 +13118,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
if (UNLIKELY(hexfp)) {
elen = S_format_hexfp(aTHX_ PL_efloatbuf, PL_efloatsize, c,
nv, fv, has_precis, precis, width,
- alt, plus, left, fill);
+ alt, plus, left, fill, in_lc_numeric);
}
else {
char *ptr = ebuf + sizeof ebuf;
@@ -13169,8 +13174,10 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
const char* qfmt = quadmath_format_single(ptr);
if (!qfmt)
Perl_croak_nocontext("panic: quadmath invalid format \"%s\"", ptr);
- elen = quadmath_snprintf(PL_efloatbuf, PL_efloatsize,
- qfmt, nv);
+ WITH_LC_NUMERIC_SET_TO_NEEDED(
+ elen = quadmath_snprintf(PL_efloatbuf, PL_efloatsize,
+ qfmt, nv);
+ );
if ((IV)elen == -1) {
if (qfmt != ptr)
SAVEFREEPV(qfmt);
@@ -13180,11 +13187,15 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
Safefree(qfmt);
}
#elif defined(HAS_LONG_DOUBLE)
- elen = ((intsize == 'q')
- ? my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, fv)
- : my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, (double)fv));
+ WITH_LC_NUMERIC_SET_TO_NEEDED(
+ elen = ((intsize == 'q')
+ ? my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, fv)
+ : my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, (double)fv))
+ );
#else
- elen = my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, fv);
+ WITH_LC_NUMERIC_SET_TO_NEEDED(
+ elen = my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, fv)
+ );
#endif
GCC_DIAG_RESTORE_STMT;
}
@@ -13406,16 +13417,6 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
}
SvTAINT(sv);
-
-#ifdef USE_LOCALE_NUMERIC
-
- if (lc_numeric_set) {
- RESTORE_LC_NUMERIC(); /* Done outside loop, so don't have to
- save/restore each iteration. */
- }
-
-#endif
-
}
/* =========================================================================
--
2.17.1
# Hugo van der Sanden <...@cry...>
Mon, 22 Jul 2019 09:11:13 -0700
On Mon, 22 Jul 2019 08:47:39 -0700, hv wrote:
> As Dave mentioned, the expensive part is the hints check of IN_LC(LC_NUMERIC);
> I cache a copy of that for sv_vcatpvfn_flags (and format_hexfp) to use;
> however I note that the same check is done multiple times within the
> underlying STORE_LC_NUMERIC_SET_TO_NEEDED macro, so it may be worth
> making variants of WITH_LC_x and STORE_LC_x that accept a cached boolean
> for the hints check.
Something like the attached (not extensively tested).
Hugo
From 025dca8448365565bb5121614ae17bd7b726642a Mon Sep 17 00:00:00 2001
From: Hugo van der Sanden <[email protected]>
Date: Mon, 22 Jul 2019 17:08:45 +0100
Subject: [PATCH] Avoid multiple checks of IN_LC(LC_NUMERIC)
---
perl.h | 20 ++++++++++++++------
sv.c | 12 ++++++------
2 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/perl.h b/perl.h
index 3be4a53756..c4afb3f2c7 100644
--- a/perl.h
+++ b/perl.h
@@ -6476,12 +6476,12 @@ is equivalent to:
# define DECLARATION_FOR_LC_NUMERIC_MANIPULATION \
void (*_restore_LC_NUMERIC_function)(pTHX) = NULL
-# define STORE_LC_NUMERIC_SET_TO_NEEDED() \
+# define STORE_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric) \
STMT_START { \
LC_NUMERIC_LOCK( \
- ( ( IN_LC(LC_NUMERIC) && _NOT_IN_NUMERIC_UNDERLYING) \
- || (! IN_LC(LC_NUMERIC) && _NOT_IN_NUMERIC_STANDARD)));\
- if (IN_LC(LC_NUMERIC)) { \
+ ( ( in_lc_numeric && _NOT_IN_NUMERIC_UNDERLYING) \
+ || (! in_lc_numeric && _NOT_IN_NUMERIC_STANDARD))); \
+ if (in_lc_numeric) { \
if (_NOT_IN_NUMERIC_UNDERLYING) { \
Perl_set_numeric_underlying(aTHX); \
_restore_LC_NUMERIC_function \
@@ -6497,6 +6497,9 @@ is equivalent to:
} \
} STMT_END
+# define STORE_LC_NUMERIC_SET_TO_NEEDED() \
+ STORE_LC_NUMERIC_SET_TO_NEEDED_i(IN_LC(LC_NUMERIC))
+
# define RESTORE_LC_NUMERIC() \
STMT_START { \
if (_restore_LC_NUMERIC_function) { \
@@ -6571,14 +6574,17 @@ is equivalent to:
__FILE__, __LINE__, PL_numeric_standard)); \
} STMT_END
-# define WITH_LC_NUMERIC_SET_TO_NEEDED(block) \
+# define WITH_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric, block) \
STMT_START { \
DECLARATION_FOR_LC_NUMERIC_MANIPULATION; \
- STORE_LC_NUMERIC_SET_TO_NEEDED(); \
+ STORE_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric); \
block; \
RESTORE_LC_NUMERIC(); \
} STMT_END;
+# define WITH_LC_NUMERIC_SET_TO_NEEDED(block) \
+ WITH_LC_NUMERIC_SET_TO_NEEDED_i(IN_LC(LC_NUMERIC), block)
+
#else /* !USE_LOCALE_NUMERIC */
# define SET_NUMERIC_STANDARD()
@@ -6587,10 +6593,12 @@ is equivalent to:
# define DECLARATION_FOR_LC_NUMERIC_MANIPULATION
# define STORE_LC_NUMERIC_SET_STANDARD()
# define STORE_LC_NUMERIC_FORCE_TO_UNDERLYING()
+# define STORE_LC_NUMERIC_SET_TO_NEEDED_i()
# define STORE_LC_NUMERIC_SET_TO_NEEDED()
# define RESTORE_LC_NUMERIC()
# define LOCK_LC_NUMERIC_STANDARD()
# define UNLOCK_LC_NUMERIC_STANDARD()
+# define WITH_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric, block) block;
# define WITH_LC_NUMERIC_SET_TO_NEEDED(block) block;
#endif /* !USE_LOCALE_NUMERIC */
diff --git a/sv.c b/sv.c
index ef2c71126c..dfe5f88e52 100644
--- a/sv.c
+++ b/sv.c
@@ -11784,7 +11784,7 @@ S_format_hexfp(pTHX_ char * const buf, const STRLEN bufsize, const char c,
#else
if (in_lc_numeric) {
STRLEN n;
- WITH_LC_NUMERIC_SET_TO_NEEDED({
+ WITH_LC_NUMERIC_SET_TO_NEEDED_i(TRUE, {
const char* r = SvPV(PL_numeric_radix_sv, n);
Copy(r, p, n, char);
});
@@ -12978,7 +12978,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
}
if (in_lc_numeric) {
- WITH_LC_NUMERIC_SET_TO_NEEDED({
+ WITH_LC_NUMERIC_SET_TO_NEEDED_i(TRUE, {
/* this can't wrap unless PL_numeric_radix_sv is a string
* consuming virtually all the 32-bit or 64-bit address
* space
@@ -13071,7 +13071,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
&& !fill
&& intsize != 'q'
) {
- WITH_LC_NUMERIC_SET_TO_NEEDED(
+ WITH_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric,
SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis)
);
elen = strlen(ebuf);
@@ -13174,7 +13174,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
const char* qfmt = quadmath_format_single(ptr);
if (!qfmt)
Perl_croak_nocontext("panic: quadmath invalid format \"%s\"", ptr);
- WITH_LC_NUMERIC_SET_TO_NEEDED(
+ WITH_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric,
elen = quadmath_snprintf(PL_efloatbuf, PL_efloatsize,
qfmt, nv);
);
@@ -13187,13 +13187,13 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
Safefree(qfmt);
}
#elif defined(HAS_LONG_DOUBLE)
- WITH_LC_NUMERIC_SET_TO_NEEDED(
+ WITH_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric,
elen = ((intsize == 'q')
? my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, fv)
: my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, (double)fv))
);
#else
- WITH_LC_NUMERIC_SET_TO_NEEDED(
+ WITH_LC_NUMERIC_SET_TO_NEEDED_i(in_lc_numeric,
elen = my_snprintf(PL_efloatbuf, PL_efloatsize, ptr, fv)
);
#endif
--
2.17.1
# James E Keenan <jkee...@cpa...>
Mon, 22 Jul 2019 09:15:20 -0700
On Mon, 22 Jul 2019 16:11:13 GMT, hv wrote:
> On Mon, 22 Jul 2019 08:47:39 -0700, hv wrote:
> > As Dave mentioned, the expensive part is the hints check of
> > IN_LC(LC_NUMERIC);
> > I cache a copy of that for sv_vcatpvfn_flags (and format_hexfp) to
> > use;
> > however I note that the same check is done multiple times within the
> > underlying STORE_LC_NUMERIC_SET_TO_NEEDED macro, so it may be worth
> > making variants of WITH_LC_x and STORE_LC_x that accept a cached
> > boolean
> > for the hints check.
>
> Something like the attached (not extensively tested).
>
> Hugo
Hugo, I created this smoke branch just now after seeing our *first* patch:
smoke-me/jkeenan/hv/134172-sprintf
Should this second patch be applied to that branch as well?
Thank you very much.
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Mon, 22 Jul 2019 11:14:11 -0700
On Mon, 22 Jul 2019 09:15:20 -0700, jkeenan wrote:
> On Mon, 22 Jul 2019 16:11:13 GMT, hv wrote:
> > On Mon, 22 Jul 2019 08:47:39 -0700, hv wrote:
> > > As Dave mentioned, the expensive part is the hints check of
> > > IN_LC(LC_NUMERIC);
> > > I cache a copy of that for sv_vcatpvfn_flags (and format_hexfp) to
> > > use;
> > > however I note that the same check is done multiple times within the
> > > underlying STORE_LC_NUMERIC_SET_TO_NEEDED macro, so it may be worth
> > > making variants of WITH_LC_x and STORE_LC_x that accept a cached
> > > boolean
> > > for the hints check.
> >
> > Something like the attached (not extensively tested).
>
> Hugo, I created this smoke branch just now after seeing our *first* patch:
>
> smoke-me/jkeenan/hv/134172-sprintf
>
> Should this second patch be applied to that branch as well?
Up to you: I'd like a second opinion on it, particularly about the naming of the additional macros, but I think it's reasonably likely to work. (It passed tests here on a build with -DNO_THREAD_SAFE_LOCALE, but I haven't tried any other variations.)
Hugo
# James E Keenan <jkee...@cpa...>
Mon, 22 Jul 2019 13:06:30 -0700
On Mon, 22 Jul 2019 18:14:11 GMT, hv wrote:
> On Mon, 22 Jul 2019 09:15:20 -0700, jkeenan wrote:
> > On Mon, 22 Jul 2019 16:11:13 GMT, hv wrote:
> > > On Mon, 22 Jul 2019 08:47:39 -0700, hv wrote:
> > > > As Dave mentioned, the expensive part is the hints check of
> > > > IN_LC(LC_NUMERIC);
> > > > I cache a copy of that for sv_vcatpvfn_flags (and format_hexfp)
> > > > to
> > > > use;
> > > > however I note that the same check is done multiple times within
> > > > the
> > > > underlying STORE_LC_NUMERIC_SET_TO_NEEDED macro, so it may be
> > > > worth
> > > > making variants of WITH_LC_x and STORE_LC_x that accept a cached
> > > > boolean
> > > > for the hints check.
> > >
> > > Something like the attached (not extensively tested).
> >
> > Hugo, I created this smoke branch just now after seeing our *first*
> > patch:
> >
> > smoke-me/jkeenan/hv/134172-sprintf
> >
> > Should this second patch be applied to that branch as well?
>
> Up to you: I'd like a second opinion on it, particularly about the
> naming of the additional macros, but I think it's reasonably likely to
> work. (It passed tests here on a build with -DNO_THREAD_SAFE_LOCALE,
> but I haven't tried any other variations.)
>
> Hugo
Applying to branch for smoke testing.
--
James E Keenan ([email protected])
# James E Keenan <jkee...@cpa...>
Wed, 24 Jul 2019 14:18:01 -0700
On Mon, 22 Jul 2019 20:06:30 GMT, jkeenan wrote:
> On Mon, 22 Jul 2019 18:14:11 GMT, hv wrote:
> > On Mon, 22 Jul 2019 09:15:20 -0700, jkeenan wrote:
> > > On Mon, 22 Jul 2019 16:11:13 GMT, hv wrote:
> > > > On Mon, 22 Jul 2019 08:47:39 -0700, hv wrote:
> > > > > As Dave mentioned, the expensive part is the hints check of
> > > > > IN_LC(LC_NUMERIC);
> > > > > I cache a copy of that for sv_vcatpvfn_flags (and format_hexfp)
> > > > > to
> > > > > use;
> > > > > however I note that the same check is done multiple times within
> > > > > the
> > > > > underlying STORE_LC_NUMERIC_SET_TO_NEEDED macro, so it may be
> > > > > worth
> > > > > making variants of WITH_LC_x and STORE_LC_x that accept a cached
> > > > > boolean
> > > > > for the hints check.
> > > >
> > > > Something like the attached (not extensively tested).
> > >
> > > Hugo, I created this smoke branch just now after seeing our *first*
> > > patch:
> > >
> > > smoke-me/jkeenan/hv/134172-sprintf
> > >
> > > Should this second patch be applied to that branch as well?
> >
> > Up to you: I'd like a second opinion on it, particularly about the
> > naming of the additional macros, but I think it's reasonably likely to
> > work. (It passed tests here on a build with -DNO_THREAD_SAFE_LOCALE,
> > but I haven't tried any other variations.)
> >
> > Hugo
>
> Applying to branch for smoke testing.
Smoke-test results in the branch appear satisfactory.
http://perl.develop-help.com/?b=smoke-me%2Fjkeenan%2Fhv%2F134172-sprintf
What else do we have to decide before merging this branch and closing this ticket?
Thank you very much.
--
James E Keenan ([email protected])
# Hugo van der Sanden <...@cry...>
Wed, 24 Jul 2019 15:07:48 -0700
On Wed, 24 Jul 2019 14:18:01 -0700, jkeenan wrote:
> On Mon, 22 Jul 2019 20:06:30 GMT, jkeenan wrote:
> > Applying to branch for smoke testing.
>
> Smoke-test results in the branch appear satisfactory.
>
> http://perl.develop-help.com/?b=smoke-me%2Fjkeenan%2Fhv%2F134172-
> sprintf
>
> What else do we have to decide before merging this branch and closing
> this ticket?
I'd like to get some feedback, particularly from Karl and Dave; Karl has already indicated he'll try to take a look this weekend. It also looks like Tony has been looking into the same issue, I just noticed he has created a branch tonyc/134172-in-lc which may have duplicated some of the work.
Given clean smokes, I'm already pretty comfortable with the first patch; the second patch ("Avoid multiple checks of IN_LC(LC_NUMERIC)") probably needs better documentation, possibly better naming, and due consideration whether to expose the new macros as part of the API.
Hugo
# Tony Cook <...@dev...>
Wed, 24 Jul 2019 17:38:13 -0700
On Wed, Jul 24, 2019 at 03:07:48PM -0700, Hugo van der Sanden via RT wrote:
> On Wed, 24 Jul 2019 14:18:01 -0700, jkeenan wrote:
> > On Mon, 22 Jul 2019 20:06:30 GMT, jkeenan wrote:
> > > Applying to branch for smoke testing.
> >
> > Smoke-test results in the branch appear satisfactory.
> >
> > http://perl.develop-help.com/?b=smoke-me%2Fjkeenan%2Fhv%2F134172-
> > sprintf
> >
> > What else do we have to decide before merging this branch and closing
> > this ticket?
>
> I'd like to get some feedback, particularly from Karl and Dave; Karl has already indicated he'll try to take a look this weekend. It also looks like Tony has been looking into the same issue, I just noticed he has created a branch tonyc/134172-in-lc which may have duplicated some of the work.
>
> Given clean smokes, I'm already pretty comfortable with the first patch; the second patch ("Avoid multiple checks of IN_LC(LC_NUMERIC)") probably needs better documentation, possibly better naming, and due consideration whether to expose the new macros as part of the API.
>
> Hugo
It looks reasonable to me.
One thing I noticed in the original code was that
STORE_LC_NUMERIC_SET_TO_NEEDED() could call the possibly expensive
IN_LC(LC_NUMERIC) up to three times:
+ ( ( in_lc_numeric && _NOT_IN_NUMERIC_UNDERLYING) \
+ || (! in_lc_numeric && _NOT_IN_NUMERIC_STANDARD))); \
+ if (in_lc_numeric) { \
(for a bare STORE_LC_NUMERIC_SET_TO_NEEDED() in_lc_numeric is
IN_LC(LC_NUMERIC) )
Tony
# Dave Mitchell <dave...@iab...>
Thu, 25 Jul 2019 00:14:09 -0700
On Wed, Jul 24, 2019 at 03:07:48PM -0700, Hugo van der Sanden via RT wrote:
> I'd like to get some feedback, particularly from Karl and Dave;
I've been dead to the world with flu for the last 8 days, so you probably
won't hear from me soon.
--
Any [programming] language that doesn't occasionally surprise the
novice will pay for it by continually surprising the expert.
-- Larry Wall
# Hugo van der Sanden <...@cry...>
Thu, 25 Jul 2019 07:17:50 -0700
On Wed, 24 Jul 2019 17:38:13 -0700, tonyc wrote:
> It looks reasonable to me.
Thanks.
> One thing I noticed in the original code was that
> STORE_LC_NUMERIC_SET_TO_NEEDED() could call the possibly expensive
> IN_LC(LC_NUMERIC) up to three times:
>
> + ( ( in_lc_numeric &&
> _NOT_IN_NUMERIC_UNDERLYING) \
> + || (! in_lc_numeric &&
> _NOT_IN_NUMERIC_STANDARD))); \
> + if (in_lc_numeric) {
> \
>
> (for a bare STORE_LC_NUMERIC_SET_TO_NEEDED() in_lc_numeric is
> IN_LC(LC_NUMERIC) )
Yes, I addressed the same in my second patch, in a slightly different manner.
Hugo
# karl williamson <publ...@khw...>
Mon, 29 Jul 2019 21:13:20 -0700
On 7/25/19 8:17 AM, Hugo van der Sanden via RT wrote:
> On Wed, 24 Jul 2019 17:38:13 -0700, tonyc wrote:
>> It looks reasonable to me.
>
> Thanks.
>
>> One thing I noticed in the original code was that
>> STORE_LC_NUMERIC_SET_TO_NEEDED() could call the possibly expensive
>> IN_LC(LC_NUMERIC) up to three times:
>>
>> + ( ( in_lc_numeric &&
>> _NOT_IN_NUMERIC_UNDERLYING) \
>> + || (! in_lc_numeric &&
>> _NOT_IN_NUMERIC_STANDARD))); \
>> + if (in_lc_numeric) {
>> \
>>
>> (for a bare STORE_LC_NUMERIC_SET_TO_NEEDED() in_lc_numeric is
>> IN_LC(LC_NUMERIC) )
>
> Yes, I addressed the same in my second patch, in a slightly different manner.
>
> Hugo
>
> ---
> via perlbug: queue: perl5 status: open
> https://rt.perl.org/Ticket/Display.html?id=134172
>
The patches both look good to me.
I have two very minor concerns.
Is saying foo({...}) portable?
And on platforms that don't have locales enabled, is it ok to just say
"block;", or should this be wrapped in a STMT_START STMT_END
As far as the names, I don't know. It's not clear to me what the _i
suffix is supposed to mean, so if you go with that name, it should be
documented. If it is meant to mean 'internal', often using an
underscore prefix is the convention. But it's not really private,, so I
don't know.
# Hugo van der Sanden <...@cry...>
Wed, 31 Jul 2019 04:36:57 -0700
On Mon, 29 Jul 2019 21:13:20 -0700, [email protected] wrote:
> The patches both look good to me.
Thanks.
> I have two very minor concerns.
>
> Is saying foo({...}) portable?
We do this in many places with DEBUG_foo() type macros.
> And on platforms that don't have locales enabled, is it ok to just say
> "block;", or should this be wrapped in a STMT_START STMT_END
By analogy with DEBUG_foo, I thought I'd found evidence it was; but I don't see that now, and DEBUG_r (for example) does wrap them, I'll modify this.
> As far as the names, I don't know. It's not clear to me what the _i
> suffix is supposed to mean, so if you go with that name, it should be
> documented. If it is meant to mean 'internal', often using an
> underscore prefix is the convention. But it's not really private,, so
> I don't know.
I agree, but I didn't have a better name. I see Tony used ..._IN(in) in his closely equivalent version, so I'll go with that.
I plan to commit both patches with those changes, along with some docs for the _IN variants, later today.
Hugo
# Hugo van der Sanden <...@cry...>
Wed, 31 Jul 2019 07:48:34 -0700
Now pushed:
commit a06a4d45d476265b6d5143171126fe6f33ef2c44
[perl #134172] restrict scope of locale changes during sprintf
In some environments we must hold a mutex for the duration of a temporary
locale change, so we must ensure that mutex is released appropriately.
This means intervening code must not croak, or otherwise bypass the
unlock.
In sv_vcatpvfn_flags(), that requirement was violated when attempting
to avoid multiple temporary locale changes by collapsing them into
a single one. This partially undoes that to fix the problem, while
still attempting to retain some of the benefits by caching the
expensive hints check.
commit 39b0ad1783fa21aad82d068e668313a54729d740
[perl #134172] Avoid multiple checks of IN_LC(LC_NUMERIC)
This adds new API macros STORE_LC_NUMERIC_SET_TO_NEEDED_IN and
WITH_LC_NUMERIC_SET_TO_NEEDED_IN that accept a precalculated value
for the hints checks of IN_LC(LC_NUMERIC).
commit 061637ca0171c453caf66ea67feedd3b26833499
[perl #134172] perldelta
# Hugo van der Sanden <...@cry...>
Wed, 31 Jul 2019 07:48:34 -0700