Description of problem:
Trying to run a Windows 10 VM from virt-manager (qemu-based virtualization) - after a few seconds the VM stops with state 'Shutdown (Crashed)'. Until the last week or two this VM worked without any problems. I tried downgrading libvirt, qemu, and the kernel, but I can't pin down at which level the problem appears. Ubuntu 20.04 running as a Live CD has no problem.

Version-Release number of selected component:
qemu-system-x86-core-2:6.0.0-6.fc35

Additional info:
reporter: libreport-2.15.2
backtrace_rating: 4
cgroup: 0::/machine.slice/machine-qemu\x2d6\x2dwin10.scope/libvirt/emulator
cmdline: /usr/bin/qemu-system-x86_64 -name guest=win10,debug-threads=on -S -object $'{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-6-win10/master-key.aes"}' -machine pc-q35-3.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Client-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,pdpe1gb=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on,hle=off,rtm=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff -m 1024 -object $'{"qom-type":"memory-backend-ram","id":"pc.ram","size":1073741824}' -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 6c87098b-da4a-4415-ae9c-06872147750a -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-pci-bridge,id=pci.6,bus=pci.4,addr=0x0 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device lsi,id=scsi0,bus=pci.6,addr=0x1 -blockdev $'{"driver":"file","filename":"/home/user/.local/share/gnome-boxes/images/win10","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' -blockdev $'{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' -device ide-hd,bus=ide.0,drive=libvirt-1-format,id=sata0-0-0,bootindex=1 -netdev tap,fd=39,id=hostnet0 -device e1000e,netdev=hostnet0,id=net0,mac=52:54:00:e5:c3:d3,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -audiodev id=audio1,driver=spice -vnc 127.0.0.1:0,audiodev=audio1 -device virtio-vga,id=video0,virgl=off,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0,audiodev=audio1 -device virtio-balloon-pci,id=balloon0,bus=pci.3,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
crash_function: __libc_scratch_buffer_grow
dso_list: /usr/bin/qemu-system-x86_64 qemu-system-x86-core-2:6.0.0-6.fc35.x86_64 (Fedora Project) 1624446818
executable: /usr/bin/qemu-system-x86_64
journald_cursor: s=452c1c6d97264aa68b749c09e33cabeb;i=1f7235;b=7a4b111a86db4f23a4c94b8b6eae9d79;m=f20f5698;t=5c56e1dee3d5d;x=bfa1f2772e4d355e
kernel: 5.13.0-0.rc7.51.fc35.x86_64
rootdir: /
runlevel: N 5
type: CCpp
uid: 107
Created attachment 1793790 [details] File: backtrace
Created attachment 1793791 [details] File: core_backtrace
Created attachment 1793792 [details] File: cpuinfo
Created attachment 1793793 [details] File: environ
Created attachment 1793794 [details] File: exploitable
Created attachment 1793796 [details] File: limits
Created attachment 1793797 [details] File: maps
Created attachment 1793798 [details] File: mountinfo
Created attachment 1793799 [details] File: open_fds
Created attachment 1793800 [details] File: proc_pid_status
Seems to be a crash deep in glibc. Does downgrading glibc help?
Hi, I just downgraded glibc to glibc-2.33.9000-13 - the problem still occurs.
I'm hitting this problem now:

qemu-system-x86-6.0.0-7.fc35.x86_64
glibc-2.33.9000-34.fc35.x86_64

The stack trace is ridiculously deep, with the frames between __libc_scratch_buffer_grow repeating until it crashes.

#34009 0x00007ff6c4664d65 in __libc_scratch_buffer_grow () at /lib64/libc.so.6
#34010 0x00007ff6c46d3099 in get_nprocs () at /lib64/libc.so.6
#34011 0x00007ff6c465f7da in arena_get2.part () at /lib64/libc.so.6
#34012 0x00007ff6c4661659 in tcache_init.part () at /lib64/libc.so.6
#34013 0x00007ff6c4661a0e in malloc () at /lib64/libc.so.6
#34014 0x00007ff6c4664d65 in __libc_scratch_buffer_grow () at /lib64/libc.so.6
#34015 0x00007ff6c46d3099 in get_nprocs () at /lib64/libc.so.6
#34016 0x00007ff6c465f7da in arena_get2.part () at /lib64/libc.so.6
#34017 0x00007ff6c4661659 in tcache_init.part () at /lib64/libc.so.6
#34018 0x00007ff6c4662007 in free () at /lib64/libc.so.6
#34019 0x00007ff6c4c1824d in g_free () at /lib64/libglib-2.0.so.0
#34020 0x000055ac6f6473a6 in qemu_thread_start ()
#34021 0x00007ff6c4651197 in start_thread () at /lib64/libc.so.6
#34022 0x00007ff6c46d6e54 in clone () at /lib64/libc.so.6

I'll see if I can get some debuginfo installed to get more information. But at the moment I'd say it looks like either memory corruption or a bug in glibc.
The crash happens here (although this is probably just because the stack limit has been hit):

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007ff6c46d305f in scratch_buffer_init (buffer=0x7ff6598f5e80) at ../include/scratch_buffer.h:78
78        buffer->length = sizeof (buffer->__space);

The top of the stack with symbols looks like:

#54415 0x00007ff6c4664d65 in __GI___libc_scratch_buffer_grow (buffer=buffer@entry=0x7ff65a0f3d20) at scratch_buffer_grow.c:37
#54416 0x00007ff6c46d3099 in scratch_buffer_grow (buffer=0x7ff65a0f3d20) at ../include/scratch_buffer.h:101
#54417 __GI___get_nprocs () at ../sysdeps/unix/sysv/linux/getsysstats.c:44
#54418 0x00007ff6c465f7da in arena_get2 (size=size@entry=640, avoid_arena=avoid_arena@entry=0x0) at /usr/src/debug/glibc-2.33.9000-34.fc35.x86_64/malloc/arena.c:902
#54419 0x00007ff6c4661659 in arena_get2 (avoid_arena=0x0, size=640) at /usr/src/debug/glibc-2.33.9000-34.fc35.x86_64/malloc/arena.c:893
#54420 tcache_init () at malloc.c:3182
#54421 0x00007ff6c4661a0e in tcache_init () at malloc.c:3179
#54422 __GI___libc_malloc (bytes=bytes@entry=2048) at malloc.c:3245
#54423 0x00007ff6c4664d65 in __GI___libc_scratch_buffer_grow (buffer=buffer@entry=0x7ff65a0f41f0) at scratch_buffer_grow.c:37
#54424 0x00007ff6c46d3099 in scratch_buffer_grow (buffer=0x7ff65a0f41f0) at ../include/scratch_buffer.h:101
#54425 __GI___get_nprocs () at ../sysdeps/unix/sysv/linux/getsysstats.c:44
#54426 0x00007ff6c465f7da in arena_get2 (size=size@entry=640, avoid_arena=avoid_arena@entry=0x0) at /usr/src/debug/glibc-2.33.9000-34.fc35.x86_64/malloc/arena.c:902
#54427 0x00007ff6c4661659 in arena_get2 (avoid_arena=0x0, size=640) at /usr/src/debug/glibc-2.33.9000-34.fc35.x86_64/malloc/arena.c:893
#54428 tcache_init () at malloc.c:3182
#54429 0x00007ff6c4662007 in tcache_init () at malloc.c:3179
#54430 __GI___libc_free (mem=0x7ff648000b60) at malloc.c:3333
#54431 0x00007ff6c4c1824d in g_free () at /lib64/libglib-2.0.so.0
#54432 0x000055ac6f6473a6 in qemu_thread_start (args=0x7ff64c000b60) at ../util/qemu-thread-posix.c:518
#54433 0x00007ff6c4651197 in start_thread (arg=<optimized out>) at pthread_create.c:429
#54434 0x00007ff6c46d6e54 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

(Not sure why there are more frames, since this is the same core dump as in the previous comment, but maybe debuginfo allows inlined frames to be exposed?)

These are the frames in the recursive loop:

(gdb) frame 54423
#54423 0x00007ff6c4664d65 in __GI___libc_scratch_buffer_grow (buffer=buffer@entry=0x7ff65a0f41f0) at scratch_buffer_grow.c:37
37        new_ptr = malloc (new_length);
(gdb) frame 54422
#54422 __GI___libc_malloc (bytes=bytes@entry=2048) at malloc.c:3245
3245      MAYBE_INIT_TCACHE ();
(gdb) frame 54421
#54421 0x00007ff6c4661a0e in tcache_init () at malloc.c:3179
3179      if (tcache_shutting_down)
(gdb) frame 54420
#54420 tcache_init () at malloc.c:3182
3182      arena_get (ar_ptr, bytes);
(gdb) frame 54419
#54419 0x00007ff6c4661659 in arena_get2 (avoid_arena=0x0, size=640) at /usr/src/debug/glibc-2.33.9000-34.fc35.x86_64/malloc/arena.c:893
893       if (a == NULL)
(gdb) frame 54418
#54418 0x00007ff6c465f7da in arena_get2 (size=size@entry=640, avoid_arena=avoid_arena@entry=0x0) at /usr/src/debug/glibc-2.33.9000-34.fc35.x86_64/malloc/arena.c:902
902       int n = __get_nprocs ();
(gdb) frame 54417
#54417 __GI___get_nprocs () at ../sysdeps/unix/sysv/linux/getsysstats.c:44
44        if (!scratch_buffer_grow (&set))
(gdb) frame 54416
#54416 0x00007ff6c46d3099 in scratch_buffer_grow (buffer=0x7ff65a0f3d20) at ../include/scratch_buffer.h:101
101       return __glibc_likely (__libc_scratch_buffer_grow (buffer));
(gdb) frame 54415
#54415 0x00007ff6c4664d65 in __GI___libc_scratch_buffer_grow (buffer=buffer@entry=0x7ff65a0f3d20) at scratch_buffer_grow.c:37
37        new_ptr = malloc (new_length);
This looks like a kernel/container host problem: sched_getaffinity fails, which does not happen on real Linux. However, the bug is real for systems with thousands of CPUs, so we have to fix this anyway.
(In reply to Florian Weimer from comment #15)
> This looks like a kernel/container host problem: sched_getaffinity fails,
> which does not happen on real Linux.

seccomp? If I understand the code correctly, it is blocked by qemu here:
https://github.com/qemu/qemu/blob/13d5f87cc3b94bfccc501142df4a7b12fee3a6e7/softmmu/qemu-seccomp.c#L108

I guess a broken syscall shouldn't cause glibc to go into an infinite loop though :-/
(In reply to Florian Weimer from comment #15)
> This looks like a kernel/container host problem: sched_getaffinity fails,
> which does not happen on real Linux.

Any syscall on "real" Linux can fail when seccomp is involved. We see from the args above that
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
is used, and resourcecontrol=deny blocks the sched_getaffinity syscall, amongst others.
Related glibc commit is probably this one: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=903bc7dcc2acafc40be11639767e10a2de712649
I think there are two problems:

a) qemu's -sandbox resourcecontrol=deny blocks the sched_get* calls as well as set*; arguably that's a bit too mean of it - we should allow the gets and deny the sets.

b) glibc calls sched_getaffinity repeatedly with an increasing cpusetsize - but my reading is that it should only try a larger cpusetsize if sched_getaffinity returns EINVAL; any other failure isn't a sign of the buffer being too small.

Dave
(In reply to Daniel Berrangé from comment #17)
> (In reply to Florian Weimer from comment #15)
> > This looks like a kernel/container host problem: sched_getaffinity fails,
> > which does not happen on real Linux.
>
> Any syscall on "real" Linux can fail when seccomp is involved.

The kernel and glibc position is that if you break the system with seccomp, the seccomp filter needs to be fixed. A patch to document a more elaborate system call handshake that takes seccomp into account was quite strongly rejected:
https://lore.kernel.org/linux-api/87lfer2c0b.fsf@oldenburg2.str.redhat.com/

But as I mentioned in comment 15, this is a glibc bug for other reasons, so we will fix this in glibc. Blocking sched_getaffinity or returning bad data from it is still a very bad idea, though.
(In reply to Florian Weimer from comment #20)
> The kernel and glibc position is that if you break the system with seccomp,
> the seccomp filter needs to be fixed.

I will do a fix for QEMU to relax this restriction.

> A patch to document a more elaborate system call handshake that takes
> seccomp into account was quite strongly rejected:
> https://lore.kernel.org/linux-api/87lfer2c0b.fsf@oldenburg2.str.redhat.com/

Reading that makes my head hurt, so I can't disagree with it being rejected :-)

(In reply to Dr. David Alan Gilbert from comment #19)
> I think there's two problems:
> a) qemu's -sandbox resourcecontrol=deny blocks sched_get calls as well as
> set; arguably that's a bit too mean of it, we should allow get's and deny set's

Yes, I'll aim to allow the getters. Even the setters have been trouble in the past - Mesa tried to use sched_setaffinity on its own internal threads and that broke under QEMU too :-(

seccomp is just so fragile to use. Either the filter is wide open and thus largely useless, or it is restrictive and continuously breaks in unexpected ways :-(
Upstream patch posted: https://sourceware.org/pipermail/libc-alpha/2021-June/128271.html
Hi, I just downloaded the new glibc-2.33.9000-36 build from koji and it looks like it solves my problem: the VM I had problems with starts and keeps running for much longer than the few seconds it survived before.
The VM has now been running for more than 20 minutes and I don't observe any problems. Big thanks for solving this issue.
I have also tested it and it fixes the original problem.