Description of problem: When I run xfstest generic/257 on my guest (virtiofs + xfs), Hit virtiofsd daemon to disconnect. /usr/libexec/virtiofsd --socket-path=/tmp/avocado-vt-vm1-fs2-virtiofsd.sock -d -o source=/tmp/virtio_fs2_test -o cache=auto 2021-01-18 20:31:04: virtio_session_mount: Waiting for vhost-user socket connection... 2021-01-18 20:31:07: virtio_session_mount: Received vhost-user socket connection 2021-01-18 20:31:07: virtio_loop: Entry 2021-01-18 20:31:26: fv_queue_set_started: qidx=0 started=1 2021-01-18 20:31:26: fv_queue_set_started: qidx=1 started=1 2021-01-18 20:31:26: fv_queue_thread: Start for queue 1 kick_fd 12 2021-01-18 20:31:26: fv_queue_thread: Start for queue 0 kick_fd 9 2021-01-18 20:58:29: (Process terminated with status 1) Version-Release number of selected component (if applicable): Host Kernel: 4.18.0-272.el8.ppc64le Guest Kernel: 4.18.0-275.el8.ppc64le Qemu-kvm: qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08 SLOF: SLOF-20191022-3.git899d9883.module+el8.3.0+6423+e4cb6418.noarch How reproducible: 3/3 Steps to Reproduce: 1. /usr/libexec/virtiofsd --socket-path=/tmp/avocado-vt-vm1-fs1-virtiofsd.sock -d -o source=/tmp/virtio_fs1_test -o cache=auto /usr/libexec/virtiofsd --socket-path=/tmp/avocado-vt-vm1-fs2-virtiofsd.sock -d -o source=/tmp/virtio_fs2_test -o cache=auto 2.boot guest /usr/libexec/qemu-kvm \ -name 'avocado-vt-vm1' \ -sandbox on \ -machine pseries \ -nodefaults \ -device VGA,bus=pci.0,addr=0x2 \ -m 122880 \ -object memory-backend-file,size=120G,mem-path=/home/virtio_fs_mem1,share=yes,id=mem-mem1 \ -smp 8,maxcpus=80,cores=40,threads=1,sockets=2 \ -numa node,memdev=mem-mem1,nodeid=0 \ -cpu 'host' \ -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20210118-090734-CCC1lICO,server,nowait \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -chardev socket,id=chardev_serial0,path=/tmp/serial-serial0-20210118-090734-CCC1lICO,server,nowait \ -device spapr-vty,id=serial0,reg=0x30000000,chardev=chardev_serial0 \ -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \ -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \ -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/test/os.qcow2,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \ -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \ -chardev socket,id=char_virtiofs_fs1,path=/tmp/avocado-vt-vm1-fs1-virtiofsd.sock \ -device vhost-user-fs-pci,id=vufs_virtiofs_fs1,chardev=char_virtiofs_fs1,tag=myfs1,queue-size=1024,bus=pci.0,addr=0x5 \ -chardev socket,id=char_virtiofs_fs2,path=/tmp/avocado-vt-vm1-fs2-virtiofsd.sock \ -device vhost-user-fs-pci,id=vufs_virtiofs_fs2,chardev=char_virtiofs_fs2,tag=myfs2,queue-size=1024,bus=pci.0,addr=0x6 \ -device virtio-net-pci,mac=9a:1e:02:10:22:73,id=idO6PtQ3,netdev=id3xORKR,bus=pci.0,addr=0x7 \ -netdev tap,id=id3xORKR,vhost=on \ -vnc :20 \ -rtc base=utc,clock=host \ -boot menu=off,order=cdn,once=c,strict=off \ -enable-kvm \ -monitor stdio 3. # mkdir -p /mnt/myfs1 # mkdir -p /mnt/myfs2 # yum install -y git acl attr automake bc dump e2fsprogs fio gawk gcc libtool lvm2 make psmisc quota sed xfsdump xfsprogs libacl-devel libattr-devel libaio-devel libuuid-devel xfsprogs-devel python3 sqlite # cd /home/xfstests-dev/ && make && make install # echo -e 'TEST_DEV=myfs1\nTEST_DIR=/mnt/myfs1\nSCRATCH_DEV=myfs2\nSCRATCH_MNT=/mnt/myfs2\nFSTYP=virtiofs\nFSX_AVOID="-E"' > configs/localhost.config && echo "generic/257 generic/285 generic/551" > blacklist # ./check -virtiofs generic/257 ======= hit issue /usr/libexec/virtiofsd --socket-path=/tmp/avocado-vt-vm1-fs2-virtiofsd.sock -d -o source=/tmp/virtio_fs2_test -o cache=auto 2021-01-18 20:31:04: virtio_session_mount: Waiting for vhost-user socket connection... 2021-01-18 20:31:07: virtio_session_mount: Received vhost-user socket connection 2021-01-18 20:31:07: virtio_loop: Entry 2021-01-18 20:31:26: fv_queue_set_started: qidx=0 started=1 2021-01-18 20:31:26: fv_queue_set_started: qidx=1 started=1 2021-01-18 20:31:26: fv_queue_thread: Start for queue 1 kick_fd 12 2021-01-18 20:31:26: fv_queue_thread: Start for queue 0 kick_fd 9 2021-01-18 20:58:29: (Process terminated with status 1) Guest serial log: [ OK ] Stopped target Local File Systems. Unmounting /mnt/myfs1... Unmounting /boot... [FAILED] Failed unmounting /mnt/myfs1. [ OK ] Deactivated swap /dev/rhel/swap. [ OK ] Deactivated swap /dev/disk/by-uuid/…1814e-450b-4478-8a36-0851b558ce7e. [ OK ] Deactivated swap /dev/disk/by-id/dm…oKDWkOKSV1FAarc1G3U9f7ABnuoI9Cif0. [ OK ] Deactivated swap /dev/disk/by-id/dm-name-rhel-swap. [ OK ] Deactivated swap /dev/dm-1. [ OK ] Deactivated swap /dev/mapper/rhel-swap. [ 559.883979] XFS (sda2): Unmounting Filesystem [ OK ] Unmounted /boot. [ OK ] Reached target Unmount All Filesystems. [ OK ] Stopped target Local File Systems (Pre). [ 562.293957] audit: type=1131 audit(1611035241.924:271): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-remount-fs comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 562.297305] audit: type=1131 audit(1611035241.924:272): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-setup-dev comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ OK ] Stopped Remount Root and Kernel File Systems. [ OK ] Stopped Create Static Device Nodes in /dev. Stopping Monitoring of LVM2 mirrors…ng dmeventd or progress polling... [ 562.534000] audit: type=1131 audit(1611035242.164:273): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=lvm2-monitor comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ OK ] Stopped Monitoring of LVM2 mirrors,…sing dmeventd or progress polling. [* ] A stop job is running for /usr/bin/…ests/generic/257 (2min 28s / 3min)[ 611.803538] INFO: task t_dir_offset2:3126 blocked for more than 120 seconds. [ 611.805800] Not tainted 4.18.0-275.el8.ppc64le #1 [ 611.806795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 611.807842] t_dir_offset2 D 0 3126 1 0x00040082 [ 611.808843] Call Trace: [ 611.809881] [c000001d1f8db6d0] [c000000001ba7b00] .TOC.+0x0/0x8500 (unreliable) [ 611.810973] [c000001d1f8db8b0] [c000000000018400] __switch_to+0x2e0/0x500 [ 611.812062] [c000001d1f8db910] [c000000000ec0854] __schedule+0x2d4/0x9b0 [ 611.813174] [c000001d1f8db9e0] [c000000000ec0f98] schedule+0x68/0x130 [ 611.814212] [c000001d1f8dba10] [c0080000079c0624] request_wait_answer+0x1cc/0x3b0 [fuse] [ 611.815283] [c000001d1f8dba80] [c0080000079c6214] fuse_simple_request+0x33c/0x4a0 [fuse] [ 611.816292] [c000001d1f8dbad0] [c0080000079dad94] fuse_readdir_uncached+0x14c/0xa90 [fuse] [ 611.817314] [c000001d1f8dbc90] [c0080000079db8bc] fuse_readdir+0x1e4/0x88c [fuse] [ 611.818291] [c000001d1f8dbd60] [c000000000564180] iterate_dir+0x200/0x280 [ 611.819214] [c000001d1f8dbdb0] [c000000000564fd8] sys_getdents64+0xa8/0x1b0 [ 611.820112] [c000001d1f8dbe20] [c00000000000b408] system_call+0x5c/0x70 [ OK ] Stopped /usr/bin/bash -c test -w /p…ore_adj; exec ./tests/generic/257. [ OK ] Reached target Shutdown. [ OK ] Reached target Final Step. Starting Power-Off... [ 643.805978] printk: systemd-shutdow: 40 output lines suppressed due to ratelimiting [ 645.978683] systemd-shutdown[1]: Syncing filesystems and block devices. [ 645.993009] systemd-shutdown[1]: Sending SIGTERM to remaining processes... [ 734.687755] INFO: task t_dir_offset2:3126 blocked for more than 120 seconds. [ 734.688979] Not tainted 4.18.0-275.el8.ppc64le #1 [ 734.689806] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 734.690656] t_dir_offset2 D 0 3126 1 0x00040082 [ 734.691526] Call Trace: [ 734.692338] [c000001d1f8db6d0] [c000000001ba7b00] .TOC.+0x0/0x8500 (unreliable) [ 734.693217] [c000001d1f8db8b0] [c000000000018400] __switch_to+0x2e0/0x500 [ 734.694071] [c000001d1f8db910] [c000000000ec0854] __schedule+0x2d4/0x9b0 [ 734.694964] [c000001d1f8db9e0] [c000000000ec0f98] schedule+0x68/0x130 [ 734.695806] [c000001d1f8dba10] [c0080000079c0624] request_wait_answer+0x1cc/0x3b0 [fuse] [ 734.696700] [c000001d1f8dba80] [c0080000079c6214] fuse_simple_request+0x33c/0x4a0 [fuse] [ 734.697631] [c000001d1f8dbad0] [c0080000079dad94] fuse_readdir_uncached+0x14c/0xa90 [fuse] [ 734.698497] [c000001d1f8dbc90] [c0080000079db8bc] fuse_readdir+0x1e4/0x88c [fuse] [ 734.699334] [c000001d1f8dbd60] [c000000000564180] iterate_dir+0x200/0x280 [ 734.700172] [c000001d1f8dbdb0] [c000000000564fd8] sys_getdents64+0xa8/0x1b0 [ 734.700984] [c000001d1f8dbe20] [c00000000000b408] system_call+0x5c/0x70 [ 736.006375] systemd-shutdown[1]: Sending SIGKILL to remaining processes... [ 736.012010] systemd-shutdown[1]: Sending SIGKILL to PID 3126 (t_dir_offset2). [ 826.019012] systemd-shutdown[1]: Unmounting file systems. [ 826.022857] [3303]: Remounting '/mnt/myfs1' read-only in with options 'context=system_u:object_r:root_t:s0'. [ 826.029062] [3304]: Unmounting '/mnt/myfs1'. [ 826.030185] [3304]: Failed to unmount /mnt/myfs1: Device or resource busy [ 826.032408] [3305]: Remounting '/' read-only in with options 'seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota'. [ 826.058314] [3305]: Failed to remount '/' read-only: Device or resource busy [ 826.059574] systemd-shutdown[1]: Not all file systems unmounted, 2 left. [ 826.060495] systemd-shutdown[1]: Deactivating swaps. [ 826.061571] systemd-shutdown[1]: All swaps deactivated. [ 826.062421] systemd-shutdown[1]: Detaching loop devices. [ 826.063624] systemd-shutdown[1]: All loop devices detached. [ 826.064417] systemd-shutdown[1]: Detaching DM devices. [ 826.065931] systemd-shutdown[1]: Detaching DM 253:1. [ 826.110756] systemd-shutdown[1]: Not all DM devices detached, 1 left. [ 826.111678] systemd-shutdown[1]: Unmounting file systems. [ 826.113720] [3306]: Unmounting '/mnt/myfs1'. [ 826.114579] [3306]: Failed to unmount /mnt/myfs1: Device or resource busy [ 826.116349] [3307]: Remounting '/' read-only in with options 'seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota'. [ 826.118198] [3307]: Failed to remount '/' read-only: Device or resource busy [ 826.179634] printk: shutdown: 6 output lines suppressed due to ratelimiting [ 826.218616] dracut: Taking over mdmon processes. [ 826.219448] dracut Warning: Killing all remaining processes dracut Warning: Killing all remaining processes Actual results: XFS/xfstests generic/257 failure Expected results: Run generic/257 successfully. Additional info:
On aarch64 no hit this issue Hi menli, Could you try to reproduce it on x86?
Run generic/257 successfully on x86, not hit this issue.
Starting investigation on a POWER system.
(In reply to menli from comment #2) > Run generic/257 successfully on x86, not hit this issue. Test on x86 both q35 and pc with the same pkg as comment0, not hit this issue.
(In reply to Greg Kurz from comment #3) > Starting investigation on a POWER system. Reproduced 100% on a POWER9 witherspoon system. One of the virtiosfd instance terminates because of a SIGSYS signal. bash: line 1: 50200 Bad system call sudo /usr/libexec/virtiofsd --socket-path='/tmp/virtio_fs1_test.sock' -o cache=auto -o source='/tmp/virtio_fs1_test' -o sandbox=chroot -d -o log_level=debug
(In reply to Greg Kurz from comment #5) > (In reply to Greg Kurz from comment #3) > > Starting investigation on a POWER system. > > Reproduced 100% on a POWER9 witherspoon system. > > One of the virtiosfd instance terminates because of a SIGSYS signal. > > bash: line 1: 50200 Bad system call sudo /usr/libexec/virtiofsd > --socket-path='/tmp/virtio_fs1_test.sock' -o cache=auto -o > source='/tmp/virtio_fs1_test' -o sandbox=chroot -d -o log_level=debug No surprise SIGSYS comes from seccomp: type=SECCOMP msg=audit(1611233266.937:7577): auid=2010 uid=0 gid=0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=84612 comm="pool" exe="/home/greg/Work/qemu/qemu-ppc/.mbuild-ppc-for-6.0/obj/tools/virtiofsd/virtiofsd" sig=31 arch=c0000015 syscall=140 compat=0 ip=0x7fffab4893c4 code=0x80000000AUID="greg" UID="root" GID="root" ARCH=ppc64le SYSCALL=_llseek This because glibc uses _llseek() on POWER but it isn't present in the whitelist. Posted a patch for upstream.
Testing, update results later.
Run generic/257 successfully on POWER9 So set the bug status is verified, according to the following test results: Version-Release number of selected component (if applicable): Host Kernel: 4.18.0-304.7.el8.mr470_210503_1451.ppc64le Guest Kernel: 4.18.0-305.1.el8.ppc64le Qemu-kvm: qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d SLOF: SLOF-20200717-1.gite18ddad8.module+el8.4.0+8855+a9e237a9.noarch 1. /usr/libexec/virtiofsd --socket-path=/tmp/avocado-fs1-virtiofsd.sock -d -o source=/tmp/virtio_fs1_test -o cache=auto /usr/libexec/virtiofsd --socket-path=/tmp/avocado-fs2-virtiofsd.sock -d -o source=/tmp/virtio_fs2_test -o cache=auto 2.boot guest /usr/libexec/qemu-kvm \ -name 'avocado-vt-vm1' \ -sandbox on \ -machine pseries \ -nodefaults \ -device VGA,bus=pci.0,addr=0x2 \ -m 122880 \ -object memory-backend-file,size=120G,mem-path=/home/virtio_fs_mem1,share=yes,id=mem-mem1 \ -smp 8,maxcpus=80,cores=40,threads=1,sockets=2 \ -numa node,memdev=mem-mem1,nodeid=0 \ -cpu 'host' \ -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20210118-090734-CCC1lICO,server=on,wait=off \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -chardev socket,id=chardev_serial0,path=/tmp/serial-serial0-20210118-090734-CCC1lICO,server=on,wait=off \ -device spapr-vty,id=serial0,reg=0x30000000,chardev=chardev_serial0 \ -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \ -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \ -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/zhenyzha/RHEL-8.5.0-20210506.n.0-ppc64le.qcow2,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \ -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \ -chardev socket,id=char_virtiofs_fs1,path=/tmp/avocado-fs1-virtiofsd.sock \ -device vhost-user-fs-pci,id=vufs_virtiofs_fs1,chardev=char_virtiofs_fs1,tag=myfs1,queue-size=1024,bus=pci.0,addr=0x5 \ -chardev socket,id=char_virtiofs_fs2,path=/tmp/avocado-fs2-virtiofsd.sock \ -device vhost-user-fs-pci,id=vufs_virtiofs_fs2,chardev=char_virtiofs_fs2,tag=myfs2,queue-size=1024,bus=pci.0,addr=0x6 \ -device virtio-net-pci,mac=9a:1e:02:10:22:73,id=idO6PtQ3,netdev=id3xORKR,bus=pci.0,addr=0x7 \ -netdev tap,id=id3xORKR,vhost=on \ -vnc :20 \ -rtc base=utc,clock=host \ -boot menu=off,order=cdn,once=c,strict=off \ -enable-kvm \ -monitor stdio 3. build env: # mkdir -p /mnt/myfs1 # mkdir -p /mnt/myfs2 # yum install -y git acl attr automake bc dump e2fsprogs fio gawk gcc libtool lvm2 make psmisc quota sed xfsdump xfsprogs libacl-devel libattr-devel libaio-devel libuuid-devel xfsprogs-devel python3 sqlite # git clone https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git # cd /home/xfstests-dev/ && make && make install # export TEST_DEV=myfs1 && export TEST_DIR=/mnt/myfs1 && export SCRATCH_DEV=myfs2 && export SCRATCH_MNT=/mnt/myfs2 && export FSTYP=virtiofs && export FSX_AVOID="-E" 4. run xfstest generic/257: # ./check -virtiofs generic/257 FSTYP -- virtiofs PLATFORM -- Linux/ppc64le dhcp16-215-161 4.18.0-305.1.el8.ppc64le #1 SMP Fri Apr 30 16:18:15 EDT 2021 MKFS_OPTIONS -- myfs2 MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 myfs2 /mnt/myfs2 generic/257 0s Ran: generic/257 Passed all 1 tests 5.check dmesg log and virtiofsd daemon: dmesg and virtiofsd daemon log no error: [ 956.299407] run fstests generic/257 at 2021-05-13 15:12:24
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4684