Bug 1924986

Summary: libguestfs-test-tool sometimes fails in RHEL-AV when using kernel-rt-debug
Product: Red Hat Enterprise Linux 8
Component: kernel-rt
kernel-rt sub component: Locking
Reporter: YongkuiGuo <yoguo>
Assignee: Red Hat Real Time Maintenance <rt-maint>
QA Contact: YongkuiGuo <yoguo>
Status: CLOSED DUPLICATE
Severity: medium
Priority: unspecified
CC: bhu, chwhite, fpacheco, jlelli, juri.lelli, kcarcia, lgoncalv, llong, mstowell, rjones, virt-maint, williams
Version: 8.4
Target Milestone: rc
Target Release: 8.0
Hardware: x86_64
OS: Linux
Last Closed: 2021-02-05 08:26:51 UTC
Type: Bug
Bug Blocks: 910269    
Attachments: libguestfs-test-tool entire log (flags: none)

Description YongkuiGuo 2021-02-04 05:09:14 UTC
Created attachment 1754970
libguestfs-test-tool entire log

Description of problem:
libguestfs-test-tool intermittently fails in RHEL-AV when the appliance is booted with the kernel-rt-debug kernel: the guest kernel panics during boot.


Version-Release number of selected component (if applicable):
libguestfs-1.44.0-1.module+el8.4.0+9398+f376ac33.x86_64
supermin-5.2.1-1.module+el8.4.0+9751+d56db353.x86_64
kernel-rt-debug-4.18.0-280.rt7.45.el8.x86_64


How reproducible:
60%


Steps:

1. Install the kernel-rt and kernel-rt-debug packages on a RHEL 8.4 host, then run libguestfs-test-tool with the debug kernel:
# SUPERMIN_KERNEL=/boot/vmlinuz-4.18.0-280.rt7.45.el8.x86_64+debug libguestfs-test-tool
     ************************************************************
     *                    IMPORTANT NOTICE
     *
     * When reporting bugs, include the COMPLETE, UNEDITED
     * output below in your bug report.
     *
     ************************************************************
SUPERMIN_KERNEL=/boot/vmlinuz-4.18.0-280.rt7.45.el8.x86_64+debug
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
XDG_RUNTIME_DIR=/run/user/0
SELinux: Enforcing
...
...
[    1.460422]  ? rest_init+0x1fa/0x1fa
[    1.460424]  kernel_init+0xc/0x126
[    1.460426]  ? rest_init+0x1fa/0x1fa
[    1.460427]  ret_from_fork+0x3a/0x50
[    1.460431] Modules linked in:
[    1.542534] ---[ end trace 0000000000000003 ]---
[    1.542538] RIP: 0010:rt_spin_lock_slowlock_locked+0x543/0x730
[    1.542539] Code: 0f 85 24 fe ff ff 48 c7 c2 80 56 0a a0 be b3 02 00 00 48 c7 c7 40 56 0a a0 c6 05 45 7b b8 01 01 e8 3a 19 3d fe e9 00 fe ff ff <0f> 0b 48 c7 c7 00 67 a6 a0 e8 2a db f9 fe 0f 0b 48 c7 c7 c0 66 a6
[    1.542540] RSP: 0000:ffff88800f546f78 EFLAGS: 00010046
[    1.542541] RAX: ffff88800f538000 RBX: ffffffffa10cc5a0 RCX: 0000000000000004
[    1.542542] RDX: 1ffffffff42198bf RSI: ffffffffa009f280 RDI: 0000000000000046
[    1.542543] RBP: ffff88800f546ff0 R08: 0000000000000000 R09: 0000000000000001
[    1.542544] R10: 0000000000000001 R11: ffffed10068ff08b R12: ffffffffa10cc5a0
[    1.542545] R13: ffff88800f538000 R14: 00000000000e8245 R15: 0000000000000002
[    1.542546] FS:  0000000000000000(0000) GS:ffff888034600000(0000) knlGS:0000000000000000
[    1.542547] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.542547] CR2: 0000000000000000 CR3: 000000002a660001 CR4: 0000000000360ef0
[    1.542550] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.542550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.542553] note: swapper/0[1] exited with preempt_count 2
[    1.542555] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    1.542555]
[    1.542777] Kernel Offset: 0x1cc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
libguestfs: error: appliance closed the connection unexpectedly, see earlier error messages
libguestfs: child_cleanup: 0x55c9d863fa30: child process died
libguestfs: error: guestfs_launch failed, see earlier error messages
libguestfs: closing guestfs handle 0x55c9d863fa30 (state 0)
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /tmp/libguestfsZcVzbL
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /tmp/libguestfsfZ4OkI

2. Run virt-rescue; the appliance kernel panics here as well:
# virt-rescue --scratch
Formatting '/tmp/libguestfspwnqTB/overlay2.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=4294967296 backing_file=/var/tmp/.guestfs-0/appliance.d/root backing_fmt=raw lazy_refcounts=off refcount_bits=16
[    0.000000] ACPI BIOS Error (bug): A valid RSDP was not found (20200110/tbxfroot-210)
[    0.004858] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
[    0.004860] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper/0
[    0.928119]
[    0.928120] ============================================
[    0.928121] WARNING: possible recursive locking detected
[    0.928123] 4.18.0-280.rt7.45.el8.x86_64+debug #1 Tainted: G        W        --------- -  -
[    0.928123] --------------------------------------------
[    0.928124] swapper/0/1 is trying to acquire lock:
[    0.928125] ffffffff940cc640 (depot_lock){+.+.}-{2:2}, at: stack_depot_save+0x184/0x520
[    0.928131]
[    0.928131] but task is already holding lock:
[    0.928132] ffffffff940cc640 (depot_lock){+.+.}-{2:2}, at: stack_depot_save+0x184/0x520
[    0.928135]
[    0.928135] other info that might help us debug this:
[    0.928135]  Possible unsafe locking scenario:
[    0.928135]
[    0.928135]        CPU0
[    0.928136]        ----
[    0.928136]   lock(depot_lock);
[    0.928137]   lock(depot_lock);
[    0.928138]
[    0.928138]  *** DEADLOCK ***
[    0.928138]
[    0.928139]  May be due to missing lock nesting notation
[    0.928139]
[    0.928140] 2 locks held by swapper/0/1:
[    0.928140]  #0: ffff88804ff34128 (&dev->mutex){....}-{0:0}, at: __device_attach+0x7c/0x300
[    0.928150]  #1: ffffffff940cc640 (depot_lock){+.+.}-{2:2}, at: stack_depot_save+0x184/0x520
[    0.928153]
[    0.928153] stack backtrace:
[    0.928155] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W        --------- -  - 4.18.0-280.rt7.45.el8.x86_64+debug #1
[    0.928156] Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014
...

[    0.929013] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.929013]
[    0.929013] Kernel Offset: 0xfc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
libguestfs: error: appliance closed the connection unexpectedly.
This usually means the libguestfs appliance crashed.
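
Because the failure is intermittent (about 60% of runs in this report), a single run says little about whether a given kernel is affected. The following is a minimal sketch, not part of the original report, of a loop that repeats a command and counts the failing runs. The function name `count_failures` and the run count of 10 in the usage comment are illustrative assumptions; the kernel path is the one from step 1.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): run a command N times and
# print how many of those runs exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's own output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example usage on the test host (as root), with the kernel from step 1:
#   export SUPERMIN_KERNEL=/boot/vmlinuz-4.18.0-280.rt7.45.el8.x86_64+debug
#   count_failures 10 libguestfs-test-tool
```

Comparing the count for the original kernel against the count for a candidate fixed kernel gives a rough before/after picture; with an intermittent bug, more runs make the comparison more trustworthy.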


Actual results:
Both commands fail because the appliance kernel panics, as shown above.

Expected results:
libguestfs-test-tool runs successfully.


Additional info:

Comment 1 Richard W.M. Jones 2021-02-04 09:58:33 UTC
Since it's the kernel that is crashing, this must be a kernel bug.
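
One way to check that a failed run is a guest kernel crash rather than a libguestfs problem is to search the saved output for kernel panic and lockdep markers. This is a minimal sketch, assuming the tool output was saved to a file. `classify_log` and the labels it prints are hypothetical names; the patterns are message strings that appear in the logs in this report.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print a coarse label for a
# saved libguestfs-test-tool or virt-rescue log, based on known messages.
classify_log() {
    log=$1
    if grep -q 'Kernel panic - not syncing' "$log"; then
        # The appliance kernel itself died.
        echo "kernel-panic"
    elif grep -q -e 'possible recursive locking detected' \
                 -e 'BUG: sleeping function called from invalid context' "$log"; then
        # Lockdep or atomic-sleep warning without a panic.
        echo "kernel-warning"
    elif grep -q 'libguestfs: error:' "$log"; then
        # libguestfs reported an error with no kernel marker in the log.
        echo "libguestfs-error"
    else
        echo "no-known-marker"
    fi
}

# Example usage:
#   libguestfs-test-tool > test-tool.log 2>&1
#   classify_log test-tool.log
```

The panic check comes first because a panicking appliance also produces `libguestfs: error:` lines, as in the logs above.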

Comment 2 Juri Lelli 2021-02-05 06:31:30 UTC
Hi,

I wonder if this is a dup of bz1917950.

Could you please test with the following kernel?

http://brew-task-repos.usersys.redhat.com/repos/official/kernel-rt/4.18.0/280.rt7.45.el8.dt3.1/x86_64/

Thanks!

Comment 3 YongkuiGuo 2021-02-05 08:19:58 UTC
(In reply to Juri Lelli from comment #2)
> Hi,
> 
> I wonder if this is a dup of bz1917950.
> 
> Could you please test with the following kernel?
> 
> http://brew-task-repos.usersys.redhat.com/repos/official/kernel-rt/4.18.0/280.rt7.45.el8.dt3.1/x86_64/

Hi Juri, I tested kernel-4.18.0-280.rt7.45.el8.dt3.1.x86_64+debug and libguestfs-test-tool works fine. Thanks.

Comment 4 Juri Lelli 2021-02-05 08:26:51 UTC
No prob. Closing as DUP then.

Thanks for testing!

*** This bug has been marked as a duplicate of bug 1917950 ***