Bug 1431940 - Intermittent kernel panic in guest when using virt-customize or spawning a VM
Summary: Intermittent kernel panic in guest when using virt-customize or spawning a VM
Keywords:
Status: CLOSED DUPLICATE of bug 1430297
Alias: None
Product: Fedora
Classification: Fedora
Component: libguestfs
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Richard W.M. Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-03-14 05:45 UTC by David Hill
Modified: 2019-01-09 12:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-25 07:45:56 UTC
Type: Bug
Embargoed:


Attachments
virt-customize and kernel panic (27.34 KB, text/plain)
2017-03-15 03:14 UTC, David Hill

Description David Hill 2017-03-14 05:45:25 UTC
Description of problem:
When redirecting virt-customize output with 2>/var/log/stderr 1>/var/log/stdout, virt-customize hangs

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run sudo virt-customize -a $jenkinspath/VMs/${vmname}.qcow2 $uploadcmd customize.service:/etc/systemd/system/ $uploadcmd tmp/S01customize:/etc/rc.d/rc3.d/ $uploadcmd S01loader:/etc/rc.d/rc3.d/ --root-password password:$rootpasswd --link /etc/systemd/system/customize.service:/etc/systemd/system/multi-user.target.wants/customize.service $uploadcmd cloud.cfg:/etc/cloud 2>$stderr 1>$stdout

Actual results:
Wait forever

Expected results:
Wait 85 seconds before it completes

Additional info:
I also got it to work by changing the ownership of the directory to qemu. It seems that virt-customize now forks a qemu-kvm process running as qemu instead of root, but that is not my main issue: even with the ownership fixed, virt-customize still hangs unless I use 2>>$stderr and 1>>$stdout instead of 2>$stderr and 1>$stdout. This behavior is really strange ...
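
For reference, the only semantic difference between the two redirection forms mentioned above is that > truncates the target file on open while >> appends to it. A minimal sketch on a scratch file, assuming a POSIX shell (the file and variable names are illustrative, not taken from the report):

```shell
# Contrast truncating (>) and appending (>>) redirection on a scratch file.
log=$(mktemp)

echo "run 1" >"$log"
echo "run 2" >"$log"      # '>' truncates: only "run 2" remains
truncated_lines=$(wc -l <"$log")

echo "run 1" >"$log"
echo "run 2" >>"$log"     # '>>' appends: both lines remain
appended_lines=$(wc -l <"$log")

echo "truncate:" $truncated_lines "line(s), append:" $appended_lines "line(s)"
rm -f "$log"
```

Neither form should by itself make a command hang, which fits the later finding in this report that the redirection was not the cause.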

Comment 1 Pino Toscano 2017-03-14 09:34:06 UTC
This bug is a bit scarce on details -- please provide all the following information:
- the version of libguestfs (e.g. `virt-customize --version`, and the version from the package manager)
- the distribution it runs on (name + version)
- the full command line that fails, without any $variable (so with everything expanded)
- add -v -x as arguments to the virt-customize invocation, and provide the full content of the stdout and stderr files
- whether it works if nothing is redirected
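
The requested capture could be sketched as follows, assuming a POSIX shell. run_logged is a hypothetical helper, not part of libguestfs, and the log file names are an arbitrary choice; -v and -x are the virt-customize options requested above.

```shell
# run_logged (hypothetical helper): run a command with stdout and stderr
# captured in separate files under the given directory.
run_logged() {
    log_dir=$1
    shift
    mkdir -p "$log_dir" || return 1
    # '>' truncates each log on every run; the helper returns the
    # exit status of the command it ran.
    "$@" >"$log_dir/stdout.log" 2>"$log_dir/stderr.log"
}

# Example invocation (not executed here; IMAGE.qcow2 is a placeholder):
#   run_logged /tmp/vc-logs virt-customize -v -x -a IMAGE.qcow2 --root-password password:root
```

Running the same command once through the helper and once with no redirection at all would also answer the last question in the list.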

Comment 2 David Hill 2017-03-14 17:20:50 UTC
[root@zappa jenkins]# virt-customize --version
virt-customize 1.37.1fedora=27,release=1.fc27,libvirt

[root@zappa jenkins]# cat /etc/redhat-release
Fedora release 27 (Rawhide)


[    1.166294] Call Trace:
[    1.167372]  ? cfq_get_queue+0x5/0x5e0
[    1.168995]  ? check_blkcg_changed+0xcd/0x4a0
[    1.170919]  ? check_blkcg_changed+0x5/0x4a0
[    1.172824]  ? debug_lockdep_rcu_enabled+0x1d/0x30
[    1.174902]  ? cfq_set_request+0x7c/0x350
[    1.176671]  cfq_set_request+0x27b/0x350
[    1.178390]  ? mark_held_locks+0x5f/0x90
[    1.180113]  ? _raw_spin_unlock_irq+0x2c/0x40
[    1.181953]  ? trace_hardirqs_on_caller+0xf4/0x1b0
[    1.184077]  ? trace_hardirqs_on+0xd/0x10
[    1.185793]  ? _raw_spin_unlock_irq+0x2c/0x40
[    1.187679]  elv_set_request+0x2b/0x60
[    1.189297]  get_request+0x7cf/0xc10
[    1.190893]  ? get_request+0x69/0xc10
[    1.192518]  ? finish_wait+0x90/0x90
[    1.194077]  blk_get_request+0x80/0x110
[    1.195746]  scsi_execute+0x40/0x270
[    1.197296]  scsi_test_unit_ready+0x7d/0xf0
[    1.199120]  sd_check_events+0xf8/0x1b0
[    1.200794]  disk_check_events+0x62/0x150
[    1.202539]  disk_events_workfn+0x1c/0x20
[    1.204266]  process_one_work+0x260/0x750
[    1.205995]  ? process_one_work+0x1db/0x750
[    1.207816]  worker_thread+0x4e/0x4a0
[    1.209425]  ? process_one_work+0x750/0x750
[    1.211242]  kthread+0x12c/0x150
[    1.212643]  ? kthread_create_on_node+0x60/0x60
[    1.214655]  ret_from_fork+0x31/0x40
[    1.216229] Code: 81 ce 60 01 00 00 45 84 f6 0f 45 c6 49 8d 74 24 68 41 89 45 04 e8 fc f5 ff ff 48 8b 8d 48 ff ff ff 49 89 8d e0 00 00 00 4c 8b 21 <41> 8b 84 24 20 01 00 00 85 c0 0f 8e 51 02 00 00 3e 41 ff 84 24
[    1.224444] RIP: cfq_get_queue+0x335/0x5e0 RSP: ffffab73401bb9e8
[    1.227008] ---[ end trace 177699a8fdd3f42c ]---
[    1.229060] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:33
[    1.233027] in_atomic(): 1, irqs_disabled(): 1, pid: 30, name: kworker/0:1
[    1.236091] INFO: lockdep is turned off.
[    1.237752] irq event stamp: 5378
[    1.239214] hardirqs last  enabled at (5377): [<ffffffffab97af9c>] _raw_spin_unlock_irq+0x2c/0x40
[    1.242978] hardirqs last disabled at (5378): [<ffffffffab97ad6f>] _raw_spin_lock_irq+0x1f/0x80
[    1.246670] softirqs last  enabled at (5366): [<ffffffffab980812>] __do_softirq+0x382/0x511
[    1.250323] softirqs last disabled at (5347): [<ffffffffab0ba82f>] irq_exit+0x10f/0x120
[    1.253758] CPU: 0 PID: 30 Comm: kworker/0:1 Tainted: G      D         4.11.0-0.rc1.git3.1.fc27.x86_64 #1
[    1.257874] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-2.fc26 04/01/2014
[    1.261648] Workqueue: events_freezable_power_ disk_events_workfn
[    1.264269] Call Trace:
[    1.265340]  dump_stack+0x8e/0xd1
[    1.266762]  ___might_sleep+0x144/0x260
[    1.268422]  __might_sleep+0x4a/0x80
[    1.269948]  exit_signals+0x33/0x250
[    1.271490]  do_exit+0xc3/0xd80
[    1.272865]  ? process_one_work+0x750/0x750
[    1.274688]  ? kthread+0x12c/0x150
[    1.276278]  rewind_stack_do_exit+0x17/0x20
[    1.278110] note: kworker/0:1[30] exited with preempt_count 2

Comment 3 David Hill 2017-03-14 20:08:37 UTC
I'm wondering whether this is a kernel issue instead. I'm having a hard time getting VMs to spawn. The previous error message is intermittent, so the cause doesn't appear to be the redirection but rather an intermittent kernel crash or a kvm/libvirt issue causing those crashes.

Comment 4 David Hill 2017-03-14 20:13:04 UTC
sudo virt-customize -v -a /var/lib/jenkins/VMs/undercloud-0-ocata.qcow2 --copy-in customize.service:/etc/systemd/system/ --copy-in tmp/S01customize:/etc/rc.d/rc3.d/ --copy-in S01loader:/etc/rc.d/rc3.d/ --root-password password:root --link /etc/systemd/system/customize.service:/etc/systemd/system/multi-user.target.wants/customize.service --copy-in cloud.cfg:/etc/cloud

Comment 5 David Hill 2017-03-14 23:12:09 UTC
It might be because the behavior changed and execution now transitions from jenkins -> root -> qemu user while I'm doing a "sudo virt-customize" on a file located in /var/lib/jenkins/VMs, which is not world readable/writable. I chowned /var/lib/jenkins/VMs to qemu and now it seems to work properly every time. What puzzles me is that, instead of a permission denied error from virt-customize, I get a kernel panic once in a while.
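
One way to check the ownership theory is to list the owner and mode of every directory on the way to the image, since the qemu user needs search (execute) permission on each of them. A minimal sketch, assuming a POSIX shell; list_path_perms is a hypothetical helper and not a libguestfs or libvirt tool.

```shell
# list_path_perms (hypothetical helper): print owner and mode for the
# given path and for every parent directory up to /.
list_path_perms() {
    p=$1
    # Make a relative path absolute so the walk always ends at /.
    case $p in
        /*) ;;
        *) p=$PWD/$p ;;
    esac
    while :; do
        ls -ld -- "$p"
        if [ "$p" = "/" ]; then
            break
        fi
        p=$(dirname -- "$p")
    done
}

# Example invocation (not executed here):
#   list_path_perms /var/lib/jenkins/VMs
```

Any directory in the output that the qemu user cannot search would explain a permission error, though not a guest kernel panic.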

Comment 6 David Hill 2017-03-14 23:32:57 UTC
It's not due to that... sometimes the kernel panics while customizing the qcow2 image.

Comment 7 David Hill 2017-03-14 23:52:56 UTC
I rebooted the server with the previous kernel (4.9.13-201.fc25.x86_64) instead of the fc27 one (4.11.0-0.rc1.git3) and am retrying to reproduce this issue.

Comment 8 David Hill 2017-03-15 00:55:07 UTC
I confirm this seems to be a kernel problem as I'm no longer able to reproduce this issue with 4.9.13-201.

Comment 9 David Hill 2017-03-15 01:25:50 UTC
Finally, I was able to reproduce it with kernel 4.9.13-201.fc25.x86_64 ...
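
Because the failure is intermittent, a single clean run on one kernel proves little. A small loop that repeats the command under a time limit and counts the failures gives a firmer comparison between kernels. This is only a sketch: count_failures is a hypothetical helper, it assumes the timeout utility from GNU coreutils, and the run count and time limit are arbitrary.

```shell
# count_failures (hypothetical helper): run a command RUNS times, each under
# a LIMIT-second time limit, and print how many runs failed or timed out.
count_failures() {
    runs=$1
    limit=$2
    shift 2
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # timeout exits non-zero when the limit is hit, so a hang counts
        # as a failure just like a non-zero exit from the command.
        timeout "$limit" "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Example invocation (not executed here; IMAGE.qcow2 is a placeholder):
#   count_failures 20 300 virt-customize -a IMAGE.qcow2 --root-password password:root
```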

Comment 10 David Hill 2017-03-15 03:14:17 UTC
Created attachment 1263170 [details]
virt-customize and kernel panic

Comment 11 Richard W.M. Jones 2017-04-25 07:45:56 UTC
Sorry for missing this bug.  This is a kernel bug which is now fixed.

*** This bug has been marked as a duplicate of bug 1430297 ***

