Bug 1231709
Summary: | Guest will crash and reboot while shutdown a guest with console configured | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | zhenfeng wang <zhwang> | ||||
Component: | qemu-kvm-rhev | Assignee: | David Gibson <dgibson> | ||||
Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
Severity: | low | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 7.2 | CC: | dgibson, dyuan, dzheng, gsun, hannsj_uhl, jsuchane, juzhang, knoel, lvivier, michen, mzhan, ngu, qzhang, rbalakri, thuth, virt-maint, zhwang | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | ppc64le | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2016-02-22 09:13:00 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1308609, 1359843 | ||||||
Attachments: |
|
Description
zhenfeng wang
2015-06-15 09:26:14 UTC
Can you please attach the full domain XML?

Created attachment 1045328
The guest's xml
I'm unable to reproduce the issue using the domain XML you attached.

Installed packages:

kernel-3.10.0-302.el7.ppc64le
qemu-kvm-rhev-2.3.0-16.el7.ppc64le
libvirt-daemon-1.2.17-4.el7.ppc64le

Can you please try again on an updated host?

Andrea, I can reproduce this issue with the information below.

kernel-3.10.0-306.el7.ppc64le
qemu-kvm-rhev-2.3.0-19.el7.ppc64le
libvirt-1.2.17-6.el7.ppc64le
libvirt-daemon-1.2.17-6.el7.ppc64le

1. I am using the XML attached above, except that this time the disk image uses the virtio bus.

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/virt_test/images/jeos-19-64.qcow2'/>
  <backingStore/>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>

2. Start the guest and configure /boot/grub2/grub.cfg to use 'console=ttyS0,115200 quiet'.

3. Check /proc/cmdline

[root@localhost ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-292.el7.ppc64le root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS0,115200 quiet LANG=en_US.UTF-8

4. Shut down the guest using 'init 0' within the guest or 'virsh shutdown'.

[root@localhost ~]# init 0

5. Check the domain state on the host.

# virsh domstate q2 --reason
running (crashed)

6. Log in to the guest and check with '# cat /var/log/messages' that the same messages are displayed as in the description field.

7. If I change grub.cfg to use 'console=hvc0' as below, then neither 'init 0' nor 'virsh shutdown' causes a crash.

[root@localhost ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-292.el7.ppc64le root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=hvc0 LANG=en_US.UTF-8

This is very weird, and I'm not entirely sure what you are trying to achieve by adding console= to the kernel's command line, but it looks to me like the issue is not in libvirt since the guest is started correctly, so much so that you're checking the kernel's command line from inside it.
Moving this to qemu-kvm-rhev.

There are errors in the provided XML:

- The memory size given is not a multiple of 256 MB.
- A VGA adapter is defined, and I think this confuses the kernel about where to put the console. I think the VGA adapter is not supported for pseries.

David, what is your opinion?

The memory size wouldn't cause this problem. There might be other problems, but there are other BZs to track that.

A VGA console is supported on Power, although it's not expected to be the normal mode of operation. The type of VGA instantiated might matter, though - can we get the qemu command line that this XML is resulting in?

I just installed a RHEL 7.2 20150902 ppc64le guest with a VGA console and tried to reproduce this. I wasn't able to reproduce the problem - it shut down just fine with the graphical console.

kernel-3.10.0-312.el7.ppc64le
qemu-kvm-rhev-2.3.0-21.el7.ppc64le
libvirt-daemon-1.2.17-6.el7.ppc64le

Guest was installed with:

sudo virt-install --name dwg-rhel72-20150902-le --location http://download.eng.rdu2.redhat.com/composes/nightly/RHEL-7.2-20150902.n.0/compose/Server/ppc64le/os/ --memory 2048 --disk size=20,bus=virtio --graphics vnc --network model=virtio,bridge=virbr0

So clearly there must be more factors in reproducing this.

Hi, Nini

Could you give a help on reproducing this bug? Thanks!

(In reply to Qunfang Zhang from comment #10)
> Hi, Nini
>
> Could you give a help on reproducing this bug? Thanks!
I could reproduce the bug with the following qemu cmd:

/usr/libexec/qemu-kvm \
  -name virtioblk-0828-le1 \
  -machine pseries-rhel7.2.0,accel=kvm,usb=off \
  -m 2048 \
  -realtime mlock=off \
  -smp 5,sockets=1,cores=5,threads=1 \
  -uuid cb5bb9dd-3567-464c-8090-4690e80fe363 \
  -no-user-config \
  -nodefaults \
  -monitor stdio \
  -rtc base=utc \
  -no-shutdown \
  -boot strict=on \
  -device spapr-vscsi,id=scsi0,reg=0x1000 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x1 \
  -usb \
  -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=/home/virtioblk-0828-le1 \
  -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=0x10,bootindex=1 \
  -netdev tap,id=hostnet0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
  -device spapr-vlan,netdev=hostnet0,id=net0,mac=52:54:00:c4:e7:15,reg=0x2000 \
  -chardev pty,id=charserial0 \
  -device spapr-vty,chardev=charserial0,reg=0x30000000 \
  -chardev socket,id=charchannel0,path=/home/virt-tests-vm1.org.qemu.guest_agent.0,server,nowait \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \
  -device usb-kbd,id=input0 \
  -device usb-mouse,id=input1 \
  -device usb-tablet,id=input2 \
  -vnc 0.0.0.0:0 \
  -device VGA,id=video0,vgamem_mb=16,bus=pci.0,addr=0x4 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \
  -object rng-random,id=objrng0,filename=/dev/random \
  -device virtio-rng-pci,rng=objrng0,id=rng0,max-bytes=1234,period=2000,bus=pci.0,addr=0x5 \
  -msg timestamp=on

Please note: to reproduce the bug, 'console=ttyS0' must be added to the kernel command line before rebooting the guest. If 'console=hvc0,115200' is used instead of 'console=ttyS0', the bug does not occur.

The detailed sw versions in my test are:

Host kernel: 3.10.0-306.0.1.el7.ppc64le
Guest kernel: 3.10.0-306.0.1.el7.ppc64le
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-22.el7.ppc64le

Ah, I missed the console=ttyS0 in the initial description, sorry. Power guests don't have a true serial port (only the hypervisor console), and so console=ttyS0 can't work.
Attempting to use it should fail more gracefully, so this is a real bug, but I'm deferring to 7.3 and lowering priority because it's not a real use case.

I've tried to re-create the problem, and indeed, the guest also crashes here if I add the "console=ttyS0" kernel option. I changed the XML to "<on_crash>coredump-destroy</on_crash>", but since I didn't get a dump this way, I took it manually after the crash and then loaded the core file into the "crash" utility. The dmesg output ends like this:

[ 29.116588] systemd-shutdown[1]: All filesystems unmounted.
[ 29.116594] systemd-shutdown[1]: Deactivating swaps.
[ 29.116697] systemd-shutdown[1]: All swaps deactivated.
[ 29.116699] systemd-shutdown[1]: Detaching loop devices.
[ 29.117107] systemd-shutdown[1]: All loop devices detached.
[ 29.117109] systemd-shutdown[1]: Detaching DM devices.
[ 29.117340] systemd-shutdown[1]: Detaching DM 253:0.
[ 29.204165] systemd-shutdown[1]: Not all DM devices detached, 1 left.
[ 29.204229] systemd-shutdown[1]: Detaching DM devices.
[ 29.204333] systemd-shutdown[1]: Not all DM devices detached, 1 left.
[ 29.204336] systemd-shutdown[1]: Cannot finalize remaining DM devices, continuing.
[ 29.205328] systemd-shutdown[1]: Failed to acquire terminal: No such device
[ 29.205330] systemd-shutdown[1]: Successfully changed into root pivot.
[ 29.205332] systemd-shutdown[1]: Returning to initrd...
[ 29.206826] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[ 29.206829] CPU: 0 PID: 1 Comm: shutdown Not tainted 3.10.0-327.el7.ppc64le #1
[ 29.206831] Call Trace:
[ 29.206864] [c00000007eec3bb0] [c000000000017760] show_stack+0x80/0x330 (unreliable)
[ 29.206879] [c00000007eec3c60] [c00000000095ea88] dump_stack+0x30/0x44
[ 29.206881] [c00000007eec3c80] [c0000000009548b8] panic+0xf4/0x26c
[ 29.206888] [c00000007eec3d00] [c0000000000da680] do_exit+0xb50/0xb60
[ 29.206890] [c00000007eec3df0] [c0000000000da840] SyS_exit_group+0x50/0xf0
[ 29.206892] [c00000007eec3e30] [c00000000000a17c] system_call+0x38/0xb4
[ 29.206896] Sending IPI to other CPUs
[ 29.207904] IPI complete

When doing a backtrace, the output also does not look very useful:

PID: 1 TASK: c00000007ee80000 CPU: 0 COMMAND: "shutdown"
R0: 0000000024004884 R1: c000000009123cf0 R2: c000000009123130
R3: 0000000000000000 R4: c00000000908d2e0 R5: 0000000000000000
R6: 0018701df9000000 R7: 0000000000000000 R8: 0000000000000000
R9: c000000009042400 R10: 0000000000000001 R11: 00023fecb373fba4
R12: 0000000000000000 R13: c00000000fb80000 R14: 0000000000000000
R15: 0000000009570600 R16: 000000001fff0000 R17: 0000000000000000
R18: 0000000000000000 R19: c00000007ee80000 R20: 0000000000000000
R21: c000000008993778 R22: 0000000000000001 R23: c00000000909bad3
R24: c000000009535070 R25: 0000000000000000 R26: c00000000908d2e0
R27: 0000000000000000 R28: 0000000000000000 R29: c00000000908d2e0
R30: 0000000437cea04c R31: 0000000000000000
NIP: c0000000080946b4 MSR: 8000000100009033 OR3: 0000000000000000
CTR: c00000000874fb80 LR: c00000000874fc38 XER: 0000000000000000
CCR: 0000000024004884 MQ: 0000000000000000 DAR: 0000000000000000
DSISR: 0000000000000000 Syscall Result: 0000000000000000
NIP [c0000000080946b4] (null)
LR [c00000000874fc38] (null)

This is a very hard to debug crash that only occurs if you give an illegal console string on the command line anyway. Really not much point fixing this. |
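
Since the crash is only triggered by a console= argument naming a serial port that a Power guest does not have, it may be useful to check a guest's kernel command line before shutting it down. The following is a minimal Python sketch, not part of the original report. The helper names console_args and unsupported_consoles are hypothetical, and the rule that any ttyS* console is unusable on a pseries guest is an assumption drawn from the comments above, not from a tested tool.

```python
import re

# Assumption (from the comments in this bug): pseries guests expose only the
# hypervisor console (hvc*), so a console=ttyS* argument names a device that
# does not exist in the guest. Both helper names are hypothetical.


def console_args(cmdline):
    """Return the device names of all console= arguments, in order."""
    devices = []
    for token in cmdline.split():
        if token.startswith("console="):
            # Drop options such as ",115200" that follow the device name.
            devices.append(token[len("console="):].split(",", 1)[0])
    return devices


def unsupported_consoles(cmdline):
    """Return the console devices that look like legacy serial ports (ttyS*)."""
    return [dev for dev in console_args(cmdline) if re.fullmatch(r"ttyS\d+", dev)]


# Abridged versions of the two command lines quoted in this report.
crashing = "root=/dev/mapper/rhel-root ro console=ttyS0,115200 quiet LANG=en_US.UTF-8"
working = "root=/dev/mapper/rhel-root ro console=hvc0 LANG=en_US.UTF-8"

print(unsupported_consoles(crashing))  # ['ttyS0']
print(unsupported_consoles(working))   # []
```

Inside a guest, the same check could be applied to the contents of /proc/cmdline instead of the sample strings.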