Created attachment 1264708 [details] serial log

Description of problem:
The guest does not reboot after kdump.

Version-Release number of selected component (if applicable):
HOST:
kernel-3.10.0-606.el7.x86_64
kernel-debuginfo-3.10.0-606.el7.x86_64
kernel-debuginfo-common-x86_64-3.10.0-606.el7.x86_64
qemu-kvm-rhev-2.8.0-5.el7.x86_64
GUEST:
kernel-3.10.0-606.el7.x86_64
kexec-tools 2.0.14

How reproducible:
100%

Steps to Reproduce:
1. Boot up the guest:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -vga cirrus \
    -device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel73-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
    -device virtio-net-pci,mac=9a:4d:4e:4f:50:51,id=id3DveCw,vectors=4,netdev=idgW5YRp,bus=pci.0,addr=05 \
    -netdev tap,id=idgW5YRp \
    -m 2048 \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
    -cpu 'SandyBridge',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:localhost:4444,server,nowait

2. In the guest's /etc/kdump.conf, the default action is set to:
default reboot

3. In the guest's /etc/default/grub:
GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel/swap crashkernel=auto rd.lvm.lv=rhel/root rhgb quiet"
Then start kdump:
# service kdump start

4. Trigger a crash in the guest:
# echo c >/proc/sysrq-trigger

Actual results:
The RHEL guest hangs with a black screen.

Expected results:
The guest takes a dump and reboots.

Additional info:
Due to a network issue, I will upload the serial log file later.
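Before triggering the crash, it can help to confirm which action the `default` directive from step 2 is actually set to. The following is only a minimal sketch: the function name `kdump_default_action` is hypothetical, and it assumes the directive is written as `default <action>` on its own line, as in the report.

```shell
#!/bin/bash
# Print the action set by the "default" directive in a kdump.conf-style file.
# kdump_default_action is a hypothetical helper, not part of kexec-tools.
kdump_default_action() {
    local conf="${1:-/etc/kdump.conf}"
    # Commented-out lines never match, because their first field starts
    # with '#'. If the directive appears more than once, the last one wins.
    awk '$1 == "default" { action = $2 } END { if (action != "") print action }' "$conf"
}
```

Calling `kdump_default_action` with no argument reads /etc/kdump.conf; for the configuration in this report it should print `reboot`.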
Created attachment 1264893 [details] serial log with -serial stdio

When I add "-serial stdio" to the qemu command line, the guest takes a dump and reboots. If I don't, it hangs at a black screen.
Created attachment 1264894 [details] log monitor stdio
Created attachment 1264900 [details] qemu 2.6.0.27 serial log

I tried with qemu-kvm-rhev-2.6.0-27.el7.x86_64.
If I use "-serial stdio" in the qemu command line, the guest takes a dump and reboots.
If I use "-monitor stdio" in the qemu command line, the guest hangs at a black screen.
Created attachment 1264901 [details] qemu-kvm-rhev-2.6.0-27.el7.x86_64 monitor stdio log
Created attachment 1264902 [details] /etc/kdump.conf

I tested with the same kernel and configuration but different qemu versions:

1. With qemu-kvm-rhev-2.8.0-6.el7, for RHEL guests:

1.1 Dump the core to /var/crash in the guest.
I triggered a crash using
# echo c >/proc/sysrq-trigger
The RHEL guest hangs with a black screen. (It should take a dump and reboot.)

1.2 Guest kdump over ssh.
Edit /etc/kdump.conf:
ssh root.73.85    <-- host ip
sshkey /root/.ssh/id_rsa
path /var/crash
core_collector makedumpfile -F -l --message-level 1 -d 31
default reboot
# service kdump start
Then I triggered a crash using
# echo c >/proc/sysrq-trigger
The RHEL guest hangs with a black screen. (It should take a dump and reboot.)

2. With qemu-kvm-rhev-2.6.0-27.el7, for RHEL guests:

2.1 Dump the core to /var/crash in the guest.
I triggered a crash using
# echo c >/proc/sysrq-trigger
The RHEL guest hangs with a black screen. (It should take a dump and reboot.)

2.2 Guest kdump over ssh.
Edit /etc/kdump.conf:
ssh root.73.85    <-- host ip
sshkey /root/.ssh/id_rsa
path /var/crash
core_collector makedumpfile -F -l --message-level 1 -d 31
default reboot
# service kdump start
Then I triggered a crash using
# echo c >/proc/sysrq-trigger
The RHEL guest takes a dump and reboots.

***I suspect it is a qemu bug***
After editing /etc/kdump.conf, you have to (re)start kdump. kdump will then regenerate the initrd, packaging the updated version of /etc/kdump.conf. I assume that this was done in your case.

However, I wonder if there is a general problem. I set it to "default shell", restarted kdump, and made sure that the updated config file ended up in the initrd. There was no way of stopping kdump from rebooting the guest. The default parameter just got ignored.
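One way to check that the updated config file ended up in the initrd is to print it from the image with `lsinitrd` (shipped with dracut). This is a sketch under assumptions, not necessarily what was done above: the image path `/boot/initramfs-<kernel version>kdump.img` and the in-image path `etc/kdump.conf` are assumed RHEL 7 conventions, and both function names are hypothetical.

```shell
#!/bin/bash
# Build the path of the kdump initramfs for a kernel version.
# Assumption: RHEL 7 naming, /boot/initramfs-<version>kdump.img.
kdump_initrd_path() {
    local kver="${1:-$(uname -r)}"
    printf '/boot/initramfs-%skdump.img\n' "$kver"
}

# Print the "default" line of the kdump.conf packed into the kdump initramfs.
# Needs lsinitrd and read access to the image, so usually run as root.
# Assumption: the config is stored as etc/kdump.conf inside the image.
show_packed_default_action() {
    local img
    img="$(kdump_initrd_path "$1")"
    lsinitrd -f etc/kdump.conf "$img" | grep '^default'
}
```

After `service kdump restart`, running `show_packed_default_action` with no argument inspects the image for the running kernel; if the line it prints still shows the old action, the initrd was not regenerated.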
Hi hachen.

I used the same qemu version and guest kernel/kexec-tools as you reported, but I failed to reproduce the bug with the command below. (The command is copied from yours except for the network configuration.)

Since the guest hangs with a black screen, could you test with the following steps:
1. Insert "gdb --args" before your command line.
2. Set a breakpoint with "break pc_machine_reset".
3. Run.

When the guest boots up, gdb will hit the breakpoint; you can ignore it. But after you "echo c > /proc/sysrq-trigger", please note whether the breakpoint is hit or not.

I will do further analysis and debugging based on the result.

Thx,
Pingfan

--- cmd I used ---
gdb --args \
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -vga cirrus \
    -device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=raw,file=$guest_img \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
    -net nic,model=virtio,macaddr=$(< /sys/class/net/macvtap0/address) \
    -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/macvtap0/ifindex) \
    -m 2048 \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
    -cpu 'SandyBridge',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :2 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio
pc_machine_reset
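The three gdb steps above can also be folded into a single invocation with gdb's `-ex` option, which runs a gdb command at startup. This is only a sketch: the wrapper name `run_with_reset_breakpoint` is hypothetical, and it assumes debug symbols for qemu-kvm are installed so that `pc_machine_reset` can be resolved.

```shell
#!/bin/bash
# Start a program under gdb with a breakpoint on pc_machine_reset already set.
# run_with_reset_breakpoint is a hypothetical wrapper name.
run_with_reset_breakpoint() {
    # -ex runs the given gdb command at startup, in order;
    # --args hands everything after it to the debugged program.
    gdb -ex 'break pc_machine_reset' -ex 'run' --args "$@"
}
```

Usage would be `run_with_reset_breakpoint /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' ...` with the rest of the qemu options unchanged. Each time the breakpoint is hit, `continue` at the gdb prompt resumes the guest.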
After logging in to the buggy system, I found that the 2nd kernel does not boot up. Also, gdb does not hit the breakpoint at pc_machine_reset a second time.

This is strange; I need more time to debug.

Thx,
Pingfan
Hi hachen,

Could you help retest this problem with the fixed package kexec-tools-2.0.14-7.el7?

--
Thanks,
Qiao
I tested on
host:
kernel-3.10.0-656.el7.x86_64
kernel-debuginfo-3.10.0-656.el7.x86_64
kernel-debuginfo-common-x86_64-3.10.0-656.el7.x86_64
kexec-tools-2.0.14-7.el7
qemu-kvm-rhev-2.9.0-5.el7.x86_64

guest:
kernel-3.10.0-656.el7.x86_64
kexec-tools-2.0.14-7.el7

It works: the guest reboots after the dump.
(In reply to hachen from comment #12)
> I test on
> host:
> kernel-3.10.0-656.el7.x86_64
> kernel-debuginfo-3.10.0-656.el7.x86_64
> kernel-debuginfo-common-x86_64-3.10.0-656.el7.x86_64
> kexec-tools-2.0.14-7.el7
> qemu-kvm-rhev-2.9.0-5.el7.x86_64
>
> guest:
> kernel-3.10.0-656.el7.x86_64
> kexec-tools-2.0.14-7.el7
>
> It works as the guest reboot after dump.

Thanks! I really appreciate it. Moving to Verified.

--
Thanks,
Qiao
(In reply to hachen from comment #4)
> Created attachment 1264900 [details]
> qemu 2.6.0.27 serial log
>
> I tried with qemu-kvm-rhev-2.6.0-27.el7.x86_64.
> If I use "-serial stdio" in qemu cmd, the guest takes dump and reboot.
> if I use "-monitor stdio" in qemu cmd, the guest hangs at the black screen.

I think that in the description you missed something about the kernel cmdline. In it, you used "console=tty0 console=ttyS0", so when you ran qemu without "-serial stdio" (i.e. the VM does not implement a serial device), kdump failed.

Regards,
Pingfan
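To see which consoles a guest kernel was actually asked to use, the `console=` arguments can be pulled out of the kernel command line. The helper below is a minimal sketch; the name `list_consoles` is hypothetical, and it only lists what is on the command line without checking whether the device exists in the VM.

```shell
#!/bin/bash
# List the devices named by console= arguments on a kernel command line.
# With no argument, the running kernel's /proc/cmdline is read.
# list_consoles is a hypothetical helper name.
list_consoles() {
    local cmdline="${1:-$(cat /proc/cmdline)}"
    local arg
    # Rely on word splitting to walk the space-separated arguments.
    for arg in $cmdline; do
        case "$arg" in
            console=*) printf '%s\n' "${arg#console=}" ;;
        esac
    done
}
```

Running `list_consoles` inside the guest prints one device per line. If `ttyS0` is listed while qemu was started with `-nodefaults` and without a `-serial` option, that matches the situation described in the comment above.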
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2300
(In reply to Pingfan Liu from comment #14)
> (In reply to hachen from comment #4)
> > Created attachment 1264900 [details]
> > qemu 2.6.0.27 serial log
> >
> > I tried with qemu-kvm-rhev-2.6.0-27.el7.x86_64.
> > If I use "-serial stdio" in qemu cmd, the guest takes dump and reboot.
> > if I use "-monitor stdio" in qemu cmd, the guest hangs at the black screen.
>
> I think that in description, you miss something for the kernel cmdline. In
> it, you used "console=tty0 console=ttyS0", so when you tried qemu without
> "-serial stdio"(i.e. the VM does not implement serial device), the kdump
> failed.
>
> Regards,
> Pingfan

In comment #2 I added "-serial stdio" because, I think, someone had asked for the serial log. In my test cases I normally use "-monitor stdio", as I posted in the Description.

When I first reported this bug, I booted the guest with "-monitor stdio", then ran
# service kdump start
and next ran
# echo c > /proc/sysrq-trigger
to trigger the dump. At that time, the guest hung with a black screen.

The GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel/swap crashkernel=auto rd.lvm.lv=rhel/root rhgb quiet" line was included there as additional information; I did not change anything.

After the fix, when I follow the same steps, the guest reboots.

I hope this makes the bug clear.

Thanks,
Haotong