Bug 1097779
Summary: | gdb no longer works to debug qemu: Remote 'g' packet reply is too long | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Robin Hack <rhack> |
Component: | qemu | Assignee: | Fedora Virtualization Maintainers <virt-maint> |
Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 20 | CC: | amit.shah, berrange, cfergeau, ciro.santilli, dwmw2, gbenson, itamar, jan.kratochvil, palves, patrickm, pbonzini, pmuldoon, rhack, rjones, sassmann, scottt.tw, sergiodj, todoleza, virt-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2015-06-29 20:37:08 UTC | Type: | Bug |
Description
Robin Hack
2014-05-14 13:56:35 UTC
I have also seen this. It's very annoying because debugging qemu is a useful feature. I suspect it's a bug in gdb however.

Sounds like you didn't specify the binary to gdb before connecting.

Shouldn't the error message be "you didn't specify the binary" instead of "remote 'g' packet reply is too long"? I'm fairly sure when I did this, I did specify the vmlinux, but I would have to go back and check that.

Here is a small, self-contained reproducer. You will first need to install the 'kernel' and 'kernel-debuginfo' packages, and alter the version strings below according to the kernel version you actually installed.

In one window, run:

sudo qemu-system-x86_64 -s -S -kernel /boot/vmlinuz-3.14.3-200.fc20.x86_64 -initrd /boot/initramfs-3.14.3-200.fc20.x86_64.img

In a second window, run:

$ gdb /usr/lib/debug/lib/modules/3.14.3-200.fc20.x86_64/vmlinux
GNU gdb (GDB) Fedora 7.6.50.20130731-19.fc20
Reading symbols from /usr/lib/debug/lib/modules/3.14.3-200.fc20.x86_64/vmlinux...done.
(gdb) target remote :1234
Remote debugging using :1234
0x0000000000000000 in irq_stack_union ()
(gdb) cont
Continuing.

The virtual machine should start booting. At some point hit ^C in the gdb window, and you will see the error:

^CRemote 'g' packet reply is too long: [...]

Also, if you kill qemu, then gdb segfaults, but that's probably a different bug.

qemu-system-x86-2.0.0-0.1.rc0.fc21.x86_64
gdb-7.6.50.20130731-19.fc20.x86_64

(In reply to palves from comment #2)
> Sounds like you didn't specify the binary to gdb before connecting.

Yes. I didn't specify the vmlinux binary before connecting. I think a more user-friendly approach (one that also does not leave the machine dead, from my point of view) would be better.

Additional info: With vmlinux specified:

# gdb vmlinux
...
(gdb) target remote localhost:1234

Works almost perfectly. This is annoying too:

Program received signal SIGINT, Interrupt.
native_safe_halt () at /usr/src/debug/kernel-3.13.fc20/linux-3.13.5-200.fc20.x86_64/arch/x86/include/asm/irqflags.h:50 50 in /usr/src/debug/kernel-3.13.fc20/linux-3.13.5-200.fc20.x86_64/arch/x86/include/asm/irqflags.h but is relative easy to solve. (In reply to Richard W.M. Jones from comment #4) > Here is a small, self-contained reproducer. You will first need to > install 'kernel' and 'kernel-debuginfo' packages, and alter the > version strings below according to what kernel version you actually > installed. I can reproduce it. And indeed qemu grows the g packet reply for some (broken) reason: Right after connection: Sending packet: $g#67...Ack Packet received: 0000000000000000230600000000000000000000000000000000000000000000f0ff00000200000000f00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000 I let the kernel boot a bit, and ctrl-c. Still the same size. 
Let it boot some more, and then: Sending packet: $g#67...Ack Packet received: 6ca40400000000007d6000000000000071000000000000005313a081ffffffffe53c58060088ffff7b00000000000000d03c58060088ffffd03c58060088ffff0100000000000000ceba9781ffffffffd4ba9781ffffffff000000000000000060a4040000000000ab1ea281ffffffff8f1101000000000000000000000000001d721081ffffffff0602000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f0300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000 So indeed GDB is right in its complaint: ^CRemote 'g' packet reply is too long: [...] Sounds like something changed on the qemu end. (In reply to Richard W.M. Jones from comment #4) > Also if you kill qemu, then gdb segfaults but that's probably > a different bug. Yeah. I couldn't reproduce this one. But definitely a different bug. (In reply to Richard W.M. Jones from comment #3) > Shouldn't the error message be "you didn't specify the binary" > instead of "remote 'g' packet reply is too long"? Thing is, specifying a binary is not always required, if the server sends all the necessary bits. I see that qemu doesn't send a xml target (registers) description, for example. Actually I'm not coming up with a reason you'd see that on initial connection if you don't specify a binary. 
Absent a description, GDB tries to figure out the architecture from the g packet size. So if you do see that on initial connection without a binary, then that's something to look at. And without a binary, indeed I don't get the error immediately: (gdb) tar remote :1234 Remote debugging using :1234 0x0000fff0 in ?? () only if I let qemu boot for a little while, then I get: (gdb) c Continuing. ^CRemote 'g' packet reply is too long: d0bfc7c0ff7f00000b000000000000000000000000000000d0bfc7c0ff7f00000000000000000000e6bfc7c0ff7f000060c5c7c0ff7f0000d0bfc7c0ff7f00000090314d6d7f0000a0b9314d6d7f000000a0314d6d7f000046020000000000000b00000000000000f0bfc7c0ff7f0000d095314d6d7f00000000000000000000fde3104d6d7f000002020000330000002b0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f03000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002424242424242424242424242424242400000000000000000000000000000000000000000000000000000000000000ff000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000 I wonder whether this isn't the old problem of mode switching? As in, first qemu is reporting 32-bit mode registers, and then later 64-bit mode? https://sourceware.org/ml/gdb/2009-01/msg00008.html (BTW, GDB has indeed become multi-arch since and every frame has an architecture associated with it, but, there's no x86 support for such in the RSP currently.) 
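To make the mode-switching suspicion concrete, here is a rough sketch of how much a 'g' reply grows when a stub moves from a 32-bit to a 64-bit x86 register layout. The register groups are an assumption (the usual core plus SSE sets: general registers, instruction pointer, flags, segment registers, x87 state, XMM registers, mxcsr); they are not read out of QEMU or GDB, and `layout_size` is a hypothetical helper.

```python
# Hypothetical sketch: byte size of a 'g' reply for assumed x86 layouts.
# The register groups are an assumption, not taken from QEMU or GDB sources.

def layout_size(gpr_count, gpr_bytes, xmm_count):
    """Return the byte size of an assumed x86 register layout."""
    general = gpr_count * gpr_bytes   # eax..edi or rax..r15
    pc = gpr_bytes                    # eip or rip
    eflags = 4
    segments = 6 * 4                  # cs, ss, ds, es, fs, gs
    x87_stack = 8 * 10                # st0..st7, 80 bits each
    x87_control = 8 * 4               # fctrl, fstat, ftag and friends
    xmm = xmm_count * 16              # 128-bit XMM registers
    mxcsr = 4
    return (general + pc + eflags + segments
            + x87_stack + x87_control + xmm + mxcsr)

I386_BYTES = layout_size(gpr_count=8, gpr_bytes=4, xmm_count=8)
X86_64_BYTES = layout_size(gpr_count=16, gpr_bytes=8, xmm_count=16)

# Every register byte is transmitted as two hex digits.
I386_HEX_CHARS = 2 * I386_BYTES
X86_64_HEX_CHARS = 2 * X86_64_BYTES
```

Under these assumptions the 32-bit layout comes to 308 bytes and the 64-bit layout to 536 bytes (1072 hex digits), so a debugger that sized its buffer for the smaller layout would find the later reply too long.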
When a virtual machine (or indeed, a real PC) boots, it switches through 16 bit (8086), 32 bit and 64 bit (long) modes, as a kind of "ontogeny recapitulates phylogeny".

Yes, but from Jan's old post I thought qemu nowadays stuck with the 64-bit layout, always.

"(*) QEMU recently decided to stick with 64 bit layout even if the x86-64 target is running in 16 or 32 bit mode. Before that the remote protocol used to be switched between 32 and 64 bit dynamically, depending on the current target mode. That solved many issues, but not all (manual 'set arch' was required, and gdb became confused in a few cases). We are now discussing again on qemu-devel how to deal with 16/32 bit system-level debugging in 64 bit emulation environment: either try to improve gdb quickly or reintroduce the old workaround, at least temporarily."

That was in 2009. Sounds like qemu decided to change this back? Or is this not a new bug? Did this ever work before?

It worked until fairly recently, probably within the last two months.

OK. Would be good to see the RSP log with an older set of tools that worked. I suspect the change was on the qemu side, but someone should really bisect qemu/gdb looking for whatever change is causing this. Can't be me though.

(In reply to Pedro Alves from comment #14)
> OK. Would be good to see the RSP log with an older set of tools that
> worked. I suspect the change was on the qemu side, but someone should
> really bisect qemu/gdb looking for whatever change is causing this. Can't
> be me though.

Hi. I can try. I have time and resources.
Is this also an upstream bug?

(In reply to Robin Hack from comment #15)
> (In reply to Pedro Alves from comment #14)
> > OK. Would be good to see the RSP log with an older set of tools that
> > worked. I suspect the change was on the qemu side, but someone should
> > really bisect qemu/gdb looking for whatever change is causing this. Can't
> > be me though.
>
> Hi. I can try. I have time and resources.
> Is this also an upstream bug?
Pretty sure yes. Fedora follows upstream qemu very closely. Well. After little bit testing I found maybe interesting think. I have x86_64 host and guest. Guest is fully booted (64-bit protected mode). (gdb) target remote :1234 Remote debugging using :1234 312 -> rsa->sizeof_g_packet 1072 - packet size Remote 'g' packet reply is too long: edffffff00000000d81fc081ffffffff0000000000000001000000000000000000000000000000004600000000000000981ec081ffffffff981ec081ffffffff000000000000000000000000000000000100000000000000a879ed81ffffffff0000000000000000d81fc081ffffffffc002dd81ffffffffd81fc081ffffffff66450581ffffffff8602000010000000180000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f03000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002f2f2f2f2f2f2f2f2f2f2f2f2f2f2f2f00000000000000000000000000000000ff0000000000000000000000000000ff000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000 (gdb) set architecture i386:x86-64 The target architecture is assumed to be i386:x86-64 (gdb) target remote :1234 Remote debugging using :1234 544 - rsa->sizeof_g_packet 1072 - packet size 0xffffffff81054566 in ?? () (In reply to Robin Hack from comment #17) > Well. After little bit testing I found maybe interesting think. *thing of course! 
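The numbers in the debug output above can be checked by hand. The sketch below assumes that 1072 is the reply length in hex digits and that GDB rejects a reply longer than twice `sizeof_g_packet`; that comparison is my reading of the error message, not a quote of GDB's source. It also decodes one register field from the dump, assuming little-endian byte order. The function names are hypothetical.

```python
def reply_too_long(hex_len, sizeof_g_packet):
    """Assumed check: a reply carries two hex digits per register byte."""
    return hex_len > 2 * sizeof_g_packet

def decode_le(hex_field):
    """Decode one register field transmitted as little-endian hex."""
    return int.from_bytes(bytes.fromhex(hex_field), "little")

REPLY_HEX_LEN = 1072      # "packet size" from the debug output
SIZEOF_G_I386 = 312       # sizeof_g_packet before 'set architecture'
SIZEOF_G_X86_64 = 544     # sizeof_g_packet with i386:x86-64 selected

too_long_i386 = reply_too_long(REPLY_HEX_LEN, SIZEOF_G_I386)
too_long_x86_64 = reply_too_long(REPLY_HEX_LEN, SIZEOF_G_X86_64)

# Field copied from the dump; gdb later stops at 0xffffffff81054566.
pc = decode_le("66450581ffffffff")
```

With these inputs the reply (536 bytes) exceeds the 312-byte i386 buffer but fits in the 544-byte x86-64 one, which matches the behaviour before and after `set architecture i386:x86-64`, and the decoded field equals the address gdb printed.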
I dumped tcp connection between gdb and qemu gdb: gdb sends: +$qSupported:multiprocess+;xmlRegisters=i386;qRelocInsn+#b5+$Hg0#df+$qTStatus#49+$?#3f+$Hc-1#09+$qAttached#8f+$g#67+ gdb receive: +$PacketSize=1000#f1+$OK#9a+$#00+$T05thread:01;#07+$OK#9a+$#00+$edffffff00000000d81fc081ffffffff0000000000000001000000000000000000000000000000004600000000000000981ec081ffffffff981ec081ffffffff00000000000000000000000000000000000000000000000000000000000000000000000000000000d81fc081ffffffffc002dd81ffffffffd81fc081ffffffff66450581ffffffff8602000010000000180000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f03000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002f2f2f2f2f2f2f2f2f2f2f2f2f2f2f2f00000000000000000000000000000000ff0000000000000000000000000000ff000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000#55 QEMU recognizes if remote gdb understand XML just by checking qXfer:features:read: in query packet. 
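The capture above can be picked apart with a few lines of code. This is a minimal sketch, assuming the standard remote serial protocol framing (`$payload#xx`, where `xx` is the modulo-256 sum of the payload bytes in hex); the helper names are hypothetical, and the long 'g' reply is left out of the sample for brevity.

```python
import re

def rsp_checksum(payload):
    """Modulo-256 sum of the payload bytes, as two lowercase hex digits."""
    return format(sum(payload.encode("ascii")) % 256, "02x")

def split_packets(stream):
    """Yield (payload, checksum_ok) for each $payload#xx found in a capture."""
    for payload, checksum in re.findall(r"\$([^#$]*)#([0-9a-fA-F]{2})", stream):
        yield payload, rsp_checksum(payload) == checksum.lower()

def advertises_xml(qsupported_reply):
    """True if a qSupported reply lists qXfer:features:read+."""
    return "qXfer:features:read+" in qsupported_reply.split(";")

# Short packets copied from the capture (acks '+' are skipped by the regex).
sent = ("+$qSupported:multiprocess+;xmlRegisters=i386;qRelocInsn+#b5"
        "+$Hg0#df+$qTStatus#49+$?#3f+$Hc-1#09+$qAttached#8f+$g#67+")
received = "+$PacketSize=1000#f1+$OK#9a+$#00+$T05thread:01;#07+$OK#9a+$#00"

requests = [payload for payload, _ in split_packets(sent)]
replies = [payload for payload, _ in split_packets(received)]
stub_has_xml = advertises_xml(replies[0])
```

The first reply is the stub's answer to `qSupported`; it contains only `PacketSize=1000`, so `stub_has_xml` is false here, and the empty replies (`$#00`) are the stub's way of saying a request is not supported.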
Then I just enabled that feature:

(gdb) set remote target-features-packet on

And found:

(gdb) target remote :1234
Remote debugging using :1234
Enabled packet qXfer:features:read (target-features) not recognized by stub

I dug into the qemu code (v1.6.2) and found (file: gdbstub.c):

    if (strncmp(p, "Supported", 9) == 0) {
        snprintf(buf, sizeof(buf), "PacketSize=%x", MAX_PACKET_LENGTH);
        cc = CPU_GET_CLASS(first_cpu);
        if (cc->gdb_core_xml_file != NULL) {
            pstrcat(buf, sizeof(buf), ";qXfer:features:read+");
        }
        put_packet(s, buf);
        break;
    }
    if (strncmp(p, "Xfer:features:read:", 19) == 0) {
        const char *xml;
        target_ulong total_len;
        cc = CPU_GET_CLASS(first_cpu);
        if (cc->gdb_core_xml_file == NULL) {
            goto unknown_command;
        }
        gdb_has_xml = true;

But it looks like (cc->gdb_core_xml_file == NULL) is always true and unknown_command is always reached on my system.

Therefore cc->gdb_core_xml_file is NULL while it should not be NULL. This seems to be a qemu problem. Or maybe even just a qemu build configuration/installation problem.

Yes. I agree.
1) cc->gdb_core_xml_file is NULL on the x86_64 arch
2) I don't see arch-specific XML files in gdb-xml (the gdbstub-xml.c files are generated from the XML files in the gdb-xml directory)

This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Similar on the GDB bugtracker: https://sourceware.org/bugzilla/show_bug.cgi?id=13984t