Created attachment 399848
bash binary from Exherbo Linux

Description of problem:
gdb is unable to debug a certain binary (attached - a build of bash), but only when run inside an LXC container.

Version-Release number of selected component (if applicable):
kernel-PAE-2.6.32.9-70.fc12.i686

How reproducible:
Always

Steps to Reproduce:
1. Set up an LXC chroot containing gdb and the attached binary (e.g. using "mach yum install gdb")
2. Define the LXC chroot in virsh, e.g. virsh define foo.xml for some libvirt configuration file foo.xml (see bug 554203 for an example)
3. virsh --connect lxc:/// start foo && virsh --connect lxc:/// console foo
4. gdb bash.exherbo
5. Type 'r' to run the program

Actual results:
GNU gdb (GDB) Fedora (7.0.1-33.fc12)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /bin/bash...Reading symbols from /usr/lib/debug/bin/bash.debug...done.
done.
(gdb) r
Starting program: /bin/bash
warning: linux_test_for_tracefork: failed to kill second child
waiting for new child: No child processes.
(gdb)

Expected results:
GNU gdb (GDB) Fedora (7.0.1-33.fc12)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/greenrd/exherbo/bin/bash...Missing separate debuginfo for /home/greenrd/exherbo/bin/bash (no debugging symbols found)...done.
(gdb) r
Starting program: /home/greenrd/exherbo/bin/bash
warning: .dynamic section for "/lib/libc.so.6" is not at the expected address
warning: difference appears to be caused by prelink, adjusting expectations
Detaching after fork from child process 740.
Detaching after fork from child process 741.
Detaching after fork from child process 742.
Detaching after fork from child process 744.
Detaching after fork from child process 746.
Detaching after fork from child process 747.
Detaching after fork from child process 748.
Detaching after fork from child process 749.
Detaching after fork from child process 750.
Detaching after fork from child process 751.
Detaching after fork from child process 752.
Detaching after fork from child process 753.
Detaching after fork from child process 754.
Detaching after fork from child process 755.
[root@cspcnh exherbo]#

Additional info:
This bug does not occur on Fedora 13. It does not occur with the /bin/bash from Fedora 12, either (which is why I attached the particular binary it occurs with).
Workaround: use an older kernel, e.g. kernel-PAE-2.6.31.12-174.2.22.fc12.i686
Correction: This bug essentially *does* still occur on Fedora 13 (with a minimal Fedora 13 installed inside the chroot). Unlike on Fedora 12, bash.exherbo appears to start successfully, and the error message "Waiting for child: no such process" only appears when you try to run an external command such as /bin/ls inside bash.exherbo. So the manifestation of the bug is slightly different.
And it turns out this bug also occurs with Fedora 13's /bin/bash on Fedora 13, in the same way I stated in comment#2 - so you don't need to use another distro's bash binary to reproduce this.
Does the bug happen if you use an upstream kernel instead of the fedora kernel? I would suspect the utrace patches as the cause if it doesn't happen on an unmodified kernel.
With upstream kernel 2.6.33.7 (the closest released version available) I get a different error message from gdb:

bash-4.1# ls
Couldn't write debug register: No such process

I've filed that bug upstream as https://bugzilla.kernel.org/show_bug.cgi?id=17281

I didn't add it to the upstream bug field of this bug because it's not necessarily the same issue.
The upstream bug is now fixed in head, by http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=068e35eee9ef98eb4cab55181977e24995d273be and is also fixed in 2.6.35.6. Can this patch be applied to the Fedora kernel, please, to see if it fixes this bug as well?
Issue still exists in F14 with kernel-PAE-2.6.35.6-39.fc14.i686 (note: I tested with the same F13 installation inside the chroot - I only upgraded the host, not the guest). If I am reading the RPM version correctly, the upstream patch should be in that version (unless it's overridden by some Fedora patch), so this looks like it's probably NOT the same bug as the upstream bug.
(In reply to comment #4)
> Does the bug happen if you use an upstream kernel instead of the fedora kernel?
> I would suspect the utrace patches as the cause if it doesn't happen on an
> unmodified kernel.

Yup, I commented out the utrace patches and rebuilt kernel-PAE-2.6.35.11-82.fc14.i686, and this bug disappeared. Without those patches commented out, this bug occurs.
sorry, -83, not -82.
I get the same error (waiting for child: No child processes.) from a very simple pthreads example program, using 2.6.35.12-90.fc14.i686 -- This happens immediately after the call to pthread_create()
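For reference, the program from the comment above was not attached, so the following is only a guess at what a "very simple pthreads example" of this kind might look like. The names worker and run_one_thread are hypothetical. It creates a single thread and joins it, which should be enough to make gdb handle a new-thread event right after pthread_create().

```c
#include <pthread.h>
#include <stddef.h>

/* Thread body: record that the thread actually ran, then exit. */
static void *worker(void *arg)
{
    int *ran = arg;
    *ran = 1;
    return NULL;
}

/* Hypothetical reproducer (not the program from the comment above):
 * create one thread and join it.
 * Returns 0 on success, a pthread error code if create/join failed,
 * or -1 if the thread never ran. */
int run_one_thread(void)
{
    pthread_t tid;
    int ran = 0;

    /* Per the report, gdb printed the "waiting for child" error
     * immediately after this call when run inside the container. */
    int err = pthread_create(&tid, NULL, worker, &ran);
    if (err != 0)
        return err;

    err = pthread_join(tid, NULL);
    if (err != 0)
        return err;

    return ran ? 0 : -1;
}
```

To try it, call run_one_thread() from a main() function, build with something like "gcc -g -pthread", and run the result under gdb inside the LXC guest.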
Oleg, any thoughts on this? See comment #8.
I am puzzled. And I don't know what exactly fails. OK, perhaps gdb tries to attach (or auto-attach) to the forked task. I don't think http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=068e35eee9ef98eb4cab55181977e24995d273be can make any difference. And yes, we have the problems with the pid_ns here, but utrace looks bug-compatible in this respect. I'll try to investigate when I reserve the testing machine.
(In reply to comment #10)
> I get the same error (waiting for child: No child processes.) from a very
> simple pthreads example program, using 2.6.35.12-90.fc14.i686 -- This happens
> immediately after the call to pthread_create()

Aha, I didn't notice this message. So, yes, it seems that something is wrong with pids... May be.

Any chance you can confirm that this doesn't happen without CONFIG_UTRACE or with upstream kernel?
(In reply to comment #12)
> And yes, we have the problems with the pid_ns here, but utrace
> looks bug-compatible in this respect.

Ooh, it is not. Can't understand how I didn't notice this before.

> I'll try to investigate
> when I reserve the testing machine.

unneeded. I'm pretty sure I understand the problem.
[PATCH F-14] bz#573210: ptrace-utrace: fix PTRACE_GETEVENTMSG(pid) in LXC http://lists.fedoraproject.org/pipermail/kernel/2011-August/003340.html
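For readers unfamiliar with the call named in the patch title: a tracer such as gdb asks for fork events with PTRACE_O_TRACEFORK and then reads the new child's pid with PTRACE_GETEVENTMSG, after which it waits on that pid. The sketch below shows that sequence in isolation. It assumes, from the patch title and the "No child processes" message only, that the pid reported inside the container was not one the tracer could wait on; the patch itself is not reproduced here. The helper name fork_event_pid_is_waitable is hypothetical.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not taken from the patch).
 * Returns 1 if the pid reported by PTRACE_GETEVENTMSG for a fork event
 * can be waited on by the tracer, 0 if waitpid() on it fails, and -1 if
 * the tracing setup itself could not be completed. */
int fork_event_pid_is_waitable(void)
{
    pid_t tracee = fork();
    if (tracee < 0)
        return -1;

    if (tracee == 0) {
        /* Tracee: request tracing, stop so the tracer can set options,
         * then fork once to trigger PTRACE_EVENT_FORK. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(2);
        raise(SIGSTOP);
        if (fork() == 0)
            _exit(0);
        _exit(0);
    }

    int status;
    int result = -1;
    unsigned long msg = 0;

    /* Wait for the tracee's initial SIGSTOP. If it exited instead,
     * ptrace is not usable here and there is nothing left to clean up. */
    if (waitpid(tracee, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    if (ptrace(PTRACE_SETOPTIONS, tracee, NULL,
               (void *)(long)PTRACE_O_TRACEFORK) < 0)
        goto out;
    if (ptrace(PTRACE_CONT, tracee, NULL, NULL) < 0)
        goto out;

    /* The next stop should be the fork event. */
    if (waitpid(tracee, &status, 0) < 0 || !WIFSTOPPED(status) ||
        (status >> 8) != (SIGTRAP | (PTRACE_EVENT_FORK << 8)))
        goto out;

    /* Ask the kernel for the pid of the newly forked child. */
    if (ptrace(PTRACE_GETEVENTMSG, tracee, NULL, &msg) < 0)
        goto out;

    /* The new child is attached automatically, so waiting on the
     * reported pid should succeed. A failure with ECHILD here matches
     * the "No child processes" message quoted in this bug. */
    if (waitpid((pid_t)msg, &status, 0) == (pid_t)msg) {
        result = 1;
        kill((pid_t)msg, SIGKILL);
        waitpid((pid_t)msg, &status, 0);
    } else {
        /* Do not signal an unverified pid; just report the mismatch. */
        result = 0;
    }

out:
    kill(tracee, SIGKILL);
    waitpid(tracee, &status, 0);
    return result;
}
```

On a kernel without the problem this should return 1 both on the host and inside the container. A return value of 0 inside the container only would be consistent with the behaviour described in this bug, though that is an inference rather than something the thread confirms.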
I've committed the patch from comment #15. This should be in the next F14 kernel build.
kernel-2.6.35.14-96.fc14 has been submitted as an update for Fedora 14. https://admin.fedoraproject.org/updates/kernel-2.6.35.14-96.fc14
Package kernel-2.6.35.14-96.fc14:
* should fix your issue,
* was pushed to the Fedora 14 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-2.6.35.14-96.fc14'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/kernel-2.6.35.14-96.fc14
then log in and leave karma (feedback).
kernel-2.6.35.14-96.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.