On s390x, I'm seeing udevd stay at 50% of the CPU and continually spew the following message to the console:

    udevd: error getting buffer for inotify

I am not sure what's going on, but if you want help debugging this, let me know.
this is a kernel bug, IIRC.. is this an old kernel?
can you strace udevd? might be related to bug #499907
Eric, what do you think?
btw, this is a HUGE problem for S390 debug effort - which is currently blocking rhel6 alpha2. Help appreciated.
Let me see the strace of udevd and we'll see what's going on....
also let me know what the kernel is in question and i'll make sure that kernel doesn't have any of the obvious problems....
Created attachment 362355 udevd.strace.xz
Command used:

    strace -f -F -v -o /udevd.strace udevd --debug --debug-trace
Kernel is 2.6.31-0.174.rc7.git2.fc12.s390x
Not the same problem....

    389 ioctl(6, FIONREAD, [224]) = 0
    389 mmap(NULL, 962072678400, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

Looks like the ioctl gave something reasonable, but udev is trying to allocate a buffer WAY WAY WAY too big....
This is the code in question:

    ssize_t nbytes;

    if ((ioctl(pfd[FD_INOTIFY].fd, FIONREAD, &nbytes) < 0) || (nbytes <= 0))
            return 0;
    buf = malloc(nbytes);
The kernel calls it a size_t rather than ssize_t. No idea how that could cause the problem in question, but I'll check to see if this is a change. If not, I think we need to find a libc person....
RHEL5 kernel called this as an unsigned int....

    962072678400  = 0x000000E000001000
    1168231108608 = 0x0000011000001000

So it looks like some of the higher bits of nbytes are non-zero. I admit I'm confused, I could see that happening on the old setup...
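Splitting the two values above into 32-bit halves may make the pattern easier to see. The following is only an illustrative sketch; the helper names (`high_half`, `low_half`, `store_int_into_wide`) are made up for this example and do not come from udev or the kernel. For the first value the upper half is 0xE0 = 224, the same number strace printed for FIONREAD, and the lower half is 0x1000. That would be consistent with a 4-byte store landing in the most significant half of an 8-byte variable on big-endian s390x, with the other half keeping whatever was there before.

```c
#include <stdint.h>
#include <string.h>

/* Upper 32 bits of a 64-bit value. */
uint32_t high_half(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Lower 32 bits of a 64-bit value. */
uint32_t low_half(uint64_t v)
{
    return (uint32_t)(v & 0xFFFFFFFFu);
}

/*
 * Hypothetical helper: mimic a writer that stores only a 4-byte int at the
 * address of an 8-byte variable whose previous contents were `leftover`.
 * Only the first 4 bytes in memory are overwritten; which half of the
 * 64-bit value that corresponds to depends on the machine's byte order.
 */
uint64_t store_int_into_wide(uint64_t leftover, int32_t value)
{
    uint64_t wide = leftover;
    memcpy(&wide, &value, sizeof value);
    return wide;
}
```

Calling `store_int_into_wide(0x1000, 224)` gives 224 on a little-endian machine, because the leftover low bytes are overwritten, and 0x000000E000001000 on a big-endian one, because the int occupies the high half and the leftover 0x1000 survives in the low half. If that reading is right, zero-initializing the 8-byte variable alone would still give 224 shifted left by 32 bits on big-endian, so passing a real int looks like the safer of the two options.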
ok, I'm back.

RHEL5:

    struct inotify_device {
            [snip]
            unsigned int queue_size; /* size of the queue (bytes) */
            [snip]
    };

    ret = put_user(dev->queue_size, (int __user *) p);

upstream/RHEL6:

    size_t send_len = 0;

    ret = put_user(send_len, (int __user *) p);

In both cases though we should only be writing an int's worth of data. I believe that on s390, and even on x86_64:

    size_t  = unsigned long
    ssize_t = long

but I'm only writing an int back from the kernel. My guess is that those high bits are just stack leftovers. I'm assuming that udev has just always gotten lucky and had 0 there and now we've hit a system where you don't. I'd say either use an int, or explicitly 0 it out.....

-Eric
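For reference, a minimal sketch of the "use an int" option, assuming a Linux-like system where FIONREAD stores an int through the pointer it is given. This is not the upstream udev patch; the function names `pending_bytes` and `alloc_pending_buffer` are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Return the number of bytes currently readable on fd, or -1 on error.
 * The variable handed to FIONREAD is an int, matching the width that
 * the kernel writes back.
 */
int pending_bytes(int fd)
{
    int nbytes = 0;

    if (ioctl(fd, FIONREAD, &nbytes) < 0)
        return -1;
    return nbytes;
}

/*
 * Allocate a buffer sized for the pending data and report its length
 * through len_out. Returns NULL if nothing is pending, on ioctl error,
 * or if malloc fails.
 */
void *alloc_pending_buffer(int fd, int *len_out)
{
    int nbytes = pending_bytes(fd);

    if (nbytes <= 0)
        return NULL;
    *len_out = nbytes;
    return malloc((size_t)nbytes);
}
```

The same check can be tried on any descriptor that supports FIONREAD, for example the read end of a pipe after a few bytes have been written to it. Because the variable is exactly as wide as what the kernel stores, the result does not depend on byte order or on what was previously on the stack.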
upstream patch for udev happened yesterday: http://git.kernel.org/?p=linux/hotplug/udev.git;a=commitdiff;h=4daa146bf71cea174271371a0eb3cf22719a550b;hp=49c3a01d444052169363030dfd996fc7fd6a4fad
should be fixed in udev-145-9.fc12 http://koji.fedoraproject.org/koji/taskinfo?taskID=1707372