Bug 442315
| Field | Value |
|---|---|
| Summary | FATAL: Error inserting ecryptfs (/lib/modules/2.6.18-81.el5/kernel/fs/ecryptfs/ecryptfs.ko): Input/output error |
| Product | Red Hat Enterprise Linux 5 |
| Component | kernel |
| Version | 5.2 |
| Hardware | All |
| OS | Linux |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | medium |
| Target Milestone | rc |
| Doc Type | Bug Fix |
| Reporter | Michal Nowak <mnowak> |
| Assignee | Tom Coughlan <coughlan> |
| QA Contact | Red Hat Kernel QE team <kernel-qe> |
| CC | coughlan, dchapman, duck, dzickus, esandeen, james.smart, luyu, ohudlick |
| Last Closed | 2009-06-25 10:45:41 UTC |

Description
Michal Nowak
2008-04-14 09:52:43 UTC
But runs OK on intel-s6e5132-01.rhts.boston.redhat.com with -89 kernel.

So far runs OK on all the other systems I have tried this on. I am reserving hp-sapphire-01 now to see if the problem is unique to that box, but I have a suspicion that this is more related to network configuration (based on the netlink error). Had any networking tests, or something that may have done something "special" with the networking, been run on this box prior to seeing the error?

> Had any networking tests or something that may have done
> something "special" with the networking been done on this box prior to seeing the
> error?

Don't think so. It happened right after restart. Basically the networking
worked; I was logged in via ssh.

Please post dmesg after you see "Error inserting ecryptfs...".

Unfortunately, we have only one HP Sapphire machine in the lab, and it is booked. I have put my reservation job in the queue and will wait for it; I don't know how long it might take. If anyone else gets onto the machine first, feel free to post the kring data here.

Created attachment 303133
kring from HP Sapphire

Does this happen every time? I wonder why the netlink socket creation fails (this is below ecryptfs, FWIW...) Instrumenting to find where it failed would probably be instructive...

(In reply to comment #8)
> Does this happen every time? I wonder why the netlink socket creation fails
> (this is below ecryptfs, FWIW...) Instrumenting to find where it failed would
> probably be instructive...

Yes, every time. I am sorry, but I am probably unable to provide more information. I just picked up the machine from RHTS, modprobed the ecryptfs kernel module, and got the error message on stdout, in /var/log/messages, and in kring/dmesg. I saw it happen only on this machine and only with this module. For more information, please reserve the machine in the RHTS lab and take a look yourself.

There are a few possibilities:

* No networking support in the kernel.
* Low memory.
* Security policy.

I don't see any possibility that is IA64 arch specific so far... So it should be reproducible on other platforms with similar configurations... Please verify whether any possibility above applies.

(In reply to comment #10)
> There are a few possibilities:
>
> * no networking support in the kernel.

I am connected to the server via SSH; there are eth0-3, with eth0 active.

> * Low memory.

Don't think so.

    [root@hp-sapphire-01 ~]# free
                 total       used       free     shared    buffers     cached
    Mem:     100125888    1033856   99092032          0      31120     230080
    -/+ buffers/cache:     772656   99353232
    Swap:      4194272          0    4194272

> * Security policy.

    [root@hp-sapphire-01 ~]# ausearch -m avc -ts recent
    <no matches>

That is after modprobing the module. Is there another "Security policy" I should check?

> I don't see any possibility that is IA64 arch specific so far...
> So it should have chance to be reproduced on other platforms with similar
> configurations...

Nor do I. (Flipped to "All".)

Any progress on this?

Finally got my hands on hp-sapphire-01; I can't reproduce this on any other system (no idea why).

Here is what I am seeing in brief; I need to dig deeper to understand the details. Down in the netlink layer, netlink_insert is returning -EADDRINUSE. The call stack is:

    netlink_insert
    netlink_kernel_create
    ecryptfs_init_netlink
    ecryptfs_init_messaging
    ecryptfs_init

I have no idea why we are seeing this and why it is only this system.

Ah HA!!!!! This is due to an UGLY hack in the lpfc driver. Both ecryptfs and lpfc are using 19 as the unit number for netlink. ecryptfs is doing the right thing by adding NETLINK_ECRYPTFS to linux/netlink.h; however, lpfc has this hack in drivers/scsi/lpfc/Makefile:

    EXTRA_CFLAGS += -DNETLINK_FCTRANSPORT=19

So you will run into this bug on any system that has an Emulex FC adapter.

Chip,
Not sure if you "own" lpfc but I am guessing you are the right owner for this?
- Doug

Hey, thanks for sorting that out :) FWIW, eCryptfs in RHEL5.3 and beyond should not be using netlink anymore, at least by default... Probably still worth fixing, though.

Pretty good work Doug! Thanks for the resolution. Worth fixing.

Adding James Smart at Emulex to the CC: list.

James: it seems we have a collision in netlink unit numbers between the lpfc driver and the ecryptfs filesystem. Can we move lpfc to a different number or will that break management apps?

Chip

Yep. This is a shortcoming of our driver. The driver that will be submitted for 5.3 changed the netlink unit number from 19 to 25. Is this acceptable? We'd really like to use the just-pushed-upstream patch for driver-specific netlink messages that use the scsi unit number, but that requires a change that likely makes a binary interface change - thus I'm assuming it can't be done.

coughlan: How can I help you?

(In reply to comment #21)
> coughlan: How can I help you?

The fix is in snapshot 2, kernel 2.6.18-122.el5. Please test this and confirm the fix.

Looks like we don't have this box in RHTS anymore. Probably can't help.

We have it actually. Just tested with -153.el5 - works out of the box. Closing.
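
For readers following the diagnosis, the unit-number collision can be sketched as a small userspace model. This is only an illustration, not kernel code: the `NetlinkTable` class, its `claim_unit` method, and the `load_modules` helper are hypothetical names invented here, and the real `netlink_kernel_create` has a different signature and return convention. The only facts taken from this report are the unit numbers (19 for both drivers before the fix, 25 for lpfc afterwards) and the `-EADDRINUSE` result seen in `netlink_insert`.

```python
import errno

# Unit numbers as described in this report.
NETLINK_ECRYPTFS = 19         # defined by ecryptfs in linux/netlink.h
NETLINK_FCTRANSPORT_OLD = 19  # forced by lpfc via EXTRA_CFLAGS (the collision)
NETLINK_FCTRANSPORT_NEW = 25  # value Emulex reported for the 5.3 driver


class NetlinkTable:
    """Hypothetical stand-in for the kernel's table of in-kernel netlink units."""

    def __init__(self):
        self._owners = {}  # unit number -> name of the module holding it

    def claim_unit(self, unit, owner):
        """Return 0 on success, or -EADDRINUSE if the unit is already held."""
        if unit in self._owners:
            return -errno.EADDRINUSE
        self._owners[unit] = owner
        return 0


def load_modules(modules):
    """Claim units for (owner, unit) pairs in order; map owner -> result code."""
    table = NetlinkTable()
    return {owner: table.claim_unit(unit, owner) for owner, unit in modules}


if __name__ == "__main__":
    # lpfc loads first on a box with an Emulex FC adapter, then ecryptfs.
    print(load_modules([("lpfc", NETLINK_FCTRANSPORT_OLD),
                        ("ecryptfs", NETLINK_ECRYPTFS)]))
    print(load_modules([("lpfc", NETLINK_FCTRANSPORT_NEW),
                        ("ecryptfs", NETLINK_ECRYPTFS)]))
```

In the first call the second claimant of unit 19 gets a negative `EADDRINUSE` code, which matches why only machines with lpfc loaded showed the failure; in the second call both claims succeed because the units no longer overlap.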