Description of problem:
Files and processes owned by users with uids above 64k are reported as owned by user nfsnobody. Having traced this further, the correct owner seems to be stored in the kernel and on the file system (both local file system and NFS). The problem seems to be incorrect values returned by lstat() (and probably the other stat-like syscalls as well). I've written a small shell script demonstrating the problem.

Version-Release number of selected component (if applicable):
This is the kernel version on a test machine I used to isolate the problem. We have this problem on all our ia64 systems with RedHat Advanced Server.
Linux draco.uio.no 2.4.18-e.31smp #1 SMP Wed May 21 17:47:59 EDT 2003 ia64 unknown

How reproducible:
Happens every time.

Steps to Reproduce:
I'll attach a small script demonstrating the problem in /tmp/. We first discovered it on users' home directories accessed over NFS, but I wanted to provide a small self-contained test script.
Created attachment 95385
Shell script demonstrating the problem. Run the script as root on a system to check whether the problem exists there.
The reason for this problem seems to be that the ia64 kernel is compiled with the config flag CONFIG_UID16 set to 'y', which makes the kernel truncate UIDs. I found more info on the problem at <URL: http://www.x86-64.org/lists/discuss/msg04027.html>. That message is about x86_64, but the information seems to apply to ia64 as well. This issue is causing problems for users on some of the machines run by the High Performance Computing Group at the University of Oslo. To keep our support from RedHat and HP we need to use an official kernel, so compiling our own kernel is not an option.
ok, i agree with the first approach mentioned on the x86_64 list. That is, we should not set CONFIG_UID16=y, and instead fix up the ia32 emulation code. I'll post a patch shortly.
ok, so i disabled CONFIG_UID16, and applied the following patch. The test case provided now works much better. nice test program! These changes are being considered for U3 inclusion.

--- linux/arch/ia64/ia32/sys_ia32.c.orig 2003-10-22 17:28:59.000000000 -0400
+++ linux/arch/ia64/ia32/sys_ia32.c 2003-10-22 17:38:27.000000000 -0400
@@ -76,6 +76,25 @@
 #define PAGE_OFF(addr) ((addr) & ~PAGE_MASK)
 #define MINSIGSTKSZ_IA32 2048
 
+/* backwards compatability for 16-bit uids */
+
+#undef high2lowuid
+#undef high2lowgid
+#undef low2highuid
+#undef low2highgid
+
+#define high2lowuid(uid) ((uid) > 65535) ? (u16)overflowuid : (u16)(uid)
+#define high2lowgid(gid) ((gid) > 65535) ? (u16)overflowgid : (u16)(gid)
+#define low2highuid(uid) ((uid) == (u16)-1) ? (uid_t)-1 : (uid_t)(uid)
+#define low2highgid(gid) ((gid) == (u16)-1) ? (gid_t)-1 : (gid_t)(gid)
+
+extern int overflowuid,overflowgid;
+
+typedef u16 old_uid_t;
+typedef u16 old_gid_t;
+
+#include "../../../kernel/uid16.c"
+
 extern asmlinkage long sys_execve (char *, char **, char **, struct pt_regs *);
 extern asmlinkage long sys_mprotect (unsigned long, size_t, unsigned long);
 extern asmlinkage long sys_munmap (unsigned long, size_t);
@@ -86,6 +105,8 @@ extern unsigned long arch_get_unmapped_a
 asmlinkage long sys32_mprotect (unsigned int, unsigned int, int);
 asmlinkage unsigned long sys_brk(unsigned long);
 
+
+
 /*
  * Anything that modifies or inspects ia32 user virtual memory must hold this semaphore
  * while doing so.
@@ -185,8 +206,8 @@ putstat (struct stat32 *ubuf, struct sta
 	err |= __put_user(kbuf->st_ino, &ubuf->st_ino);
 	err |= __put_user(kbuf->st_mode, &ubuf->st_mode);
 	err |= __put_user(kbuf->st_nlink, &ubuf->st_nlink);
-	err |= __put_user(kbuf->st_uid, &ubuf->st_uid);
-	err |= __put_user(kbuf->st_gid, &ubuf->st_gid);
+	err |= __put_user(high2lowuid(kbuf->st_uid), &ubuf->st_uid);
+	err |= __put_user(high2lowgid(kbuf->st_gid), &ubuf->st_gid);
 	err |= __put_user(kbuf->st_rdev, &ubuf->st_rdev);
 	err |= __put_user(kbuf->st_size, &ubuf->st_size);
 	err |= __put_user(kbuf->st_atime, &ubuf->st_atime);
Can we get a better description of the severity of this issue, given that it is fixed in the current 3.0 release?
An upgrade to 3.0 is an acceptable solution.
This fix is in the RHEL 2.1 beta that is currently in testing and scheduled to ship shortly. RHEL 3 should already be immune to this issue.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2003-368.html