From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4

Description of problem:
The following simple reproducer causes a panic on ia64 when run as any user. Because of this I am considering it a security-sensitive problem. This is another situation where bad arguments to set_mempolicy cause a system panic. I have verified this on ia64 running the latest kernel; however, it is likely not a recent regression. I have not had a chance to dig into the code to narrow down the issue, but it is easily reproducible (on ia64 at least; I would be interested in seeing whether it can be hit elsewhere).

VM: killing process a.out
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
a.out[7796]: Oops 8847632629764 [1]
Modules linked in: nfs lockd nfs_acl md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc ds yenta_socket pcmcia_core scsi_dump diskdump zlib_deflate vfat fat dm_multipath button ohci_hcd ehci_hcd e1000 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx lpfc scsi_transport_fc mptscsih mptsas mptspi mptfc mptscsi mptbase sd_mod scsi_mod

Pid: 7796, CPU 0, comm: a.out
psr : 0000101008126010 ifs : 800000000000cc18 ip  : [<a00000010024e650>]    Not tainted
ip is at __copy_user+0xb0/0x940
unat: 0000000000000000 pfs : 0000000000000a99 rsc : 0000000000000003
rnat: 0000000000000001 bsps: 0000000000000000 pr  : 00000001aa6a0b19
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001002fcfe0 b6  : a0000001002f7940 b7  : a0000001002fcc00
f6  : 0ffff8000000000000000 f7  : 000000000000000000000
f8  : 000000000000000000000 f9  : 000000000000000000000
f10 : 000000000000000000000 f11 : 000000000000000000000
r1  : a0000001009adda0 r2  : 0000000000000000 r3  : 00000000000c0221
r8  : 0000000000000000 r9  : ffffffffffffffff r10 : 0000000000000000
r11 : 00000001aa6a0a59 r12 : e00000002c0afd70 r13 : e00000002c0a8000
r14 : e00000002c0afde0 r15 : 0000000000000000 r16 : 0000000000000050
r17 : e00000002c0afde0 r18 : e00000002c0afde1 r19 : 0000000000000000
r20 : e00000002c0afde0 r21 : e00000002c0afdb8 r22 : a0000001006638d0
r23 : a0000001007ae9a8 r24 : e00000002c0afdd0 r25 : e00000002c0afdc8
r26 : 0000000000000000 r27 : 0000001008126010 r28 : 0000000000000000
r29 : 0000000000000000 r30 : 0000000000000008 r31 : 0000000000000a99

Call Trace:
 [<a000000100016b20>] show_stack+0x80/0xa0 sp=e00000002c0af900 bsp=e00000002c0a91e0
 [<a000000100017430>] show_regs+0x890/0x8c0 sp=e00000002c0afad0 bsp=e00000002c0a9198
 [<a00000010003dbb0>] die+0x150/0x240 sp=e00000002c0afaf0 bsp=e00000002c0a9158
 [<a000000100061e80>] ia64_do_page_fault+0x8c0/0xbc0 sp=e00000002c0afaf0 bsp=e00000002c0a90f0
 [<a00000010000f540>] ia64_leave_kernel+0x0/0x260 sp=e00000002c0afba0 bsp=e00000002c0a90f0
 [<a00000010024e650>] __copy_user+0xb0/0x940 sp=e00000002c0afd70 bsp=e00000002c0a9030
 [<a0000001002fcfe0>] write_chan+0x3e0/0xc20 sp=e00000002c0afd70 bsp=e00000002c0a8f80
 [<a0000001002ed940>] tty_write+0x440/0x640 sp=e00000002c0afe20 bsp=e00000002c0a8f00
 [<a0000001001202d0>] vfs_write+0x290/0x360 sp=e00000002c0afe20 bsp=e00000002c0a8eb0
 [<a0000001001204f0>] sys_write+0x70/0xe0 sp=e00000002c0afe20 bsp=e00000002c0a8e38
 [<a00000010000f3e0>] ia64_ret_from_syscall+0x0/0x20 sp=e00000002c0afe30 bsp=e00000002c0a8e38
 [<a000000000010640>] 0xa000000000010640 sp=e00000002c0b0000 bsp=e00000002c0a8e38

Version-Release number of selected component (if applicable):
kernel-2.6.9-30.EL

How reproducible:
Always

Steps to Reproduce:
1. compile the reproducer with cc foo.c -lnuma
2. ./a.out
3. watch smoke fly

Actual Results:
panic

Expected Results:
no panic!

Additional info:
One important point I neglected to mention. I tried this on the latest upstream kernel (2.6.16-rc1) and did not see the panic.
Did some investigation of the code:

During the page fault a new page is being allocated, but the allocation fails and returns NULL way down in alloc_pages_current when it calls zonelist_policy, because current->policy->v.zonelist->zones[0] == NULL.

This is set up when we call set_mempolicy, because we are passing it a NULL mask (as we have seen with other set_mempolicy bugs, other invalid masks are likely to break as well). sys_set_mempolicy calls mpol_new, which calls bind_zonelist. Since bind_zonelist doesn't find any bits set in the "nodes" bitmask, it sets zl->zones[num] = NULL (where num is 0, since we made no iterations of the for loop).

There is a check in mpol_new for the MPOL_BIND case that policy->v.zonelist is not NULL after calling bind_zonelist. Perhaps we should return EINVAL if policy->v.zonelist->zones[0] is NULL here as well.
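To make the analysis above concrete, here is a minimal userspace sketch in C. It is not the kernel code and not the patch attached to this bug. The names bind_zonelist_model, mpol_new_bind_model and MAX_NODES, and the simplified struct zone and struct zonelist, are hypothetical stand-ins, and the nodemask is modeled as a single unsigned long. It only shows how an empty mask yields a zonelist whose first entry is already the NULL terminator, and where a -EINVAL check of the kind suggested above could sit.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES 8   /* hypothetical node count for this model */

/* Simplified stand-ins for the kernel structures (assumption). */
struct zone { int node; };
struct zonelist { struct zone *zones[MAX_NODES + 1]; }; /* NULL-terminated */

struct zone node_zone[MAX_NODES];

/*
 * Model of the bind_zonelist() behaviour described above: append one
 * zone per bit set in the mask, then write the NULL terminator.
 * With an empty mask the loop body never runs, num stays 0, and
 * zones[0] itself becomes NULL.
 */
struct zonelist *bind_zonelist_model(unsigned long nodes)
{
    struct zonelist *zl = malloc(sizeof(*zl));
    int num = 0;

    if (zl == NULL)
        return NULL;
    for (int nd = 0; nd < MAX_NODES; nd++) {
        if (nodes & (1UL << nd)) {
            node_zone[nd].node = nd;
            zl->zones[num++] = &node_zone[nd];
        }
    }
    zl->zones[num] = NULL;
    return zl;
}

/*
 * Model of the MPOL_BIND branch of mpol_new(), with the extra check
 * suggested above: an empty zonelist is rejected with -EINVAL instead
 * of being installed as the policy.
 */
int mpol_new_bind_model(unsigned long nodes, struct zonelist **out)
{
    struct zonelist *zl = bind_zonelist_model(nodes);

    if (zl == NULL)
        return -ENOMEM;        /* the check that already exists */
    if (zl->zones[0] == NULL) {
        free(zl);
        return -EINVAL;        /* the additional check proposed here */
    }
    *out = zl;
    return 0;
}
```

In this model, mpol_new_bind_model(0, &zl) returns -EINVAL and leaves zl untouched, while a mask with at least one bit set returns 0 and a zonelist with a non-NULL first entry. Whether the real fix belongs in mpol_new or earlier in sys_set_mempolicy is a separate question that this sketch does not answer.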
Neither of the two reproducers above causes any harm on x86_64.
I am still able to reproduce this in RHEL4 U4 kernel-2.6.9-42.EL
I can still see the problem with the most recent RHEL 4 kernel. The two test cases work with the upstream and RHEL 5.1 kernels.
lwoodman,

This is an old bug and I believe I can help fix the problem. Please forgive me, but I will assign it to myself if you are not going to look into it soon.

Thanks,
Luming
Created attachment 161331 [details]
patch for fixing invalid argument

Please test the attached patch. I tested it, and it fixes the problem.
Moving to parent bug, creating tracking bugs for 4.5.z and 4.6. This issue doesn't affect other RHEL releases.
Luming, please see the follow-up comments to your patch on rhkernel-list. Your patch has been NACK'ed, and it needs more work.
This issue has been addressed in following products:

  Red Hat Linux Enterprise 4
  Red Hat Linux Enterprise 4.6.z

Via RHSA-2008:0055 https://rhn.redhat.com/errata/RHSA-2008-0055.html