Description of problem:

I found a problem in kswapd that causes a system stall.

int try_to_free_pages(unsigned int gfp_mask)
{
    int ret = 1;

    gfp_mask = pf_gfp_mask(gfp_mask);
    if (gfp_mask & __GFP_WAIT) {
        current->flags |= PF_MEMALLOC;
        ret = do_try_to_free_pages(gfp_mask);
        current->flags &= ~PF_MEMALLOC;
    }

    return ret;
}

If kswapd calls try_to_free_pages(), its PF_MEMALLOC flag is lost, because the function clears the flag unconditionally before returning.

Under heavy memory pressure, a getblk() call made by kswapd may fail and call free_more_memory(), which in turn calls try_to_free_pages(). As a result, kswapd loses the PF_MEMALLOC flag.

The following patch saves the flag on entry and restores it before returning:

diff -uNr linux.org/mm/vmscan.c linux/mm/vmscan.c
--- linux.org/mm/vmscan.c 2004-12-13 14:46:03.000000000 +0900
+++ linux/mm/vmscan.c 2004-12-13 14:52:22.000000000 +0900
@@ -1273,6 +1273,7 @@
 int try_to_free_pages(unsigned int gfp_mask)
 {
     int ret = 1;
+    unsigned long pf_flags = current->flags & PF_MEMALLOC;
 
     gfp_mask = pf_gfp_mask(gfp_mask);
     if (gfp_mask & __GFP_WAIT) {
@@ -1281,6 +1282,8 @@
         current->flags &= ~PF_MEMALLOC;
     }
 
+    current->flags |= pf_flags;
+
     return ret;
 }

Version-Release number of selected component (if applicable):
kernel-2.4.21-20.EL

How reproducible:
Difficult to reproduce. It requires heavy memory pressure.
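To make the flag loss easier to see, here is a minimal user-space sketch of the control flow. It is not kernel code: struct task_model, reclaim_stub(), GFP_WAIT_MODEL, flag_survives() and the numeric flag values are hypothetical stand-ins, and the pf_gfp_mask() step is left out. It models a task that already holds PF_MEMALLOC (as kswapd does) entering the original function and the patched one.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel definitions; values are illustrative only. */
#define PF_MEMALLOC    0x0800UL
#define GFP_WAIT_MODEL 0x10u

/* Models only the flags word of the current task. */
struct task_model {
    unsigned long flags;
};

/* Stub standing in for do_try_to_free_pages(): pretends pages were freed. */
static int reclaim_stub(unsigned int gfp_mask)
{
    (void)gfp_mask;
    return 1;
}

/* Mirrors the control flow of the original try_to_free_pages(). */
static int try_to_free_pages_orig(struct task_model *cur, unsigned int gfp_mask)
{
    int ret = 1;

    if (gfp_mask & GFP_WAIT_MODEL) {
        cur->flags |= PF_MEMALLOC;
        ret = reclaim_stub(gfp_mask);
        cur->flags &= ~PF_MEMALLOC;   /* cleared even if it was set on entry */
    }
    return ret;
}

/* Same flow with the save/restore from the proposed patch. */
static int try_to_free_pages_patched(struct task_model *cur, unsigned int gfp_mask)
{
    int ret = 1;
    unsigned long pf_flags = cur->flags & PF_MEMALLOC;   /* remember entry state */

    if (gfp_mask & GFP_WAIT_MODEL) {
        cur->flags |= PF_MEMALLOC;
        ret = reclaim_stub(gfp_mask);
        cur->flags &= ~PF_MEMALLOC;
    }
    cur->flags |= pf_flags;   /* restore the flag if the caller already had it */
    return ret;
}

/* Returns 1 if a task that starts with PF_MEMALLOC still has it after fn(). */
static int flag_survives(int (*fn)(struct task_model *, unsigned int))
{
    struct task_model kswapd = { PF_MEMALLOC };

    fn(&kswapd, GFP_WAIT_MODEL);
    return (kswapd.flags & PF_MEMALLOC) != 0;
}
```

Under this model, flag_survives(try_to_free_pages_orig) evaluates to 0 and flag_survives(try_to_free_pages_patched) evaluates to 1.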
Fixed in RHEL3-U4:

int try_to_free_pages(unsigned int gfp_mask)
{
    int ret = 0;

    gfp_mask = pf_gfp_mask(gfp_mask);
    if (gfp_mask & __GFP_WAIT) {
        if (!(current->flags & PF_MEMALLOC)) {
            current->flags |= PF_MEMALLOC;
            ret = do_try_to_free_pages(gfp_mask);
            current->flags &= ~PF_MEMALLOC;
        } else if (gfp_mask & (__GFP_IO | __GFP_FS))
            ret = do_try_to_free_pages(gfp_mask & ~(__GFP_IO | __GFP_FS));
    }

    return ret;
}
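The behaviour of the RHEL3-U4 version can be sketched the same way in user space. Again this is only a model under stated assumptions: struct task_model, reclaim_stub(), last_mask, the *_M flag names and their values are hypothetical, and pf_gfp_mask() is omitted. The point it illustrates is that a caller already holding PF_MEMALLOC keeps the flag, and its nested reclaim runs with the IO and FS bits masked out.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel definitions; values are illustrative only. */
#define PF_MEMALLOC 0x0800UL
#define GFP_WAIT_M  0x10u
#define GFP_IO_M    0x40u
#define GFP_FS_M    0x80u

/* Models only the flags word of the current task. */
struct task_model {
    unsigned long flags;
};

/* Records the mask the reclaim stub was last called with. */
static unsigned int last_mask;

/* Stub standing in for do_try_to_free_pages(): pretends pages were freed. */
static int reclaim_stub(unsigned int gfp_mask)
{
    last_mask = gfp_mask;
    return 1;
}

/* Mirrors the control flow of the RHEL3-U4 try_to_free_pages(). */
static int try_to_free_pages_u4(struct task_model *cur, unsigned int gfp_mask)
{
    int ret = 0;

    if (gfp_mask & GFP_WAIT_M) {
        if (!(cur->flags & PF_MEMALLOC)) {
            /* Ordinary caller: set the flag around reclaim, then clear it. */
            cur->flags |= PF_MEMALLOC;
            ret = reclaim_stub(gfp_mask);
            cur->flags &= ~PF_MEMALLOC;
        } else if (gfp_mask & (GFP_IO_M | GFP_FS_M)) {
            /* Caller already has the flag: leave it alone, drop IO/FS bits. */
            ret = reclaim_stub(gfp_mask & ~(GFP_IO_M | GFP_FS_M));
        }
    }
    return ret;
}
```

In this model, a task that enters with PF_MEMALLOC set and a mask of GFP_WAIT_M | GFP_IO_M | GFP_FS_M leaves with the flag still set, and the stub sees only GFP_WAIT_M. If such a task passes GFP_WAIT_M alone, no reclaim is attempted and the function returns 0.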
A fix for this problem was committed to the RHEL3 U4 patch pool on 10-Sep-2004 (in kernel version 2.4.21-20.5.EL).
An erratum has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html