Bug 119820

Summary: Mapping more than 4096 pages using map_user_kiobuf fails
Product: Red Hat Enterprise Linux 3 Reporter: Stuart Mills <smills>
Component: kernel Assignee: Dave Anderson <anderson>
Status: CLOSED NOTABUG QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: petrides, riel
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-04-07 20:29:54 UTC Type: ---

Description Stuart Mills 2004-04-02 13:01:25 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; 
T312461; .NET CLR 1.0.3705)

Description of problem:
Calls to map_user_kiobuf in a driver to map user memory to kernel 
memory seem to fail when the buffer length exceeds 4096 pages.

The return value from the call is -12, ENOMEM, although there is 
plenty of memory available.

This worked in previous versions of the kernel (2.4.20-8, for example)
and works in the Fedora kernel 2.4.22-1.  Replacing the Enterprise
Linux kernel RPM with the Fedora kernel RPM also solved the problem.

Version-Release number of selected component (if applicable):
2.4.21-9

How reproducible:
Always

Steps to Reproduce:
Write a driver which can:
1. be passed a user buffer, using an ioctl operation for example
2. allocate a kiobuf (alloc_kiovec)
3. map the user memory into the kiobuf using map_user_kiobuf
4. display the result of the map operation (e.g. print to log)
5. unmap the memory (unmap_kiobuf)
6. free the kiobuf (free_kiovec)

After loading the module, pass a buffer to it (using a small
test program, for example) and note the result of the operation.

Actual Results:  For me, the return value was 0 for buffers smaller than
4096 pages (approx. 16 MB), and -12 for anything larger than this.

Expected Results:  If memory is available, the return value should be 
0 for all map operations.

Additional info:

Comment 1 Arjan van de Ven 2004-04-02 13:08:33 UTC
Can you provide a pointer to the source of the driver?

Comment 2 Stuart Mills 2004-04-02 13:19:51 UTC
The problem was experienced in a driver I've written.

Below is the function causing the problem:

/* Allocate a kiobuf and map the user buffer into it.
 * Returns 0 on success or a negative errno on failure. */
int map_write_buffer(struct net_device *dev, struct kiobuf **iobuf,
                     void *buffer, size_t buffersize)
{
        int result = 0;

        result = alloc_kiovec(1, iobuf);
        if (result != 0)
        {
                ERRORMSG("Could not allocate memory for the mapping "
                         "of user memory to kernel memory, result = %d\n",
                         result);
                return result;
        }

        result = map_user_kiobuf(WRITE, *iobuf, (unsigned long)buffer,
                                 buffersize);
        if (result != 0)
        {
                ERRORMSG("Could not map user memory to kernel memory, "
                         "result = %d\n", result);
                /* Release the kiobuf so the caller never sees a stale pointer. */
                free_kiovec(1, iobuf);
                *iobuf = NULL;
        }

        return result;
}

Comment 4 Dave Anderson 2004-04-05 16:21:21 UTC

What appears to be happening is this:

map_user_kiobuf() is calling expand_kiobuf() with the requested page count:

       err = expand_kiobuf(iobuf, pgcount);
       if (err)
               return err;
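
For reference, here is a rough sketch of how a page count follows from a
user address and length. It assumes the count is obtained by rounding the
start down and the end up to page boundaries; pages_spanned() and
EXAMPLE_PAGE_SIZE are names made up for this illustration and are not
quoted from the RHEL3 source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed i686 page size (4 KiB); illustrative, not taken from kernel headers. */
#define EXAMPLE_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: number of pages touched by the byte range
 * [va, va + len). The end is rounded up and the start rounded down
 * to a page boundary.
 */
unsigned long pages_spanned(unsigned long va, size_t len)
{
    /* Guard against address wraparound in the sum below. */
    assert(len <= ULONG_MAX - va - (EXAMPLE_PAGE_SIZE - 1));

    return (va + len + EXAMPLE_PAGE_SIZE - 1) / EXAMPLE_PAGE_SIZE
           - va / EXAMPLE_PAGE_SIZE;
}
```

Under that assumption, a 16 MB buffer that starts on a page boundary spans
exactly 4096 pages, while the same length starting a few bytes into a page
spans 4097, so buffer alignment may decide which side of the limit a
request lands on.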

expand_kiobuf() is calling kmalloc() like so:

        blocks = kmalloc(wanted * SECTORS_PER_PAGE *
                         sizeof(unsigned long), GFP_KERNEL);
        if (unlikely(!blocks)) {
                kfree(maplist);
                return -ENOMEM;
        }

and returning -ENOMEM (-12).

Given a page count of 4096, SECTORS_PER_PAGE of 8 (4096-byte pages /
512-byte sectors) and a long size of 4 bytes, the request would be
4096 * 8 * 4 = 131072 bytes (128K).
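
The arithmetic can be checked with a few lines of user-space C. This is
only a sketch: blocks_alloc_bytes() is a made-up helper, and the long size
is passed in explicitly because sizeof(unsigned long) is 4 on i686 but 8
on most 64-bit build hosts.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: size in bytes of the "blocks" array requested for
 * a given page count, i.e. pages * SECTORS_PER_PAGE * sizeof(unsigned long).
 */
size_t blocks_alloc_bytes(size_t pages, size_t page_size,
                          size_t sector_size, size_t long_size)
{
    /* A page must hold a whole number of sectors. */
    assert(sector_size != 0 && page_size % sector_size == 0);

    size_t sectors_per_page = page_size / sector_size; /* 4096 / 512 = 8 */
    return pages * sectors_per_page * long_size;
}
```

With 4096-byte pages, 512-byte sectors and a 4-byte long, 4096 pages give
131072 bytes, and 4097 pages give 131104 bytes, which is just over 128K.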

If the system has 128K of *contiguous* memory available, then the
request will return OK.  If not, and defragmentation of kernel
memory cannot create a 128K contiguous chunk, the call will fail.

However, given that the largest RHEL3 slab cache size is the
"size-131072" slab, any attempt to map more than 4096 pages needs
a kmalloc() larger than 128K and is therefore guaranteed to fail.

It probably works OK in non-RHEL3 kernels because the kmalloc() of
the "blocks" is not done up-front.  I'm unfamiliar with this area of
code, but the change to make the blocks allocation done up-front in
expand_kiobuf() is in:

  linux-2.4.21-kiobuf-fixes.patch

Perhaps somebody familiar with this patch can explain the change?

In any case, there's no way a kmalloc() greater than 128K can be done.
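
One way a driver might stay under this limit is to split a large user
buffer into pieces that each span at most 4096 pages and map each piece
with its own kiobuf. Nobody in this report has proposed or tested that;
the sketch below only shows the splitting arithmetic in user-space C, and
every name and constant in it (EX_*, max_pages_per_kiobuf(), chunk_len(),
count_chunks()) is hypothetical, with the i686 values from the analysis
above hard-coded.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants taken from the analysis above (i686, RHEL3). */
#define EX_PAGE_SIZE        4096UL
#define EX_SECTORS_PER_PAGE 8UL
#define EX_LONG_SIZE        4UL      /* sizeof(unsigned long) on i686 */
#define EX_KMALLOC_MAX      131072UL /* largest slab, "size-131072" */

/* Largest page count whose "blocks" array still fits in one kmalloc(). */
unsigned long max_pages_per_kiobuf(void)
{
    return EX_KMALLOC_MAX / (EX_SECTORS_PER_PAGE * EX_LONG_SIZE);
}

/* Longest run of bytes starting at va that spans no more than that many pages. */
size_t chunk_len(unsigned long va, size_t remaining)
{
    assert(remaining > 0);

    unsigned long offset = va % EX_PAGE_SIZE; /* bytes into the first page */
    size_t limit = max_pages_per_kiobuf() * EX_PAGE_SIZE - offset;
    return remaining < limit ? remaining : limit;
}

/* Number of separate mappings needed to cover [va, va + len). */
unsigned long count_chunks(unsigned long va, size_t len)
{
    unsigned long n = 0;

    while (len > 0) {
        size_t step = chunk_len(va, len);
        va += step;
        len -= step;
        n++;
    }
    return n;
}
```

With these numbers, a page-aligned 16 MB buffer fits in one mapping, while
an unaligned 16 MB buffer or anything longer needs at least two.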