Al Viro discovered the following problem in cifs.ko while overhauling some of the VFS layer write code: > Take a look at cifs_iovec_writev(). We have decided that > the next chunk to send should cover nr_pages worth of data, allocated > the pages and > save_len = cur_len; > for (i = 0; i < nr_pages; i++) { > copied = min_t(const size_t, cur_len, PAGE_SIZE); > copied = iov_iter_copy_from_user(wdata->pages[i], &it, > 0, copied); > cur_len -= copied; > iov_iter_advance(&it, copied); > } > cur_len = save_len - cur_len; > tried to copy from iovec into said pages. What happens if iovec spans > an munmapped area? It'll copy as much as it can and from that point on > iov_iter_copy_from_user() will be returning 0. And iov_iter_advance(&it, 0) > will, of course, do nothing. > > So we'll end up with cur_len well less than nr_pages * PAGE_SIZE. Then we > start to populate a structure that will be passed to (async) write request: > wdata->sync_mode = WB_SYNC_ALL; > wdata->nr_pages = nr_pages; > wdata->offset = (__u64)offset; > wdata->cfile = cifsFileInfo_get(open_file); > wdata->pid = pid; > wdata->bytes = cur_len; > wdata->pagesz = PAGE_SIZE; > wdata->tailsz = cur_len - ((nr_pages - 1) * PAGE_SIZE); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > rc = cifs_uncached_retry_writev(wdata); > > and we have a negative number in wdata->tailsz. I don't have a setup > ready for testing, so this is strictly from RTFS, but AFAICS we'll send > nr_pages-1 full pages out (some - with valid data, some - with whatever > page_alloc() has left there) and then hit the last one. We kmap it and > set an iovec with base at the address we'd mapped it on and length > a bit under 4Gb. Then we call kernel_sendmsg() on that turd. Note that > verify_iovec() isn't called - iovec is already kernel-side anyway and > kernel_sendmsg() assumes it to be valid. > > Again, I hadn't tested it, but it looks like a user-triggerable oops at > the very least. All that is needed is writable file on cifs volume > mounted with cache=strict... He's quite correct (though it's even easier to reproduce if you mount with cache=none). I have a reproducer that can trivially crash the kernel when run as an unprivileged user and a patch that fixes the problem. I'll attach those in a bit. Since fixing this is holding up some important work in other areas, I'm going to propose that we disclose this in one week (February 14th).
Created attachment 860442 [details] patch -- cifs: ensure that uncached writes handle unmapped areas correctly This patch fixes the bug. It should apply fairly cleanly to most recent kernels (including RHEL6/7).
Created attachment 862435 [details] patch -- cifs: sanity check length of data to send before sending Second patch to help prevent issues like this in the future. If we get a bad send request from the upper layers, have the transport layer throw a warning and an error: When testing with the reproducer, I get this with this patch applied and not the other one: [ 2253.669549] WARNING: CPU: 1 PID: 8911 at fs/cifs/transport.c:312 smb_send_rqst+0x22f/0x290 [cifs]() [ 2253.670968] Send length mismatch(send_length=4294971460 smb_buf_length=4160) ...plus the expected WARN stack trace. I think we'll want to submit both patches for inclusion.
Patches have now been sent upstream: http://article.gmane.org/gmane.linux.kernel.cifs/9401 http://article.gmane.org/gmane.linux.kernel.cifs/9402 ...they should make their way into mainline soon, and the first one should go to stable soon afterward.
Fixed in git.
Building. In the 3.14-rc2-git4 snapshot kernel. Should hit rawhide tomorrow.