Description of problem:
When uploading a virtual machine image to an FTP server, curl fails to get the size of the block device correctly and reports the upload as failed:
curl: (18) Uploaded unaligned file size (2147483648 out of 0 bytes)
Version-Release number of selected component (if applicable):
curl 7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5
Protocols: tftp ftp telnet dict ldap http file https ftps
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz
Steps to Reproduce:
Create an LVM device and attempt to upload it to an FTP server
curl -T /dev/servers/vm-image-001 ftp://backup-server/incoming/vmsnapshot
Actual results:
curl: (18) Uploaded unaligned file size (2147483648 out of 0 bytes)
Expected results:
Upload completes without error.
Additional info:
Workaround is to redirect to stdin, although this removes the size consistency check completely.
curl -T - ftp://backup-server/incoming/vmsnapshot </dev/servers/vm-image-001
Could you please explain what you mean by 'curl fails to get the size of the block device correctly'?
What does the following command say?
$ stat /dev/servers/vm-image-001
Well the error "(2147483648 out of 0 bytes)" gives the game away. It's trying to get the size of a block special device.
"What does the following command say?
$ stat /dev/servers/vm-image-001"
As you would expect with LVM, it says it is a block special device - once you follow the symlink.
Size: 0 Blocks: 0 IO Block: 4096 block special file
Device: 11h/17d Inode: 8087 Links: 1 Device type: fd,1
Access: (0660/brw-rw----) Uid: ( 0/ root) Gid: ( 6/ disk)
Access: 2010-08-10 12:38:04.294987867 +0100
Modify: 2010-08-10 12:28:46.758987867 +0100
Change: 2010-08-10 12:28:46.758987867 +0100
I tried to upload the content of a block device -- it was correctly transferred to the server, and only then did curl return CURLE_PARTIAL_FILE. Note that the FTP protocol can't really transfer the block device as a device -- the result on the server side will be a regular file.
So you are just concerned about the error code then?
I suggest treating CURLE_PARTIAL_FILE as success in the case of a block device. That's basically equivalent to your workaround with stdin redirection. How could curl ever know the size in advance if the file system does not provide such information?
The idea is to transfer the contents of the block device to the FTP server as a backup. Clearly curl can't transfer the block device as a block device.
It shouldn't be doing a file size check on files that it can't reliably obtain the size of. So I'd say if stat(2) returns a file size of zero (which seems to be the "I don't know or I'm not telling you" value for that syscall - as well as a real empty file indicator!) skip the check.
You can't restrict to 'special files' as lots of entries in, say, /proc and /sys won't tell you their file size and often they are just marked as standard files. (/proc/partitions for example).
You can do stuff with stat to get the filesystem block size information for block specials, but I'm not sure it is standard enough to be a reliable check - or frankly worth the effort.
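For reference, one common way to learn a size without trusting st_size is to open the file and seek to its end. The following is a minimal Python sketch, not curl's code: the helper name `size_by_seeking` is made up for illustration, and the expectation that the seek reports the device size for a block special file is an assumption about Linux that was not tested in this report.

```python
import os


def size_by_seeking(path):
    """Return the end-of-file offset of `path`, found by seeking.

    Hypothetical helper for illustration only. For regular files this
    matches st_size. For block devices it is assumed (Linux) to give
    the device size. Non-seekable files such as pipes raise OSError.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking 0 bytes from the end returns the resulting offset.
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)
```

This needs read access to the device and does nothing for files that cannot be seeked, so it is at best a partial answer to the reliability concern raised above.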
(In reply to comment #4)
> It shouldn't be doing a file size check on files that it can't reliably obtain
> the size of. So I'd say if stat(2) returns a file size of zero (which seems to
> be the "I don't know or I'm not telling you " value for that syscall - as well
> as a real empty file indicator!) skip the check.
That's wrong. For a regular file of size zero, it should be still checked if indeed zero bytes were transferred.
> You can't restrict to 'special files' as lots of entries in, say, /proc and
> /sys won't tell you their file size and often they are just marked as standard
> files. (/proc/partitions for example).
What we can do, however, is check the S_IFREG (and perhaps S_IFLNK?) bit given by the stat call. If it's not a regular file, we can skip setting CURLOPT_INFILESIZE_LARGE. Note that the stat call occurs at the curl tool level, so the change won't affect the behavior of the curl library in any way.
Daniel, would such a change be welcome from an upstream perspective?
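The proposed rule can be sketched as follows. This is an illustration in Python, not the actual curl tool code; the function names `upload_size_hint` and `transfer_ok` are hypothetical, and returning None stands in for "do not set CURLOPT_INFILESIZE_LARGE". Since stat follows symlinks, S_IFLNK never shows up in its result (only lstat would report it), so the sketch tests S_IFREG alone.

```python
import os
import stat


def upload_size_hint(path):
    """Return the size to announce for an upload, or None if unknown.

    Hypothetical helper: only regular files get a size, everything
    else (block/char devices, sockets, FIFOs, ...) gets None.
    """
    st = os.stat(path)  # follows symlinks, like stat(2)
    if stat.S_ISREG(st.st_mode):
        return st.st_size  # may legitimately be 0 for an empty file
    return None


def transfer_ok(expected, sent):
    """Size consistency check, skipped when no size was announced."""
    return expected is None or expected == sent
```

Under this rule the block device from the report would yield None, so `transfer_ok(None, 2147483648)` passes, whereas `transfer_ok(0, 2147483648)` corresponds to the situation behind the reported error. An empty regular file is still checked: `transfer_ok(0, 0)` passes only if nothing was sent.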
(In reply to comment #5)
> That's wrong. For a regular file of size zero, it should be still checked if
> indeed zero bytes were transferred.
Is it? The problem is that you can't tell whether it is a regular file of zero size or a magic regular file in the /proc area (for example) - as /proc/partitions demonstrates. The same goes for all dynamically generated facsimile filesystems - via fuse or whatever.
> What we can however do, is to check the S_IFREG (and perhaps S_IFLNK?) bit
> given by the stat call. If it's not a regular file, we can bypass the setting
> of CURLOPT_INFILESIZE_LARGE. Note that the stat call occurs at the curl tool
> level, so that it won't anyhow affect the behavior of curl library.
The acid test is that an upload of magic regular files like /proc/partitions and an upload of a special file operate without an error.
The libcurl library docs mention that some protocols won't work without knowledge of the file size. Doesn't say which ones though.
The curl tool itself is not really a backup tool. I suggest using an archiver (e.g. tar) and then curl to transmit the result to the server, or doing both together using a pipe.
> The libcurl library docs mention that some protocols won't work without
> knowledge of the file size. Doesn't say which ones though.
For example SCP, according to libcurl API documentation:
There is basically the same limitation in scp(1). That's given by the protocol.
Clearly such a file can't be SCP'ed properly, and I'm not sure how we should make curl treat a 0-byte-that-isn't-empty file differently from a 0-byte-that-is-empty file...
Daniel, the original report was about *FTP* upload.
What I was proposing was to make curl(1) ignore the file size for char/block devices, sockets and the like. It should help to upload their content without error (e.g. over FTP).
As for the SCP protocol, the change would cause curl(1) to fail properly on non-regular files, as scp(1) does. The current behavior of curl(1) is to silently create an empty file on the server.
I realized this was about FTP, but curl has very very little code that cares about the specific protocol (and quite frankly it doesn't even know it) so I would assume that code in curl would much rather be made generic than trying to identify only FTP URLs. The problem will happen for other protocols as well if you set a bad size for an upload.
So back to the suggested fix, surely for example the file "/proc/interrupts" on a typical Linux box is a normal file, it is 0 bytes when listed with ls -l and yet it shows some 1500 bytes if I cat it... The check for "char/block devices, sockets etc" will not magically fix this.
I would rather suggest that a more generic fix (work-around?) would be to not set the INFILESIZE at all if the file on disk is reported to be exactly 0 bytes.
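The /proc/interrupts observation can be reproduced with a small Python sketch that compares the size reported by stat with the number of bytes actually read. The helper name `reported_and_actual_size` is hypothetical, and the sketch reads the whole file into memory, so it is only suitable for small files.

```python
import os
import stat


def reported_and_actual_size(path):
    """Return (is_regular, st_size, bytes_read) for `path`.

    Hypothetical helper for illustration only. On an ordinary disk
    file st_size and bytes_read agree. For generated files such as
    those under /proc they may not.
    """
    st = os.stat(path)
    with open(path, "rb") as f:
        bytes_read = len(f.read())
    return stat.S_ISREG(st.st_mode), st.st_size, bytes_read
```

Calling it on "/proc/interrupts" on a Linux box should show the mismatch described above: a regular file with a reported size of 0 and a non-zero byte count. That is the case a file-type check alone does not cover.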
(In reply to comment #10)
> I would rather suggest that a more generic fix (work-around?) would be to not
> set the INFILESIZE at all if the file on disk is reported to be exactly 0
> bytes.
It would break SCP upload of empty files. Daniel and I agreed on the less intrusive variant -- distinguish between regular files and special files. It should be sufficient to solve the problem originally reported in comment #0.
patch proposed upstream:
pushed upstream: http://github.com/bagder/curl/commit/5907777
As the patch introduces a change in behavior, it's not acceptable for Enterprise Linux. It has an easy workaround after all. I am closing the bug as UPSTREAM. Thanks for your effort to improve curl.
fix included also in Fedora: