Bug 1303456

Summary: dd byte count report does not correlate with df byte count report
Product: [Fedora] Fedora
Reporter: Bob Gustafson <bobgus>
Component: coreutils
Assignee: Ondrej Vasik <ovasik>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 23
CC: admiller, bobgus, kdudka, kzak, ovasik, p, ricky.tigg, twaugh
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-01 16:50:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Bob Gustafson 2016-01-31 21:45:52 UTC
Description of problem:

I am trying to copy /boot, /root, /home from a single disk (running the OS) to a bigger RAID 1 disk pair in the same chassis.

fedora is the LVM volume group of the running system; fedora23 is the LVM volume group on the new disks (RAID 1, 2 TB disks).

[root@hoho8-chidig-com user1]# lvscan
  ACTIVE            '/dev/fedora/swap' [15.57 GiB] inherit
  ACTIVE            '/dev/fedora/home' [865.45 GiB] inherit
  ACTIVE            '/dev/fedora/root' [50.00 GiB] inherit
  ACTIVE            '/dev/fedora23/swap' [15.60 GiB] inherit
  ACTIVE            '/dev/fedora23/root' [50.00 GiB] inherit
  ACTIVE            '/dev/fedora23/home' [1.75 TiB] inherit


Below is the command I'm using to copy over one partition to the raid pair:

[root@hoho8-chidig-com user1]# dd if=/dev/fedora/root of=/dev/fedora23/root
104857600+0 records in
104857600+0 records out
53687091200 bytes (54 GB) copied, 5639.3 s, 9.5 MB/s
[root@hoho8-chidig-com user1]#


[root@hoho8-chidig-com user1]# df -h
Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                  16G     0   16G   0% /dev
tmpfs                     16G  392K   16G   1% /dev/shm
tmpfs                     16G  1.8M   16G   1% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/mapper/fedora-root   50G   27G   20G  58% /
tmpfs                     16G   84K   16G   1% /tmp
/dev/loop0               1.9G  3.0M  1.7G   1% /srv/node/swift_loopback
/dev/mapper/fedora-home  852G   57G  752G   7% /home
/dev/sdc1                477M  215M  233M  48% /boot
tmpfs                    3.2G  8.0K  3.2G   1% /run/user/42
tmpfs                    3.2G   24K  3.2G   1% /run/user/1000

Note that dd says 54GB were copied over into a disk partition of size only 50GB. No errors during the copy. And the results show that only 27G was in the fedora partition to start with. (54GB is double 27GB - this may be a clue)
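
The 54 GB versus 50G comparison above can be checked with a little arithmetic. The following is a minimal sketch, assuming dd ran with its default 512-byte block size (no bs= option appears on the command line) and that df -h and lvscan print binary units (GiB, 2^30 bytes) while dd prints decimal GB (10^9 bytes). The variable names are illustrative only.

```shell
#!/bin/sh
# Record count taken from the dd output above.
# Assumption: bs=512, dd's default, since no bs= option was given.
records=104857600
bs=512

bytes=$((records * bs))
gib=$((bytes / 1073741824))        # binary gibibytes, 2^30 bytes each
gb_tenths=$((bytes / 100000000))   # decimal gigabytes in tenths, truncated

echo "bytes copied:              $bytes"
echo "GiB (df -h, lvscan units): $gib"
echo "GB  (dd units, truncated): $((gb_tenths / 10)).$((gb_tenths % 10))"
```

Under those assumptions the byte count works out to exactly 50 GiB, which would mean the copy filled the 50G target without overflowing it; the two tools appear to describe the same quantity in different units.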

Another example:

[root@hoho8-chidig-com user1]# dd if=/dev/fedora/home of=/dev/fedora23/home
1814978560+0 records in
1814978560+0 records out
929269022720 bytes (929 GB) copied, 32437.8 s, 28.6 MB/s
[root@hoho8-chidig-com user1]# 

Taking a look at the data size on the target partitions:

[root@hoho8-chidig-com user1]# mount /dev/fedora23/root /mnt/rootn
[root@hoho8-chidig-com user1]# mount /dev/fedora23/home /mnt/homen

[root@hoho8-chidig-com user1]# du -h /mnt/rootn | tail -1
26G	    /mnt/rootn

[root@hoho8-chidig-com user1]# du -h /mnt/homen | tail -1
57G	    /mnt/homen

For the rootn partition, the 26G copied size is consistent with the source size.

For the homen partition, the 57G is consistent with the source size.

However, the byte counts reported by the dd copies don't match:

 In the case of the root->rootn copy it is 54G instead of 27G
 In the case of the home->homen copy it is 929G instead of 57G
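
The same arithmetic can be applied to the home copy. This sketch again assumes dd's default 512-byte block size, and compares the result with the 865.45 GiB that lvscan lists for /dev/fedora/home rather than with the 57G of file data. The variable names are illustrative only.

```shell
#!/bin/sh
# Record count taken from the dd output for /dev/fedora/home.
# Assumption: bs=512, dd's default, since no bs= option was given.
records=1814978560
bs=512

bytes=$((records * bs))

# Convert to GiB with two decimals; LC_ALL=C keeps "." as the decimal point.
home_gib=$(LC_ALL=C awk -v b="$bytes" 'BEGIN { printf "%.2f", b / 1073741824 }')

echo "bytes copied: $bytes"
echo "size in GiB:  $home_gib"
```

If the assumption holds, the byte count corresponds to the size of the whole source logical volume, used and unused space alike, which is what a device-level copy would be expected to read.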


Version-Release number of selected component (if applicable):

dd is (coreutils) 8.24

How reproducible:

I did two tests - as shown

Steps to Reproduce:
1. create raid 1 pair
2. copy a 3rd single disk partition to a raid1 pair lvm partition
3. do df and compare results.

Actual results:

As shown above in problem narrative.

Expected results:

I would think that the sizes reported by dd and df would be comparable, but they are not.

Additional info:

Comment 1 Ondrej Vasik 2016-01-31 22:08:36 UTC
It is Raid 1, so dd is IMHO correct with the statement of the double copy byte size. dd just reports reads/writes byte counts provided by kernel...
I'm not sure what causes the home->homen copy difference, strace output might be helpful. 
If you want to have this seen by wider audience, please send this info to coreutils, asking for the root cause of this behaviour - as I think the same behaviour will be in upstream coreutils. I most probably won't get to reproducing this anytime soon.

Comment 2 Kamil Dudka 2016-01-31 23:11:57 UTC
df shows sizes of mounted file systems, which may have different (lower) sizes than the block devices they reside on.  For the actual size of the block devices, which are logical volumes in your case, you should use lvdisplay instead.
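
One way to read the block device sizes directly, as a sketch only: the helper name size_bytes is hypothetical, blockdev comes from util-linux and usually needs root, and the stat branch assumes GNU stat so that the helper also works on regular files such as disk images.

```shell
#!/bin/sh
# size_bytes PATH: print the size in bytes of a block device or a regular file.
# (Hypothetical helper, not part of coreutils or LVM.)
size_bytes() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"   # block device: ask the kernel for its size
    else
        stat -c %s "$1"             # regular file: size from the inode (GNU stat)
    fi
}

# Example use with the device names from this report (run as root):
#   size_bytes /dev/fedora/home
#   size_bytes /dev/fedora23/home
# LVM can report the same figures in bytes:
#   lvs --units b -o lv_name,vg_name,lv_size
```

These figures can then be compared with the byte count that dd prints, which df and du cannot provide because they describe the file system rather than the device.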

Comment 3 Bob Gustafson 2016-02-01 04:54:15 UTC
OK, I sent it out to bug-coreutils

Comment 4 Bob Gustafson 2016-02-01 05:04:53 UTC
(In reply to Kamil Dudka from comment #2)
> For the actual size of the
> block devices, which are logical volumes in your case, you should use
> lvdisplay instead.

lvdisplay does give more or less the same information as found in the Size column of the df display, but the actual amount of data in the allocated space is given only loosely by the Current LE number.

Comment 5 Bob Gustafson 2016-02-01 05:08:34 UTC
(In reply to Bob Gustafson from comment #4)
> But the actual amount of data in the allocated
> space is given only loosely by the Current LE number.

It looks like all of the space in Size has been allocated and is reflected in the number of 4k extents. No correlation to actual data written.

Comment 6 Kamil Dudka 2016-02-01 08:00:14 UTC
This bug report is totally confusing for me.

The sentence in summary is something that is not supposed to hold -- sizes of mounted file systems will almost never match sizes of their underlying block devices if you copy just contents of the block device.

Please show us:

- exactly one dd operation for which you think the reported size is wrong (the exact command line and its output)

- the exact sizes of the source and destination block devices for that operation (note that df/du is unusable for this)

Comment 7 Bob Gustafson 2016-02-01 13:43:13 UTC
(In reply to Kamil Dudka from comment #6)
> This bug report is totally confusing for me.
> 
> Please show us:
> 
> - exactly one dd operation for which you think the reported size is wrong
> (the exact command line and its output)
> 
> - the exact sizes of the source and destination block devices for that
> operation (note that df/du is unusable for this)

Give me a command that you would like me to test and I will do it. The system is lightly used and it should be quite close to the state tested originally.

Comment 8 Kamil Dudka 2016-02-01 14:21:13 UTC
(In reply to Bob Gustafson from comment #7)
> Give me a command that you would like me to test and I will do it. The
> system is lightly used and it should be quite close to the state tested
> originally.

You first need to tell us what kind of problem you are trying to report.  If the only problem is that "dd byte count report does not correlate with df byte count report", then we can safely close this as NOTABUG.

There are two invocations of dd in comment #0 -- which of them do you think reported wrong byte count?

How are you confirming that the byte count reported by dd is wrong?

Comment 9 Bob Gustafson 2016-02-01 15:54:57 UTC
(In reply to Kamil Dudka from comment #8)
> (In reply to Bob Gustafson from comment #7)

> There are two invocations of dd in comment #0 -- which of them do you think
> reported wrong byte count?
> 
> How are you confirming that the byte count reported by dd is wrong?

I am confused by the reporting of 'bytes copied' in this invocation of dd:

[root@hoho8-chidig-com user1]# dd if=/dev/fedora/home of=/dev/fedora23/home
1814978560+0 records in
1814978560+0 records out
929269022720 bytes (929 GB) copied, 32437.8 s, 28.6 MB/s

When I mount the target partition (/dev/fedora23/home) and look to see the number of bytes as reported by df I don't see 929 GB. I see only 57 GB. This is a big difference.

Can you explain the discrepancy?

[root@hoho8-chidig-com user1]# mount /dev/fedora23/home /mnt/homen

[root@hoho8-chidig-com user1]# du -h /mnt/homen | tail -1
57G	    /mnt/homen

All these numbers were reported in the original presentation.

Comment 10 Bob Gustafson 2016-02-01 16:38:46 UTC
I just now repeated the 'root' dd copy.

[root@hoho8-chidig-com user1]# date
Mon Feb  1 10:05:58 CST 2016

[root@hoho8-chidig-com user1]# dd if=/dev/fedora/root of=/dev/fedora23/root
104857600+0 records in
104857600+0 records out
53687091200 bytes (54 GB) copied, 1392.33 s, 38.6 MB/s
[root@hoho8-chidig-com user1]#

[root@hoho8-chidig-com user1]# date
Mon Feb  1 10:29:34 CST 2016

Compared with the original copy - the bytes copied are almost exactly the same. The speed is a lot higher though.

I did not delete everything in the target partition before starting this copy, but that shouldn't matter with a dd copy, yes?

Comment 11 Kamil Dudka 2016-02-01 16:50:28 UTC
dd transfers data from the input file to the output file (both of them being block devices in your case).  The data is transferred at a low level, where file systems do not exist.  dd does not (and cannot) resize the file system for you.

If you transfer file system data by dd between block devices of different size, you need to update the file system _metadata_ to reflect the actual size of the underlying block device.  This can be done by file system-specific utilities (e.g. resize2fs for ext2/ext3/ext4).
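
A sketch of that metadata update for the home copy, under two loud assumptions: the copied file system is ext2/ext3/ext4 (the report does not say), and the target is unmounted. The run wrapper and the DRY_RUN variable are hypothetical conveniences; by default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Assumption: the file system on the target is ext2/ext3/ext4 and is unmounted.
# DRY_RUN defaults to 1, so nothing is modified unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}
target=${1:-/dev/fedora23/home}

# run CMD...: print the command, and execute it only when not in dry-run mode.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

run e2fsck -f "$target"   # forced check, expected before an offline resize
run resize2fs "$target"   # no size argument: resize to fill the device
```

Running it with DRY_RUN=0 as root would perform the check and the resize; other file systems need their own tools.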

Note that using file system-specific dump/restore utilities to transfer the data is usually a better choice.

Comment 12 Bob Gustafson 2016-02-01 18:30:18 UTC
I'm disappointed that you were not able to explain why dd reports 929 GB of data transferred - and only 57 GB shows up in the target partition. Bytes are bytes, independent of block sizes.

Ondrej Vasik did a good job (at least to me) of explaining why dd reports 54 G and the target disk ('root') showed only 27 G (see Comment #1). But this explanation does not easily extend to the dd copy of 'home'.

The reason I was using 'dd' was that I had previously used 'cpio' and cpio was failing because it 'ran out of space'. If cpio is using dd to do the actual transfers and gets a report of greater than 50 GB when copying the /root segment (where dd reports it transferred 54 GB), then you have a case where the inaccurate reporting of the bytes copied is a problem.

I think you are a bit premature in your proclamation of NOTABUG.

Comment 13 Kamil Dudka 2016-02-01 19:56:23 UTC
(In reply to Bob Gustafson from comment #12)
> I'm disappointed that you were not able to explain why dd reports 929 GB of
> data transferred - and only 57 GB shows up in the target partition.

I have put a lot of effort to explain that.  I am sorry it was not successful.

> Bytes are bytes, independent of block sizes.

Size of a block device is one thing and size of a file system is another thing.  You do not seem to understand why the sizes may differ from each other and how to deal with that.  It is explained in comment #11.

Comment 14 Bob Gustafson 2016-02-01 22:01:46 UTC
(In reply to Kamil Dudka from comment #13)
> (In reply to Bob Gustafson from comment #12)

> > Bytes are bytes, independent of block sizes.
> 
> Size of a block device is one thing and size of a file system is another
> thing.  You do not seem to understand why the sizes may differ from each
> other and how to deal with that.  It is explained in comment #11.

In Comment #11, you are talking about block sizes and resizing disks. I am not doing that at all. I'm doing a simple copy from one disk partition to *another partition* on a *different disk*.

dd is a dirt simple data copy program.

**As one of its outputs, it reports 'bytes copied'.**

I am trying to understand why dd is reporting such a large value for bytes copied - see my Comment #9

Comment 15 Kamil Dudka 2016-02-01 22:22:45 UTC
(In reply to Bob Gustafson from comment #14)
> In Comment #11, you are talking about block sizes

Nope.  You are confusing the term "block size" with "size of block device".

> and resizing disks.

Nope.  It was about resizing a file system.

> I am not doing that at all.

And that is exactly the cause of your troubles as I understand it ;-)

Comment 16 Bob Gustafson 2016-02-01 22:50:17 UTC
You have been talking about blocks and filesystems and I have been consistently talking about bytes - particularly the byte count reported by dd.

Why not just pass this bug report on to someone else.

And take off the CLOSED and NOTABUG status flags.

Comment 17 Kamil Dudka 2016-02-03 08:34:15 UTC
For reference, there is a response to the identical bug report on the upstream mailing list:

http://lists.gnu.org/archive/html/bug-coreutils/2016-02/msg00004.html

Comment 18 Bob Gustafson 2016-02-03 14:47:47 UTC
(In reply to Kamil Dudka from comment #17)
> upstream mailing list:
> 
> http://lists.gnu.org/archive/html/bug-coreutils/2016-02/msg00004.html

Yes, I did receive an email from Bernhard Voelker and I replied yesterday morning.

I agreed that there can be small differences between the different reporting utilities, but:

these differences are not off by a factor of 929/57=16+ in the case of copying the 'home' partition.

My reply should appear on that upstream thread sooner or later.

Comment 19 Kamil Dudka 2018-06-28 13:57:49 UTC
*** Bug 1596183 has been marked as a duplicate of this bug. ***