Bug 1582165 - Enhancements – Accuracy, Documentation, Conventions, Basic units of measure
Summary: Enhancements – Accuracy, Documentation, Conventions, Basic units of measure
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: coreutils
Version: 28
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-05-24 12:14 UTC by ricky.tigg
Modified: 2018-05-28 13:21 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-05-25 08:44:20 UTC
Type: Bug
Embargoed:



Description ricky.tigg 2018-05-24 12:14:27 UTC
Version-Release number of component: coreutils.x86_64 8.29-6.fc28 @updates

1. Accuracy

Actual results: In a terminal, the sizes printed by 'df -h' are shown with at most one decimal digit of accuracy.

Enhancement: An accuracy of two decimal digits (model 0.00 M) would be appropriate.

2. Documentation

Actual results: The df tool displays storage sizes using the notation (k, M, G, ...), while its output suggests that, by default, sizes actually follow the international standard ISO 80000 (also known as IEC 80000), which sets the convention for binary prefixes in this field (e.g. 1024 = 1 Ki = '1 kibi'). But that is only a deduction.

'man df' lacks information about that policy, which needs to be stated in the man page.

Enhancement: The above-mentioned policy should be duly documented.

3. Conventions

Actual results: The convention used for the '%' sign and the metric prefixes (k, M, G, ...) attached to storage values (model 0%) is the one that applies in American English, British English, and other countries attached to the British crown. However, other countries' conventions for representing values and units put a space between the value and the unit (model: 0 %).

Furthermore, the two formats are not equally readable; model '0 %' is superior to '0%'.

Enhancement: As expected, model '0 %' is the one that serves the purpose better.

4. Basic units of measure are missing (by policy, not as result of an issue)

Actual results: Basic units of measure are missing since only metric prefixes are displayed.

Enhancement: Since the unit prefixes shown here can, in this context, only indicate multiples of the unit of digital information, the characters 'b' or 'B' (and their equivalents in other supported system languages) must be considered part of the designation.

Additional info: A metric prefix is a unit prefix that precedes a basic unit of measure to indicate, in this case, a multiple of the unit.

Comment 1 Kamil Dudka 2018-05-24 12:29:46 UTC
Thanks for the suggestion!  Unfortunately, I do not think we can change the output format without breaking compatibility with existing scripts that use the current output format.

If you need more precise results, or results in a different format, I would suggest running df _without_ the -h option and post-processing its output with an awk script.

I would also suggest taking further enhancement proposals to the upstream mailing list, which has a wider audience and is thus a more appropriate place to discuss them:

https://www.gnu.org/software/coreutils/coreutils.html#help

Comment 2 ricky.tigg 2018-05-24 20:04:58 UTC
Is the expression 'df | awk '<pattern> {<action>}' /output-file' the right interpretation, then?

Comment 3 Kamil Dudka 2018-05-25 08:44:20 UTC
You can start with something like this:

$ df | awk '{ if (NR > 1) printf("%-30s %8.2f GiB %8.2f GiB %8.2f GiB %7.2f %%   %s\n", $1, $2/(2**20), $3/(2**20), $4/(2**20), 100.0*$3/$2, $6) }'

... which prints the following on my system:

udev                               0.01 GiB     0.00 GiB     0.01 GiB    0.04 %   /dev
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   / 
tmpfs                              1.55 GiB     0.00 GiB     1.55 GiB    0.07 %   /run
shm                                7.77 GiB     0.04 GiB     7.74 GiB    0.47 %   /dev/shm
cgroup_root                        0.01 GiB     0.00 GiB     0.01 GiB    0.00 %   /sys/fs/cgroup
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /home/kdudka
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /tmp
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /usr/portage
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /usr/src
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /var/cache
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /var/lib/docker
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /var/lib/libvirt
/dev/mapper/kdudka--nb-root      458.93 GiB   341.97 GiB   115.16 GiB   74.51 %   /var/tmp
none                               7.77 GiB     0.00 GiB     7.77 GiB    0.00 %   /run/user/1024
/dev/mapper/data                 916.77 GiB   390.83 GiB   479.35 GiB   42.63 %   /home/share
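
The command above divides the default 1K-block counts reported by df by 2^20 to obtain GiB. A minimal sketch of that arithmetic on fixed input follows, so the result does not depend on the local machine. The sample rows, the function names sample_df and df_gib, and the guard against a zero-sized filesystem are assumptions added for illustration; they are not part of the original command.

```shell
# Hypothetical sample in the default 'df' layout (sizes in 1K-blocks).
sample_df() {
cat <<'EOF'
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/sda1       10485760   2621440   7864320  25% /
tmpfs                  0         0         0    - /empty
EOF
}

# Skip the header, convert 1K-blocks to GiB (1 GiB = 2^20 KiB) and
# recompute the usage percentage with two decimals.
df_gib() {
    LC_ALL=C awk 'NR > 1 {
        # Assumption: report 0 % instead of dividing by a zero total size.
        pct = ($2 > 0) ? 100.0 * $3 / $2 : 0
        printf("%-12s %8.2f GiB %8.2f GiB %8.2f GiB %7.2f %%   %s\n",
               $1, $2 / 2^20, $3 / 2^20, $4 / 2^20, pct, $6)
    }'
}

sample_df | df_gib
```

On a real system the same function could be fed with 'df | df_gib'; the %-12s width is only sized for the short sample device names.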

Comment 4 ricky.tigg 2018-05-25 10:28:47 UTC
The least I can do is to start investigating how to use the awk tool, beginning with 'man gawk' for the GNU version of the AWK text-processing utility, and to figure out the meaning of the command above.

Comment 5 Kamil Dudka 2018-05-28 08:45:22 UTC
On the upstream mailing list, Pádraig Brady suggested using numfmt to format the numbers:

$ df -B1 | numfmt --header --field=2-4 --to=iec --format %.2f

https://lists.gnu.org/archive/html/coreutils/2018-05/msg00063.html
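
As a rough sketch of what these numfmt options do to a single value, the lines below format one byte count on its own. The helper name human_iec is hypothetical. The options --to=iec-i and --suffix=B are existing numfmt options that were not used in the thread; they append the 'i' of the binary prefix and a unit letter, which may be closer to the notation requested in the Description. LC_ALL=C is set on the assumption that a '.' decimal separator is wanted regardless of the user's locale.

```shell
# Plain IEC scaling with two decimals, as in the command above.
LC_ALL=C numfmt --to=iec --format=%.2f 1073741824

# Hypothetical helper: binary prefix written as 'Gi' plus an explicit 'B' unit.
human_iec() {
    LC_ALL=C numfmt --to=iec-i --suffix=B --format=%.2f "$@"
}

human_iec 1073741824
```

The same options should also work in the pipeline form, for example 'df -B1 | numfmt --header --field=2-4 --to=iec-i --suffix=B --format %.2f', though that variant was not tried in this report.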

Comment 6 ricky.tigg 2018-05-28 10:48:28 UTC
I figured out all of the command meanings except expressions '"%-30s', '2**20'.

The command suggested in Comment 5 illustrates the currently common misuse of the convention mentioned in the Description, which the command given in Comment 3 managed to avoid. The percentage also still lacks adequate accuracy.

$ df -B1 | numfmt --header --field=2-4 --to=iec --format %.2f
Filesystem                 1B-blocks        Used    Available Use% Mounted on
devtmpfs                       1.89G        0.00        1.89G   0% /dev
...

Comment 7 Kamil Dudka 2018-05-28 12:09:04 UTC
(In reply to ricky.tigg from comment #6)
> I figured out all of the command meanings except expressions '"%-30s',
> '2**20'.

'%-30s' is a printf conversion for a left-aligned string padded to a width of 30 characters.

'2**20' is an awk expression for 2^20, which is equal to 1048576.
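
Both expressions can be tried in isolation. The short sketch below is an illustration only: the square brackets are added to make the padding visible, and '^' is used instead of '**' on the assumption that portability matters, since '^' is the exponentiation operator specified by POSIX awk while '**' is an extension accepted by gawk and some other implementations.

```shell
# Left-align 'udev' in a 30-character field; brackets mark the field edges.
printf '[%-30s]\n' 'udev'

# 2 to the power of 20, i.e. the number of KiB in one GiB.
awk 'BEGIN { print 2^20 }'
```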

> The command suggested in Comment 5 illustrates the currently common misuse
> of the convention mentioned in the Description, which the command given in
> Comment 3 managed to avoid. The percentage also still lacks adequate
> accuracy.

You can even combine both approaches:

$ df -B1 | awk '{ if (NR > 1) printf("%-30s %12d %12d %12d %7.2f%% %s\n", $1, $2, $3, $4, $3?100.0*$3/$2:0, $6) }' | numfmt --field=2-4 --to=iec --format %.2f

Comment 8 ricky.tigg 2018-05-28 13:21:10 UTC
Thank you. 'df -B1' is now piped twice. With the current parameters, 'numfmt --field=2-4 --to=iec --format %.2f' degrades the well-formatted output of the command in Comment 3: '1,81G', '0,00', and once again '0.00%'.

$ df -B1 | awk '{ if (NR > 1) printf("%-30s %12d %12d %12d %7.2f%% %s\n", $1, $2, $3, $4, $3?100.0*$3/$2:0, $6) }' | numfmt --field=2-4 --to=iec --format %.2f
devtmpfs          1,81G         0,00        1,81G    0.00%       /dev

