Bug 1719983

Summary: Update mountstats and nfsiostat tools for new count of ops completing with errors in RPC iostats version 1.1
Product: Red Hat Enterprise Linux 8 Reporter: Dave Wysochanski <dwysocha>
Component: nfs-utils Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA QA Contact: Yongcheng Yang <yoyang>
Severity: unspecified Docs Contact: Alexandra Nikandrova <anikandr>
Priority: medium    
Version: 8.0 CC: ajmitchell, anikandr, bcodding, jiyin, steved, swhiteho, xzhou, yoyang
Target Milestone: rc Keywords: FutureFeature
Target Release: 8.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: nfs-utils-2.3.3-27.el8 Doc Type: Enhancement
Doc Text:
.New `per-op` error counter is now available in the output of `mountstats` and `nfsiostat`
A minor supportability feature is available for NFS client systems: the output of the `mountstats` and `nfsiostat` commands in `nfs-utils` now includes a `per-op` error count. This enhancement allows these tools to display `per-op` error counts and percentages, which can help narrow down problems on specific NFS mount points on an NFS client machine. Note that these new statistics depend on kernel changes that are included in the Red{nbsp}Hat Enterprise{nbsp}Linux 8.2 kernel.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-04-28 16:51:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1636572    
Bug Blocks: 1755139    

Description Dave Wysochanski 2019-06-12 21:13:59 UTC
Description of problem:
A couple of nfs-utils tools need to be updated for the new /proc/self/mountstats count that records, on a per-op basis, the number of ops completing with errors.  The kernel portion is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1636572
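
As a rough illustration of what the tools have to read, here is a minimal Python sketch that extracts the per-op counters from mountstats text. It is not the code from the patches. The field order in `FIELDS` is an assumption based on the RPC iostats 1.1 layout (eight existing counters followed by the new error count), and `SAMPLE`, `FIELDS`, and `parse_per_op` are hypothetical names; the sample values were chosen to match the WRITE and CLOSE averages shown under "Expected results".

```python
# Assumed field order for a statvers=1.1 per-op line; the ninth value is
# the new count of ops that completed with an error.
FIELDS = ("ops", "transmissions", "major_timeouts", "bytes_sent",
          "bytes_received", "queue_ms", "rtt_ms", "execute_ms", "errors")

# Hypothetical sample text, not captured from a real system.
SAMPLE = """\
device server:/exports mounted on /mnt/nfsv4.1 with fstype nfs4 statvers=1.1
\tper-op statistics
\t       WRITE: 9 9 0 4720752 1503 35 15 52 1
\t       CLOSE: 5 5 0 1120 815 0 3 3 0
"""


def parse_per_op(text):
    """Return {op_name: {field: int}} for the per-op section of one mount."""
    stats = {}
    in_per_op = False
    for line in text.splitlines():
        words = line.split()
        if not words:
            # A blank line ends the current mount's section.
            in_per_op = False
            continue
        if words[:2] == ["per-op", "statistics"]:
            in_per_op = True
            continue
        if in_per_op and words[0].endswith(":"):
            values = [int(w) for w in words[1:]]
            # With only 8 values (older format) zip() stops early, so the
            # "errors" key is simply absent.
            stats[words[0].rstrip(":")] = dict(zip(FIELDS, values))
    return stats


if __name__ == "__main__":
    for op, counters in parse_per_op(SAMPLE).items():
        print(op, counters.get("errors", "n/a"))
```

On a live client the same function could be fed the contents of /proc/self/mountstats, one mount section at a time.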

Version-Release number of selected component (if applicable):
nfs-utils-2.3.3-14.el8.x86_64

How reproducible:
Easy

Steps to Reproduce:
1. Mount a share on an NFS client
2. Do some operation that triggers one or more ops completing with errors
3. Run `mountstats mountstats`, `nfsiostat`, and `mountstats iostat`


Expected results:
You should see 'errors' counts in the output for the opcodes that have ops completing with errors

./tools/mountstats/mountstats.py iostat | less

...

rhel7u6-node2:/exports mounted on /mnt/nfsv4.1:

           ops/s       rpc bklog
           0.167           0.000

read:              ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)          errors
                   0.000           0.058         512.316        0 (0.0%)          17.500          17.500           0.000        0 (0.0%)
write:             ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)          errors
                   0.001           0.262         512.398        0 (0.0%)           1.667           5.778           3.889       1 (11.1%)


 ./tools/mountstats/mountstats.py mountstats | less

...
SEQUENCE:
        278 ops (9%)    7 errors (2%)
        avg bytes sent per op: 84       avg bytes received per op: 48
        backlog wait: 344.438849        RTT: 313.273381         total execute time: 23973.737410 (milliseconds)
GETATTR:
        24 ops (0%)     2669 retrans (11120%)   0 major timeouts        1 errors (4%)
        avg bytes sent per op: 21550    avg bytes received per op: 235
        backlog wait: 401.791667        RTT: 5459.833333        total execute time: 5861.750000 (milliseconds)
WRITE:
        9 ops (0%)      1 errors (11%)
        avg bytes sent per op: 524528   avg bytes received per op: 167
        backlog wait: 3.888889  RTT: 1.666667   total execute time: 5.777778 (milliseconds)
EXCHANGE_ID:
        8 ops (0%)      4 errors (50%)
        avg bytes sent per op: 134      avg bytes received per op: 70
        backlog wait: 0.000000  RTT: 0.375000   total execute time: 938.750000 (milliseconds)
OPEN_NOATTR:
        6 ops (0%)      1 errors (16%)
        avg bytes sent per op: 272      avg bytes received per op: 283
        backlog wait: 45370.833333      RTT: 1.333333   total execute time: 45373.500000 (milliseconds)
CLOSE:
        5 ops (0%) 
        avg bytes sent per op: 224      avg bytes received per op: 163
        backlog wait: 0.000000  RTT: 0.600000   total execute time: 0.600000 (milliseconds)

./tools/nfs-iostat/nfs-iostat.py 

rhel7u6-node2:/exports mounted on /mnt/nfsv4.1:

           ops/s       rpc bklog
           0.165           0.000

read:              ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)          errors
                   0.000           0.058         512.316        0 (0.0%)          17.500          17.500           0.000        0 (0.0%)
write:             ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)          errors
                   0.001           0.260         512.398        0 (0.0%)           1.667           5.778           3.889       1 (11.1%)
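
The percentages in the sample output above are consistent with the error count divided by the op count for the same operation. The whole-number values in the `mountstats` listing appear to be truncated rather than rounded (7 errors out of 278 ops is shown as 2%), while the iostat views keep one decimal place. The Python sketch below reproduces the numbers under that assumption; the function names are hypothetical and this is not the code from the patches.

```python
def mountstats_error_pct(errors, ops):
    """Whole-number percentage, truncated, as the per-op listing seems to show."""
    return errors * 100 // ops if ops else 0


def iostat_error_pct(errors, ops):
    """Percentage with one decimal place, as in the iostat 'errors' column."""
    return round(errors * 100.0 / ops, 1) if ops else 0.0


if __name__ == "__main__":
    # (operation, ops, errors) taken from the sample output in this report.
    samples = [("SEQUENCE", 278, 7), ("GETATTR", 24, 1), ("WRITE", 9, 1),
               ("EXCHANGE_ID", 8, 4), ("OPEN_NOATTR", 6, 1)]
    for name, ops, errors in samples:
        print(f"{name}: {errors} errors ({mountstats_error_pct(errors, ops)}%)")
    print(f"write: 1 ({iostat_error_pct(1, 9)}%)")
```

Both helpers return zero when an operation has no ops, which avoids a division by zero on idle mounts.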


Additional info:

See kernel bug https://bugzilla.redhat.com/show_bug.cgi?id=1636572#c27

Comment 1 Dave Wysochanski 2019-06-12 21:22:05 UTC
One patch is already upstream but I have a few more in progress.

In progress:
f1cf10c5 mountstats: Check for RPC iostats version >= 1.1 with error counts
f1e53a18 mountstats: Add per-op error counts to iostat command when RPC iostats version >= 1.1
8e9851de nfsiostat: Add error counts to output when RPC iostats version >= 1.1

Already upstream
73491ef2 mountstats: add per-op error counts for mountstats command
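
For readers unfamiliar with the version gate named in these patch titles, here is a minimal Python sketch of such a check. It assumes the `statvers=` token appears on the `device ...` line of /proc/self/mountstats; the helper names are hypothetical, and the comparison strategy (integer tuples rather than floats) is a choice made for this sketch, not necessarily what the patches do.

```python
import re


def rpc_statvers(device_line):
    """Return the stats version as a (major, minor) tuple, or None if absent."""
    match = re.search(r"\bstatvers=(\d+)\.(\d+)", device_line)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def has_error_counts(device_line):
    """True when the line reports a stats version of at least 1.1."""
    version = rpc_statvers(device_line)
    return version is not None and version >= (1, 1)


if __name__ == "__main__":
    line = ("device server:/exports mounted on /mnt/nfsv4.1 "
            "with fstype nfs4 statvers=1.1")
    print(has_error_counts(line))
```

Comparing integer tuples keeps a hypothetical future version such as 1.10 ordered after 1.9, which a float comparison would get wrong.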

Comment 5 Dave Wysochanski 2019-06-26 17:08:01 UTC
New patches sent to linux-nfs:

[PATCH 1/2] nfsiostat: Add error counts to output when RPC iostats version >= 1.1
[PATCH 2/2] mountstats: Add per-op error counts to iostat command when RPC iostats version >= 1.1
[PATCH] mountstats: Fix nfsstat command to handle RPC iostats version >= 1.1

Comment 6 Dave Wysochanski 2019-07-25 15:32:15 UTC
SteveD - I don't see the above patches from comment #5 merged yet in nfs-utils or any reply on the list.  Can you tell me what needs to be done next to move it along?

Comment 8 Steve Dickson 2019-08-01 16:15:10 UTC
commit c917e4ba433594b2a80fc90bfcb449e6a32043a9 (HEAD -> master, origin/master, origin/HEAD)
Author: Dave Wysochanski <dwysocha>
Date:   Thu Aug 1 12:10:29 2019 -0400

    mountstats: Fix nfsstat command to handle RPC iostats version >= 1.1

commit 1af88ff6b4ca272dff6f6b2ee4ba231405dc33d2
Author: Dave Wysochanski <dwysocha>
Date:   Thu Aug 1 12:08:53 2019 -0400

    mountstats: Add per-op error counts to iostat command when RPC iostats version >= 1.1

commit cfa65efa572c6dfc5d174d08a4454ede01acb5a0
Author: Dave Wysochanski <dwysocha>
Date:   Thu Aug 1 12:07:24 2019 -0400

    nfsiostat: Add error counts to output when RPC iostats version >= 1.1

Comment 12 Steve Dickson 2019-08-26 14:48:57 UTC
Here is the correct patch set, correct?

commit c917e4ba433594b2a80fc90bfcb449e6a32043a9
Author: Dave Wysochanski <dwysocha>
Date:   Thu Aug 1 12:10:29 2019 -0400

    mountstats: Fix nfsstat command to handle RPC iostats version >= 1.1

commit 1af88ff6b4ca272dff6f6b2ee4ba231405dc33d2
Author: Dave Wysochanski <dwysocha>
Date:   Thu Aug 1 12:08:53 2019 -0400

    mountstats: Add per-op error counts to iostat command when RPC iostats version >= 1.1

commit cfa65efa572c6dfc5d174d08a4454ede01acb5a0
Author: Dave Wysochanski <dwysocha>
Date:   Thu Aug 1 12:07:24 2019 -0400

    nfsiostat: Add error counts to output when RPC iostats version >= 1.1

commit 73491ef272f9131888ef9f45207abbc2055d6aae
Author: Dave Wysochanski <dwysocha>
Date:   Mon Jun 3 10:31:09 2019 -0400

    mountstats: add per-op error counts for mountstats command

Comment 13 Dave Wysochanski 2019-08-26 17:45:28 UTC
(In reply to Steve Dickson from comment #12)
> Here is the correct patch set, correct?
> 
> commit c917e4ba433594b2a80fc90bfcb449e6a32043a9
> Author: Dave Wysochanski <dwysocha>
> Date:   Thu Aug 1 12:10:29 2019 -0400
> 
>     mountstats: Fix nfsstat command to handle RPC iostats version >= 1.1
> 
> commit 1af88ff6b4ca272dff6f6b2ee4ba231405dc33d2
> Author: Dave Wysochanski <dwysocha>
> Date:   Thu Aug 1 12:08:53 2019 -0400
> 
>     mountstats: Add per-op error counts to iostat command when RPC iostats version >= 1.1
> 
> commit cfa65efa572c6dfc5d174d08a4454ede01acb5a0
> Author: Dave Wysochanski <dwysocha>
> Date:   Thu Aug 1 12:07:24 2019 -0400
> 
>     nfsiostat: Add error counts to output when RPC iostats version >= 1.1
> 
> commit 73491ef272f9131888ef9f45207abbc2055d6aae
> Author: Dave Wysochanski <dwysocha>
> Date:   Mon Jun 3 10:31:09 2019 -0400
> 
>     mountstats: add per-op error counts for mountstats command

Yes these 4 patches cover this bug.  Thanks!

Comment 17 Yongcheng Yang 2020-02-05 09:48:35 UTC
Have verified the "errors" field added in tools `mountstats` and `nfsiostat`!

Comment 32 Benjamin Coddington 2020-04-27 11:16:10 UTC
Let's correct the Doc Text to be as David suggested in comment 22, just use the RN format Alexandra needs.

Comment 36 errata-xmlrpc 2020-04-28 16:51:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1832