Bug 858793 - When started too early, rpc.statd emits "nsm_parse_reply: can't decode RPC reply" messages every few seconds
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: nfs-utils
Version: 6.3
Hardware: All
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Steve Dickson
QA Contact: Filesystem QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-09-19 17:16 UTC by Orion Poplawski
Modified: 2018-12-06 14:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-08-13 17:52:42 UTC
Target Upstream Version:
Embargoed:



Description Orion Poplawski 2012-09-19 17:16:09 UTC
Description of problem:

Apparently, if rpc.statd starts too early it can get into a bad state:

Sep 11 16:01:28 hawk rpc.statd[1303]: Version 1.2.3 starting
Sep 11 16:01:28 hawk sm-notify[1304]: Version 1.2.3 starting
Sep 11 16:01:28 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:28 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:29 hawk kernel: RPC: Registered named UNIX socket transport module.
Sep 11 16:01:29 hawk kernel: RPC: Registered udp transport module.
Sep 11 16:01:29 hawk kernel: RPC: Registered tcp transport module.
Sep 11 16:01:29 hawk kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
Sep 11 16:01:41 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:41 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:45 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:45 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:57 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:57 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:58 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
Sep 11 16:01:58 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
....

Restarting rpc.statd fixes things.  Judging by the kernel module messages in the log, rpc.statd appears to start before RPC is fully set up.  I'm not sure of the proper way to make it wait.  Perhaps the rpcbind init script should not return until rpcbind is fully initialized?
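
One way to picture the "wait until it is ready" idea is a small readiness probe. The following is only a sketch under stated assumptions, not something taken from the nfs-utils or rpcbind init scripts: the function name wait_for_tcp_port is hypothetical, and it assumes rpcbind is listening on its well-known TCP port 111. A successful TCP connect does not prove that the kernel RPC transport modules are registered, so this covers only part of the suspected race.

```python
import socket
import time


def wait_for_tcp_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper for illustration; not part of nfs-utils or rpcbind.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused)
            # while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A wrapper could call wait_for_tcp_port("127.0.0.1", 111) and start rpc.statd only when it returns True.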

This machine is an old dual processor PIII, so perhaps the combination allows the race to occur.

Version-Release number of selected component (if applicable):
nfs-utils-1.2.3-26.el6.i686
rpcbind-0.2.0-9.el6.i686

How reproducible:
Not very, it seems.  I have only seen it once, but unless you're checking logs you won't see it.
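
Since the problem is only visible in the logs, a small scan can show whether a host is affected. This is a minimal sketch: the function name count_decode_failures is hypothetical, and the pattern is built from the exact message text quoted above. The log location mentioned in the usage note is an assumption about the syslog configuration.

```python
import re
from collections import Counter

# Matches the syslog lines quoted in this report and captures the rpc.statd PID.
PATTERN = re.compile(
    r"rpc\.statd\[(\d+)\]: nsm_parse_reply: can't decode RPC reply"
)


def count_decode_failures(lines):
    """Count 'can't decode RPC reply' messages per rpc.statd PID."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

It accepts any iterable of lines, for example an open file handle on /var/log/messages if that is where syslog writes on the host in question.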

Comment 2 Steve Dickson 2012-10-08 20:31:47 UTC
(In reply to comment #0)
> Description of problem:
> 
> Apparently, if rpc.statd starts too early it can get into a bad state:
> 
> Sep 11 16:01:28 hawk rpc.statd[1303]: Version 1.2.3 starting
> Sep 11 16:01:28 hawk sm-notify[1304]: Version 1.2.3 starting
> Sep 11 16:01:28 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:28 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:29 hawk kernel: RPC: Registered named UNIX socket transport module.
> Sep 11 16:01:29 hawk kernel: RPC: Registered udp transport module.
> Sep 11 16:01:29 hawk kernel: RPC: Registered tcp transport module.
> Sep 11 16:01:29 hawk kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
> Sep 11 16:01:41 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:41 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:45 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:45 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:57 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:57 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:58 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
> Sep 11 16:01:58 hawk rpc.statd[1303]: nsm_parse_reply: can't decode RPC reply
I have to wonder if this is some type of network issue... This error
happens because xdr_replymsg() is failing. The only reason xdr_replymsg()
can fail there is some corruption in the RPC header...
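
To illustrate what a malformed reply header means in practice, here is a rough sketch of the header checks a decoder performs, following the ONC RPC message layout in RFC 5531. It is not the xdr_replymsg() code from the RPC library; parse_reply_header and its return format are hypothetical, and only the first few fields of the reply are examined.

```python
import struct

REPLY = 1          # msg_type value for a reply (CALL is 0)
MSG_ACCEPTED = 0   # reply_stat values
MSG_DENIED = 1
MAX_AUTH_BYTES = 400  # upper bound on the opaque_auth body


def parse_reply_header(data):
    """Decode the start of an RPC reply; raise ValueError if it is malformed."""
    if len(data) < 12:
        raise ValueError("truncated RPC header")
    # All XDR integers are 4-byte big-endian values.
    xid, mtype, reply_stat = struct.unpack_from(">III", data, 0)
    if mtype != REPLY:
        raise ValueError("not a reply message (msg_type=%d)" % mtype)
    if reply_stat not in (MSG_ACCEPTED, MSG_DENIED):
        raise ValueError("bad reply_stat %d" % reply_stat)
    result = {"xid": xid, "reply_stat": reply_stat}
    if reply_stat == MSG_ACCEPTED:
        if len(data) < 20:
            raise ValueError("truncated verifier")
        _flavor, verf_len = struct.unpack_from(">II", data, 12)
        if verf_len > MAX_AUTH_BYTES:
            raise ValueError("verifier too long")
        # Opaque data is padded to a multiple of four bytes.
        offset = 20 + ((verf_len + 3) & ~3)
        if len(data) < offset + 4:
            raise ValueError("truncated accept_stat")
        (result["accept_stat"],) = struct.unpack_from(">I", data, offset)
    return result
```

Any datagram that is too short, is not a reply, or carries out-of-range fields is rejected, which is the kind of failure that would surface as a "can't decode RPC reply" message.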

Comment 3 Orion Poplawski 2012-10-08 20:36:42 UTC
I think it's triggered by rpc.statd starting before portmap is running and/or before the kernel RPC modules are configured.

Comment 4 Steve Dickson 2012-10-12 13:15:55 UTC
I'm having a hard time reproducing this.... any suggestions?

Comment 5 Orion Poplawski 2012-10-12 17:19:11 UTC
Hmm, not really.  If it happens again I'll attach a debugger to try to get more info.

Comment 10 Steve Dickson 2013-08-13 17:52:42 UTC
Since we have not seen this for a while, I'm going to close it....

