| Summary: | call_bail logs should have remote host information for debugging purposes | ||
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | krishnan parthasarathi <kparthas> |
| Component: | glusterfs | Assignee: | krishnan parthasarathi <kparthas> |
| Status: | CLOSED ERRATA | QA Contact: | SATHEESARAN <sasundar> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 2.0 | CC: | kparthas, ndevos, nsathyan, psriniva, vagarwal, vbellur, vraman |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 2.1.2 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | glusterfs-3.4.0.49rhs | Doc Type: | Bug Fix |
| Doc Text: |
Previously, when a mount process waits for response from a brick process, for longer than 30 minutes (default timeout), the mount would fail the request from the application. This event would generate log messages with information about the request that timed out. With this update, the information in the log messages is enhanced to provide the brick processes' hostname to help with meaningful analysis of log messages.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-02-25 08:06:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
krishnan parthasarathi
2013-11-26 05:28:59 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/16279/ KP, Please review the doc text. I've edited it and made it a little less technical for the purpose of errata. Pavithra, The edits look good to me. Tested with glusterfs -3.4.0.55rhs-1.el6rhs Verified with the following steps : 1. Created a 2 Node cluster 2. Created a distribute replicate volume with 2X2 bricks 3. Started the volume 4. Set the frame-timeout to 10seconds (i.e) gluster volume set <vol-name> network.frame-timeout 10 5. Set the ping-timeout to max value (i.e) 1013 seconds (i.e) gluster volume set <vol-name> network.ping-timeout 1013 6. Fuse mount the volume 7. Block all the traffic to glusterd and bricks (i.e) iptables -I INPUT 1 -p tcp --dport 553:24007 -j DROP 8. execute "df- Th" Following snip is found in mount log : <snip> [2014-01-13 18:09:37.167035] E [rpc-clnt.c:208:call_bail] 0-drvol-client-0: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid = 0x13x sent = 2014-01-13 18:09:17.115990. timeout = 10 for 10.70.37.10:49153 [2014-01-13 18:09:37.167092] W [client-rpc-fops.c:807:client3_3_statfs_cbk] 0-drvol-client-0: remote operation failed: Transport endpoint is not connected [2014-01-13 18:09:37.167110] E [rpc-clnt.c:208:call_bail] 0-drvol-client-2: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid = 0x12x sent = 2014-01-13 18:09:17.116044. timeout = 10 for 10.70.37.10:49154 [2014-01-13 18:09:37.167120] W [client-rpc-fops.c:807:client3_3_statfs_cbk] 0-drvol-client-2: remote operation failed: Transport endpoint is not connected [2014-01-13 18:09:37.167131] E [rpc-clnt.c:208:call_bail] 0-drvol-client-1: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid = 0x13x sent = 2014-01-13 18:09:17.116020. timeout = 10 for 10.70.37.11:49153 [2014-01-13 18:09:37.167131] E [rpc-clnt.c:208:call_bail] 0-drvol-client-1: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid = 0x13x sent = 2014-01-13 18:09:17.116020. timeout = 10 for 10.70.37.11:49153 [2014-01-13 18:09:37.167151] W [client-rpc-fops.c:807:client3_3_statfs_cbk] 0-drvol-client-1: remote operation failed: Transport endpoint is not connected [2014-01-13 18:09:37.167170] E [rpc-clnt.c:208:call_bail] 0-drvol-client-3: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid = 0x12x sent = 2014-01-13 18:09:17.116061. timeout = 10 for 10.70.37.11:49154 </snip> From the above snip its clear that call bail logs have corresponding IP address of the machine Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-0208.html |