Bug 382621
Summary: | gfs umount deadlock cman:kcl_leave_service
---|---
Product: | [Retired] Red Hat Cluster Suite
Component: | gfs
Version: | 4
Hardware: | All
OS: | Linux
Status: | CLOSED NOTABUG
Severity: | low
Priority: | low
Reporter: | Corey Marthaler <cmarthal>
Assignee: | David Teigland <teigland>
QA Contact: | GFS Bugs <gfs-bugs>
CC: | ccaulfie
Doc Type: | Bug Fix
Last Closed: | 2007-11-14 17:38:24 UTC
Description
Corey Marthaler
2007-11-14 15:19:57 UTC
I wonder if this is somehow related to bz 290971?
Created attachment 258181
stack traces from grant-01
[root@grant-01 ~]# cman_tool nodes
Node  Votes  Exp  Sts  Name
   1      1    6    M  link-02
   2      1    6    M  grant-03
   3      1    6    M  grant-01
   4      1    6    M  grant-02
   5      1    6    M  link-07
   6      1    6    M  link-08

[root@grant-01 ~]# cman_tool services
Service          Name         GID  LID  State  Code
Fence Domain:    "default"      2    2  run    -           [1 3 5 6 4 2]
DLM Lock Space:  "clvmd"        3    3  run    -           [1 3 5 6 4 2]
DLM Lock Space:  "LINK_1286"  385  150  run    S-15,200,2  [3 2]
DLM Lock Space:  "LINK_1288"  377  152  run    -           [3 2]
DLM Lock Space:  "LINK_1283"  422  154  run    -           [3 2]
DLM Lock Space:  "LINK_1284"  369  156  run    -           [3]
DLM Lock Space:  "LINK_1287"  412  158  run    -           [3 4 2]
DLM Lock Space:  "LINK_1289"  361  160  run    -           [3 2]
DLM Lock Space:  "LINK_1282"  353  162  run    -           [3 4 2]
DLM Lock Space:  "LINK_1285"  432  164  run    -           [3 2]
GFS Mount Group: "LINK_1288"  381  153  run    -           [3 2]
GFS Mount Group: "LINK_1283"  427  155  run    -           [3 2]
GFS Mount Group: "LINK_1284"  373  157  run    -           [3]
GFS Mount Group: "LINK_1287"  417  159  run    -           [3 4 2]
GFS Mount Group: "LINK_1289"  365  161  run    -           [3 2]
GFS Mount Group: "LINK_1282"  357  163  run    -           [3 4 2]
GFS Mount Group: "LINK_1285"  437  165  run    -           [3 2]

There's a possibility it might be related to 373671 I suppose. The sooner we get that one acked & included the happier I'll be about some of these odd hangs.

grant-03 would be just as interesting to inspect, can we still get data from that node? In addition to the bug Patrick mentioned, there are a number of other bugs that we found and fixed while doing mount/unmount stress tests for nokia.
[root@grant-03 ~]# cman_tool services
Service          Name         GID  LID  State  Code
Fence Domain:    "default"      2    2  run    -           [1 2 3 4 5 6]
DLM Lock Space:  "clvmd"        3    3  run    -           [1 2 3 4 5 6]
DLM Lock Space:  "LINK_1289"  361   68  run    -           [2 3]
DLM Lock Space:  "LINK_1281"  402   70  run    -           [2]
DLM Lock Space:  "LINK_1282"  353   72  run    -           [2 3 4]
DLM Lock Space:  "LINK_1288"  377   74  run    -           [2 3]
DLM Lock Space:  "LINK_1283"  422   76  run    -           [2 3]
DLM Lock Space:  "LINK_1285"  432   78  run    -           [2 3]
DLM Lock Space:  "LINK_1280"  439   80  run    -           [2]
DLM Lock Space:  "LINK_1287"  412   82  run    -           [2 3 4]
DLM Lock Space:  "LINK_1286"  385   84  run    -           [2]
GFS Mount Group: "LINK_1289"  365   69  run    -           [2 3]
GFS Mount Group: "LINK_1281"  407   71  run    -           [2]
GFS Mount Group: "LINK_1282"  357   73  run    -           [2 3 4]
GFS Mount Group: "LINK_1288"  381   75  run    -           [2 3]
GFS Mount Group: "LINK_1283"  427   77  run    -           [2 3]
GFS Mount Group: "LINK_1285"  437   79  run    -           [2 3]
GFS Mount Group: "LINK_1280"  441   81  run    -           [2]
GFS Mount Group: "LINK_1287"  417   83  run    -           [2 3 4]
GFS Mount Group: "LINK_1286"  389   85  run    -           [2]

Nov 13 19:53:04 grant-03 NET: /sbin/dhclient-script : updated /etc/resolv.conf
Nov 13 19:53:04 grant-03 kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Nov 13 19:53:04 grant-03 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 4
Nov 13 19:53:04 grant-03 dhclient: receive_packet failed on eth0: Network is down
Nov 13 19:53:06 grant-03 kernel: CMAN: sendmsg failed: -22
Nov 13 19:53:06 grant-03 kernel: SM: send_nodeid_message error -22 to 3
Nov 13 19:53:07 grant-03 kernel: CMAN: resend failed: -22
Nov 13 19:53:07 grant-03 kernel: tg3: eth0: Link is up at 1000 Mbps, full duplex.
Nov 13 19:53:07 grant-03 kernel: tg3: eth0: Flow control is off for TX and off for RX.
Nov 13 19:53:07 grant-03 kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

Looks like the net was down during that umount attempt.
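For anyone wanting to repeat that check on a longer log, the following is a rough Python sketch of the correlation done by eye above: it flags CMAN/SM kernel errors that fall between a "link is not ready" message and the next "link becomes ready" message. The function names, the regular expression, and the default year (taken from the report date, since syslog lines carry none) are assumptions made for illustration and are not part of cman or any Red Hat tooling. It also assumes a single host and interface and English month abbreviations.

```python
import re
from datetime import datetime

# Excerpt of the grant-03 syslog lines quoted in this report.
LOG = """\
Nov 13 19:53:04 grant-03 kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Nov 13 19:53:04 grant-03 dhclient: receive_packet failed on eth0: Network is down
Nov 13 19:53:06 grant-03 kernel: CMAN: sendmsg failed: -22
Nov 13 19:53:06 grant-03 kernel: SM: send_nodeid_message error -22 to 3
Nov 13 19:53:07 grant-03 kernel: CMAN: resend failed: -22
Nov 13 19:53:07 grant-03 kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
"""

# Hypothetical pattern for classic syslog lines: timestamp, host, tag, message.
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) ([^:]+): (.*)$")


def parse(line, year=2007):
    """Split one syslog line into (timestamp, host, tag, message), or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    stamp, host, tag, msg = m.groups()
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, host, tag, msg


def errors_during_link_down(lines):
    """Return (seconds since link went down, message) for each CMAN/SM
    kernel message logged while the link is reported as not ready."""
    down_since = None
    hits = []
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        when, _host, _tag, msg = rec
        if "link is not ready" in msg:
            down_since = when
        elif "link becomes ready" in msg:
            down_since = None
        elif down_since is not None and msg.startswith(("CMAN:", "SM:")):
            hits.append(((when - down_since).total_seconds(), msg))
    return hits


if __name__ == "__main__":
    for offset, msg in errors_during_link_down(LOG.splitlines()):
        print(f"+{offset:.0f}s after link down: {msg}")
```

On the excerpt above this reports the three CMAN/SM errors, all inside the window where eth0 was down, which is consistent with the conclusion that the network was unavailable during the umount.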