Bug 1339246
Summary: | High IO/load causes VMs to enter Paused state
---|---
Product: | [Community] GlusterFS
Component: | sharding
Version: | 3.7.11
Status: | CLOSED NEXTRELEASE
Severity: | high
Priority: | unspecified
Hardware: | x86_64
OS: | Linux
Reporter: | Nathan Hill <Sustugriel>
Assignee: | Krutika Dhananjay <kdhananj>
QA Contact: | bugs <bugs>
CC: | bugs, Sustugriel
Keywords: | Triaged
Type: | Bug
Last Closed: | 2016-07-01 02:01:31 UTC

Description
Nathan Hill
2016-05-24 13:10:22 UTC
Can you please attach brick logs and client logs to the bug?

Created attachment 1163317
Brick Log for Server.

Created attachment 1163318
NFS client log from Server

I've attached the brick log from the main share point as well as the NFS client log. If you need more, such as the logs from other bricks, let me know. I'll have to rotate them to cut them down in size. Also, the last time this occurred was May 29th, 12:55p.

Hi,

Thanks for the bug report and apologies for the delay in looking into it. I went through your attachments and I only see logs about loss of quorum. Could you try and recreate this with FUSE and then attach fuse mount logs?

FWIW, two other community users, namely Lindsay Mathieson and Kevin Lemmonier, hit VM pauses due to bugs in the replicate module and due to races in the interaction between the sharding and replicate modules in 3.7.11. They've been fixed now and the fixes should be available in 3.7.12. And as per http://www.gluster.org/pipermail/gluster-devel/2016-May/049677.html , 3.7.12 will be out around 9th of June.

Let me know if that works for you. If not, I could share the patches/src tarball with the fixes applied on top of 3.7.11 and you could confirm that the patches fix the problems you're seeing.

-Krutika

Greetings,

Sorry for the delay getting back to you. At this point, if you have a strong hunch that this is resolved in 3.7.12, I might update to that when it comes out and try to recreate the issue on that version. I'm not exactly sure how to recreate it with FUSE as it happens in a virtual environment. Would simply copying the VM image to another network source suffice?

-Nathan

Yes, it would be good to try it with 3.7.12 and run the same test case and confirm whether the issue still appears with the fixes. You can follow updates on the gluster-users and gluster-devel MLs for the announcement of the 3.7.12 release.

-Krutika

Not sure what it was, but 3.7.12 resolved this. I've been running several days without any pausing or any changes.