Bug 1172905
Summary: | [HC] restarting vdsmd on a centos 7 host remounts gluster volumes, irrevocably pausing any running VMs | ||||||
---|---|---|---|---|---|---|---|
Product: | [oVirt] vdsm | Reporter: | Darrell <budic> | ||||
Component: | General | Assignee: | Nir Soffer <nsoffer> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | SATHEESARAN <sasundar> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | urgent | ||||||
Version: | --- | CC: | amureini, asmarre, bazulay, budic, bugs, gklein, kripper, lsurette, mgoldboi, nsoffer, rbalakri, riehecky, sbonazzo, stefano.stagnaro, tnisan, yeylon, ykaul, ylavi | ||||
Target Milestone: | ovirt-3.6.0-rc | Flags: | rule-engine: ovirt-3.6.0+, ylavi: planning_ack+, rule-engine: devel_ack+, rule-engine: testing_ack+ | ||||
Target Release: | 4.17.8 | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | v4.17.0.4 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2016-03-11 07:20:04 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | Gluster | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1175354 | ||||||
Attachments: |
|
Description
Darrell
2014-12-11 03:47:44 UTC
Please try to reproduce with vdsm-4.16.8, which has bug 1162640 fixed. Then, please share your {super,}vdsm.log after the vdsmd restart. glusterfs logs may come in useful, too.

Reproduction above was with vdsm-4.16.8-6.gitc240f5c.el7.x86_64. Will upload logs from this test for you.

Created attachment 967447
log files from vm pause
That's probably because QEMU doesn't reopen the file descriptors after they got invalidated. See my comments here: https://bugzilla.redhat.com/show_bug.cgi?id=1058300

Fixed in Gerrit:
https://gerrit.ovirt.org/#/c/40239/
https://gerrit.ovirt.org/#/c/40240/
Tested on CentOS 7

Merged and working fine in 3.6 Alpha. Can be closed.

Sorry, the patches seem not to be present in the 3.6 alpha branch, only in master.
Please include in alpha-2, since killing the storage is too dangerous.

(In reply to Christopher Pereira from comment #7)
> Sorry, the patches seem not to be present in the 3.6 alpha branch, only in
> master.
> Please include in alpha-2, since killing the storage is too dangerous.

Since the patches are merged, this bug should be in MODIFIED. It will be included in the next upstream official build, and should already be available in the nightly builds (for the last month or so).

I just tested alpha-2 and the patches are now included.

Moving to ON_QA so this can be formally verified

Tested on 3.6-rc1 on CentOS 7. Patches verified in production for some months. Can be easily verified by checking that the glusterd service is NOT running inside the VDSM group. The main issue was: https://bugzilla.redhat.com/show_bug.cgi?id=1201355#c7

Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Yaniv, what info do you need? See comment #12

(In reply to Yaniv Dary from comment #14)
The fix exists since 4.17.1, setting target release to 4.17.8 since no other version is available.

Tested with RHEV 3.6.3.3 and RHGS 3.1.2 RC by adding an RHGS node to a 3.5 compatible cluster.
1. Launched the VM with its disk image on a gluster storage domain
2. Restarted vdsm ( vdsm-4.17.20-0.1.el7ev.noarch )
App VM was not running uninterrupted
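
The verification hint above (checking that glusterd is not running inside the VDSM group) could be scripted roughly as follows. This is only a sketch under assumptions that the thread does not confirm: that "VDSM group" means the systemd control group of a unit named `vdsmd.service`, and that the gluster processes show up under the names `glusterd` and `glusterfs`. The helper names are made up for illustration. It reads `/proc/<pid>/cgroup`, whose lines have the form `hierarchy-ID:controller-list:cgroup-path`.

```python
import os


def cgroup_paths(cgroup_text):
    """Return the cgroup paths found in the text of /proc/<pid>/cgroup."""
    paths = []
    for line in cgroup_text.splitlines():
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        parts = line.strip().split(":", 2)
        if len(parts) == 3:
            paths.append(parts[2])
    return paths


def in_unit_cgroup(cgroup_text, unit="vdsmd.service"):
    """True if any cgroup path has a component equal to the unit name.

    The unit name "vdsmd.service" is an assumption, not taken from the bug.
    """
    return any(unit in path.split("/") for path in cgroup_paths(cgroup_text))


def pids_by_name(name):
    """Return PIDs whose /proc/<pid>/comm matches the given process name."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % entry) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            # The process exited or is not readable; skip it.
            continue
    return pids


def report(names=("glusterd", "glusterfs")):
    """Print, for each matching process, whether it sits in the vdsmd cgroup."""
    for name in names:  # process names are assumptions
        for pid in pids_by_name(name):
            try:
                with open("/proc/%d/cgroup" % pid) as f:
                    inside = in_unit_cgroup(f.read())
            except OSError:
                continue
            where = "inside" if inside else "outside"
            print("%s (pid %d) is %s vdsmd.service" % (name, pid, where))
```

Calling `report()` on the host after the fix would be expected to list the gluster processes as outside the vdsmd cgroup. If they are reported as inside, a vdsmd restart may take them down along with the service, which matches the failure described in this bug.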