Bug 1443156
Summary: | HE agent is slow when faced with stale nfs mount | ||
---|---|---|---|
Product: | [oVirt] ovirt-hosted-engine-ha | Reporter: | Jiri Belka <jbelka> |
Component: | General | Assignee: | bugs <bugs> |
Status: | CLOSED DUPLICATE | QA Contact: | Nikolai Sednev <nsednev> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 2.1.0.5 | CC: | bugs, msivak, nsoffer, rhodain, tnisan |
Target Milestone: | ovirt-4.2.5 | Flags: | rule-engine: ovirt-4.2? rule-engine: ovirt-4.3+ |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-07-02 08:53:25 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Jiri Belka
2017-04-18 15:26:43 UTC
I sincerely doubt there's anything we can do here - the mount is stale, and NFS itself takes too long to return an error. Nir, am I missing anything?

(In reply to Allon Mureinik from comment #2)
> I sincerely doubt there's anything we can do here - the mount is stale, and
> NFS itself takes too long to return an error.

If the mount was stale, we would never succeed with extracting the ovf. Seems that the nfs server was simply very slow. I don't know what we can do better in this case.

(In reply to Nir Soffer from comment #3)
> (In reply to Allon Mureinik from comment #2)
> > I sincerely doubt there's anything we can do here - the mount is stale, and
> > NFS itself takes too long to return an error.
>
> If the mount was stale, we would never succeed with extracting the ovf.
> Seems that the nfs server was simply very slow. I don't know what we can do
> better in this case.

So just CLOSE CANTFIX?

I would check the logs first to understand this issue better.

We have just hit this issue in our testing lab when our export domain and NFS domain got stale. The hosted engine SD is placed on an FC domain and the agent got stuck on both of the HE nodes.

The problem is in ovirt_hosted_engine_ha/lib/heconflib.py, in the method get_volume_path. We create the volume path like this:

    volume_path = os.path.join(
        volume_path,
        '*',
        sd_uuid,
        'images',
        img_uuid,
        vol_uuid,
    )

The volume path looks like this in our case:

/rhev/data-center/mnt/*/27da7524-f4b7-41d9-bcc4-c524e4540568/images/1853ae71-943f-4b70-81cb-5e5bcb538524/f2208f13-2f76-46f9-89ba-44a1a0c2ac43

As there are also other mount points besides the HE SD, the wildcard lookup is delayed on the stale ISO and export domains.

Changing to high to bring attention to this issue, as it affects HE availability in case of networking issues.

Based on comment 6, moving to integration team. Martin, can you check this?

We already fixed this in 4.2.
https://bugzilla.redhat.com/show_bug.cgi?id=1485883
https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-ha.git;a=commit;h=b4730a1c1da19b7690c21eeb28531f6a4028b194

And in 4.1.8:
https://bugzilla.redhat.com/1485883
https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-ha.git;a=commit;h=0d15a8678e17ebcef3e208772d5676a2fd23205d

*** This bug has been marked as a duplicate of bug 1485883 ***
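
For readers following the analysis of the '*' component in the volume path, here is a minimal sketch of why unrelated mounts get involved. It uses a temporary directory in place of /rhev/data-center/mnt, and the mount names and the helpers build_volume_pattern, mounts_probed and demo are hypothetical, introduced only for illustration. It is not the code from ovirt-hosted-engine-ha and it is not the fix referenced in bug 1485883; it only shows that expanding the wildcard makes every entry under the mount root a candidate that has to be checked.

```python
import glob
import os
import tempfile


def build_volume_pattern(mnt_root, sd_uuid, img_uuid, vol_uuid):
    """Build a wildcard pattern shaped like the one quoted in the comments."""
    return os.path.join(mnt_root, '*', sd_uuid, 'images', img_uuid, vol_uuid)


def mounts_probed(mnt_root):
    """Return every entry under mnt_root that the '*' component expands to."""
    return sorted(glob.glob(os.path.join(mnt_root, '*')))


def demo():
    # UUIDs taken from the example path in the bug report.
    sd_uuid = '27da7524-f4b7-41d9-bcc4-c524e4540568'
    img_uuid = '1853ae71-943f-4b70-81cb-5e5bcb538524'
    vol_uuid = 'f2208f13-2f76-46f9-89ba-44a1a0c2ac43'

    with tempfile.TemporaryDirectory() as mnt_root:
        # Hypothetical mount holding the hosted engine storage domain.
        image_dir = os.path.join(
            mnt_root, 'he_storage', sd_uuid, 'images', img_uuid)
        os.makedirs(image_dir)
        open(os.path.join(image_dir, vol_uuid), 'w').close()

        # Hypothetical unrelated mounts (stand-ins for export/ISO domains).
        for other in ('export_domain', 'iso_domain'):
            os.makedirs(os.path.join(mnt_root, other))

        pattern = build_volume_pattern(mnt_root, sd_uuid, img_uuid, vol_uuid)
        matches = [os.path.relpath(m, mnt_root) for m in glob.glob(pattern)]
        probed = [os.path.basename(p) for p in mounts_probed(mnt_root)]
        return matches, probed


if __name__ == '__main__':
    matches, probed = demo()
    print('candidates for "*":', probed)
    print('matching volume paths:', matches)
```

Only one path matches, but all three directories are candidates for the wildcard. On a real host each candidate is a separate mount, so a lookup inside a stale or very slow NFS mount can block the whole search even when the hosted engine storage itself is healthy.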