Bug 861444
Summary: | rhevm-iso-uploader fails because NFS client doesn't autonegotiate | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Keith Robertson <kroberts> | |
Component: | ovirt-engine-iso-uploader | Assignee: | Kiril Nesenko <knesenko> | |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Leonid Natapov <lnatapov> | |
Severity: | unspecified | Docs Contact: | ||
Priority: | urgent | |||
Version: | 3.1.0 | CC: | acathrow, ashetty, dfediuck, dyasny, grajaiya, hateya, iheim, italkohe, jmoran, knesenko, lnatapov, mgoldboi, Rhev-m-bugs, sgrinber, thildred, vbellur, ykaul | |
Target Milestone: | --- | Keywords: | ZStream | |
Target Release: | 3.2.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | integration | |||
Fixed In Version: | rhevm-iso-uploader-3.2.0-0.1.master.el6ev.noarch.rpm | Doc Type: | Bug Fix | |
Doc Text: |
No doc text required.
|
Story Points: | --- | |
Clone Of: | 857028 | |||
: | 883511 (view as bug list) | Environment: | ||
Last Closed: | 2013-04-09 12:46:12 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 796352, 857028, 883515 | |||
Bug Blocks: | 858626, 915537 |
Comment 1
Ohad Basan
2012-10-14 13:26:52 UTC
merged downstream: https://gerrit.eng.lab.tlv.redhat.com/gitweb?p=rhevm-iso-uploader.git;a=commit;h=e26c1c8a32a0040fdc61c803bd8302e68e7c4aa8

Still fails on si24.4.

The build comes with rhevm-image-uploader-3.1.0-7.el6ev.noarch. I'm running kernel 2.6.32-279.14.1.el6.x86_64.

"rpm -q --requires rhevm-iso-uploader" looks like this:
/usr/bin/python
config(rhevm-iso-uploader) = 3.1.0-8.el6ev
kernel >= 2.6.32-279.1.1
rhevm-sdk
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PartialHardlinkSets) <= 4.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

Failure looks like this:
[root@aqua-rhel ~]# rhevm-image-uploader -v upload -e Export-Gluster /tmp/tailed_engine_log.log
Please provide the REST API password for the admin@internal RHEV-M user (CTRL+D to abort):
DEBUG: API Vendor(Red Hat) API Version(3.1.0)
DEBUG: id=12b61cf3-73d9-4c2e-bf74-44b7ab582958 address=filer01.qa.lab.tlv.redhat.com path=/paikov1
DEBUG: local NFS mount point is /tmp/tmpibxfW8
DEBUG: NFS mount command (/bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/tmpibxfW8)
DEBUG: /bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/tmpibxfW8
DEBUG: _cmds(['/bin/mount', '-t', 'nfs', '-o', 'rw,sync,soft', 'filer01.qa.lab.tlv.redhat.com:/paikov1', '/tmp/tmpibxfW8'])
DEBUG: returncode(32)
DEBUG: STDOUT()
DEBUG: STDERR(mount.nfs: Connection timed out )
ERROR: mount.nfs: Connection timed out
DEBUG: /bin/umount -t nfs -f /tmp/tmpibxfW8
DEBUG: /bin/umount -t nfs -f /tmp/tmpibxfW8
DEBUG: _cmds(['/bin/umount', '-t', 'nfs', '-f', '/tmp/tmpibxfW8'])
DEBUG: returncode(1)
DEBUG: STDOUT()
DEBUG: STDERR(umount2: Invalid argument umount: /tmp/tmpibxfW8: not mounted )
DEBUG: umount2: Invalid argument umount: /tmp/tmpibxfW8: not mounted

(In reply to comment #5)
> Still fails on si24.4.
>
> The build comes with rhevm-image-uploader-3.1.0-7.el6ev.noarch. I'm running
> kernel 2.6.32-279.14.1.el6.x86_64.
I'm not sure why but Kiril changed the minimum kernel version from 2.6.32-280 to 2.6.32-279.1.1 in comment 3. According to bug 796352, the kernel didn't get the NFS autonegotiate fix until 2.6.32-280. Kiril?

2.6.32-280 is flagged as 6.4+ 6.3-.
this bug was back-ported to 6.3 on https://bugzilla.redhat.com/show_bug.cgi?id=832365, and was released with kernel-2.6.32-279.1.1.el6 .

anyhow, i don't see this bug as blocker, Yaniv?

(In reply to comment #7)
> 2.6.32-280 is flagged as 6.4+ 6.3-.
> this bug was back-ported to 6.3 on
> https://bugzilla.redhat.com/show_bug.cgi?id=832365, and was released with
> kernel-2.6.32-279.1.1.el6 .
>
Ah. Makes sense.

> anyhow, i don't see this bug as blocker, Yaniv?

Ok, the BZ has clearly been backported to this kernel and yet autonegotiate still doesn't work. It would be helpful to see verbose output from a mount command on aqua-rhel to filer01 wherein version isn't supplied [1].

I really didn't want to pin the NFS version in the code because it seemed clunky and there is a kernel fix. However, if we need an urgent tactical solution it is an easy fix as the NFS mount options are a global variable right at the top of the program.

[1] /bin/mount -v -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/somedir

(In reply to comment #9)
> (In reply to comment #7)
> > 2.6.32-280 is flagged as 6.4+ 6.3-.
> > this bug was back-ported to 6.3 on
> > https://bugzilla.redhat.com/show_bug.cgi?id=832365, and was released with
> > kernel-2.6.32-279.1.1.el6 .
> >
> Ah. Makes sense.
> > anyhow, i don't see this bug as blocker, Yaniv?
>
> Ok, the BZ has clearly been backported to this kernel and yet autonegotiate
> still doesn't work. It would be helpful to see verbose output from a mount
> command on aqua-rhel to filer01 wherein version isn't supplied [1].
>
> I really didn't want to pin the NFS version in the code because it seemed
> clunky and there is a kernel fix.
> However, if we need an urgent tactical
> solution it is an easy fix as the NFS mount options are a global variable
> right at the top of the program.
>
> [1] /bin/mount -v -t nfs -o rw,sync,soft
> filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/somedir

Output looks like this:
[root@aqua-rhel ~]# /bin/mount -v -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/somedir
mount.nfs: timeout set for Thu Nov 29 10:14:30 2012
mount.nfs: trying text-based options 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
mount.nfs: mount(2): Connection refused
[...many retries...]
mount.nfs: Connection timed out

(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #7)
> > > 2.6.32-280 is flagged as 6.4+ 6.3-.
> > > this bug was back-ported to 6.3 on
> > > https://bugzilla.redhat.com/show_bug.cgi?id=832365, and was released with
> > > kernel-2.6.32-279.1.1.el6 .
> > >
> > Ah. Makes sense.
> > > anyhow, i don't see this bug as blocker, Yaniv?
> >
> > Ok, the BZ has clearly been backported to this kernel and yet autonegotiate
> > still doesn't work. It would be helpful to see verbose output from a mount
> > command on aqua-rhel to filer01 wherein version isn't supplied [1].
> >
> > I really didn't want to pin the NFS version in the code because it seemed
> > clunky and there is a kernel fix. However, if we need an urgent tactical
> > solution it is an easy fix as the NFS mount options are a global variable
> > right at the top of the program.
> >
> > [1] /bin/mount -v -t nfs -o rw,sync,soft
> > filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/somedir
>
> Output looks like this:
> [root@aqua-rhel ~]# /bin/mount -v -t nfs -o rw,sync,soft
> filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/somedir
> mount.nfs: timeout set for Thu Nov 29 10:14:30 2012
> mount.nfs: trying text-based options
> 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
> mount.nfs: mount(2): Connection refused
> mount.nfs: trying text-based options
> 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
> mount.nfs: mount(2): Connection refused
> [...many retries...]
> mount.nfs: Connection timed out

OK, I'm stumped. This should work (according to the manpages and the NFS protocol). We can clearly see above that mount.nfs keeps retrying with vers=4. I will augment the tool so that you can override the "vers" value with the default being 3.

Tactical fix is here: https://gerrit.eng.lab.tlv.redhat.com/3518

Still doesn't work. Tested with rhevm-image-uploader-3.2.0-0.1.beta.el6ev.noarch.

[root@purple-vds2 tmp]# rhevm-image-uploader -v upload -e export /tmp/rhev-hypervisor-advanced-3.2-20130307.0.el6ev.noarch.rpm
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to abort):
ERROR: Unable to connect to REST API. Reason: Unauthorized
ERROR: 'NoneType' object is not iterable
INFO: Use the -h option to see usage.
DEBUG: Configuration:
DEBUG: command: upload
DEBUG: Traceback (most recent call last):
DEBUG: File "/usr/bin/rhevm-image-uploader", line 1403, in <module>
DEBUG: imageup = ImageUploader(conf)
DEBUG: File "/usr/bin/rhevm-image-uploader", line 329, in __init__
DEBUG: self.upload_to_storage_domain()
DEBUG: File "/usr/bin/rhevm-image-uploader", line 1186, in upload_to_storage_domain
DEBUG: (id, address, path) = self.get_host_and_path_from_export_domain(self.configuration.get('export_domain'))
DEBUG: TypeError: 'NoneType' object is not iterable

[root@purple-vds2 tmp]# rhevm-image-uploader -v upload -e export /tmp/rhev-hypervisor-advanced-3.2-20130307.0.el6ev.noarch.rpm
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to abort):
DEBUG: API Vendor(Red Hat) API Version(3.2.0)
DEBUG: id=2c141113-3e4d-4b5c-89a4-e504a7aa93be address=filer01.qa.lab.tlv.redhat.com path=/DORON1
DEBUG: local NFS mount point is /tmp/tmpiMCR_l
DEBUG: NFS mount command (/bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/DORON1 /tmp/tmpiMCR_l)
DEBUG: /bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/DORON1 /tmp/tmpiMCR_l
DEBUG: _cmds(['/bin/mount', '-t', 'nfs', '-o', 'rw,sync,soft', 'filer01.qa.lab.tlv.redhat.com:/DORON1', '/tmp/tmpiMCR_l'])
/tmp/rhev-hypervisor-advanced-3.2-20130307.0.el6ev.noarch.rpm
DEBUG: returncode(32)
DEBUG: STDOUT()
DEBUG: STDERR(mount.nfs: Connection timed out )
ERROR: mount.nfs: Connection timed out
DEBUG: /bin/umount -t nfs -f /tmp/tmpiMCR_l
DEBUG: /bin/umount -t nfs -f /tmp/tmpiMCR_l
DEBUG: _cmds(['/bin/umount', '-t', 'nfs', '-f', '/tmp/tmpiMCR_l'])
DEBUG: returncode(1)
DEBUG: STDOUT()
DEBUG: STDERR(umount2: Invalid argument umount: /tmp/tmpiMCR_l: not mounted )
DEBUG: umount2: Invalid argument umount: /tmp/tmpiMCR_l: not mounted

1. There is a newer version - rhevm-iso-uploader-3.2.0-1.el6ev
2. Which RHEL version did you test it on?

rhel 6.4. It is not clear what should be tested.
iso-uploader or image-uploader ?
Bug title says iso-uploader but from the bug itself and in the thread it looks like it should be image-uploader.

(In reply to comment #18)
> rhel 6.4. It is not clear what should be tested. iso-uploader or
> image-uploader ?
> Bug title says iso-uploader but from the bug itself and in the thread it
> looks like it should be image-uploader.

should work on both - please make sure you are working with kernel-2.6.32-280.el6 and higher.

I just verified it on our servers and it works. Seems like you have some storage configuration issues.
Leonid, please retest it on another storage domain.

The bug explicitly says that the problem was when working against gluster nfs server.

From the thread above I understood that the problem was not with the tool but with the kernel and it was fixed in kernel-2.6.32-280 and above.

RHEVM machine is RHEL 6.4 with kernel 2.6.32-358.el6.x86_64
Gluster machine is RHEL 6.2 with kernel 2.6.32-220.23.1

Any thoughts ?
Putting needinfo on Keith.

Leonid,
The problem really isn't with the tools; it is with NFS. What I discovered (comment 1) was that the NFS server wasn't auto-negotiating the NFS version. As such, it really doesn't matter if you test on a Gluster enabled RHEL system or not (though you can for posterity's sake).

You really just need to...
1) Confirm that the ISO/Image uploader RPMs force kernel >= kernel-2.6.32-280
2) Confirm that a file uploads when the kernel is >= kernel-2.6.32-280

If 2 fails then I would like to see the output from the following command...
- /bin/mount -v -t nfs -o rw,sync,soft <your server here>:/path/to/iso/domain /some/directory/here

FYI: Notice that the mount command above does *not* specify the NFS version. This should force the client and the server to auto-negotiate an appropriate NFS version (i.e. 3 or 4). If they cannot negotiate then there is still a bug in the kernel. You will see the negotiation sequence in the output.
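
The two checks above can be roughed out in Python, the language the uploader tools are written in. This is only a sketch and not the uploader's actual code: the helper names kernel_fields, kernel_at_least and build_mount_cmd are made up for illustration, and the version comparison is a plain numeric compare of the leading fields rather than RPM's own algorithm. It therefore knows nothing about backports such as 2.6.32-279.1.1. The optional vers argument mirrors the tactical override discussed earlier in the thread; leaving it out keeps the command identical to the one above so that client and server negotiate the version.

```python
import platform
import re

MIN_KERNEL = "2.6.32-280"  # minimum named in this thread


def kernel_fields(release):
    """Return the leading numeric fields of a kernel release string.

    "2.6.32-358.el6.x86_64" -> [2, 6, 32, 358]
    """
    fields = []
    for token in re.split(r"[.-]", release):
        if not token.isdigit():
            break  # stop at the first non-numeric part, e.g. "el6"
        fields.append(int(token))
    return fields


def kernel_at_least(release, minimum=MIN_KERNEL):
    """Plain numeric comparison; it cannot account for backported fixes."""
    return kernel_fields(release) >= kernel_fields(minimum)


def build_mount_cmd(server, export_path, mount_point, vers=None, verbose=False):
    """Build the NFS mount argv; no 'vers' option unless one is requested."""
    options = "rw,sync,soft"
    if vers is not None:
        options += ",vers=%d" % vers
    cmd = ["/bin/mount"]
    if verbose:
        cmd.append("-v")
    cmd += ["-t", "nfs", "-o", options,
            "%s:%s" % (server, export_path), mount_point]
    return cmd


if __name__ == "__main__":
    print("running kernel new enough:", kernel_at_least(platform.release()))
    print(" ".join(build_mount_cmd("filer01.qa.lab.tlv.redhat.com",
                                   "/paikov1", "/tmp/somedir", verbose=True)))
```

The command is only built and printed here, never executed; passing the list to subprocess would need root and a reachable server.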
(In reply to comment #21)
> The bug explicitly says that the problem was when working against gluster
> nfs server.
>
>
> From the thread above I understood that the problem was not with the tool
> but with the kernel and it was fixed in kernel-2.6.32-280 and above.
>
> RHEVM machine is RHEL 6.4 with kernel 2.6.32-358.el6.x86_64
> Gluster machine is RHEL 6.2 with kernel 2.6.32-220.23.1
>
> Any thoughts ?
> Putting needinfo on Keith.

Leonid, can you check the flow Keith indicated?

please reopen if relevant.
|
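
When checking that flow, the negotiation sequence in the verbose mount output can be summarised mechanically. The sketch below is an illustration only: attempted_versions is a hypothetical helper, and it assumes the "trying text-based options" line format seen in the mount.nfs output pasted earlier in this bug, with each attempt on a single line.

```python
import re


def attempted_versions(verbose_output):
    """List the NFS versions mount.nfs reported trying, in order, no repeats."""
    seen = []
    for line in verbose_output.splitlines():
        if "trying text-based options" not in line:
            continue
        match = re.search(r"\bvers=(\d+(?:\.\d+)?)", line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample copied from the verbose output earlier in this bug.
SAMPLE = """\
mount.nfs: timeout set for Thu Nov 29 10:14:30 2012
mount.nfs: trying text-based options 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
mount.nfs: mount(2): Connection refused
mount.nfs: Connection timed out
"""

if __name__ == "__main__":
    print(attempted_versions(SAMPLE))
```

For the sample, only version 4 shows up, which matches the observation in the thread that mount.nfs keeps retrying with vers=4 and never falls back to 3.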