Bug 861444

Summary: rhevm-iso-uploader fails because NFS client doesn't autonegotiate
Product: Red Hat Enterprise Virtualization Manager
Reporter: Keith Robertson <kroberts>
Component: ovirt-engine-iso-uploader
Assignee: Kiril Nesenko <knesenko>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Leonid Natapov <lnatapov>
Severity: unspecified
Docs Contact:
Priority: urgent
Version: 3.1.0
CC: acathrow, ashetty, dfediuck, dyasny, grajaiya, hateya, iheim, italkohe, jmoran, knesenko, lnatapov, mgoldboi, Rhev-m-bugs, sgrinber, thildred, vbellur, ykaul
Target Milestone: ---
Keywords: ZStream
Target Release: 3.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: integration
Fixed In Version: rhevm-iso-uploader-3.2.0-0.1.master.el6ev.noarch.rpm
Doc Type: Bug Fix
Doc Text: No doc text required.
Story Points: ---
Clone Of: 857028
: 883511
Environment:
Last Closed: 2013-04-09 12:46:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 796352, 857028, 883515
Bug Blocks: 858626, 915537

Comment 1 Ohad Basan 2012-10-14 13:26:52 UTC
Downstream patch posted for review:

https://gerrit.eng.lab.tlv.redhat.com/#/c/2597/

Comment 5 Daniel Paikov 2012-11-28 15:36:45 UTC
Still fails on si24.4.

The build comes with rhevm-image-uploader-3.1.0-7.el6ev.noarch. I'm running kernel 2.6.32-279.14.1.el6.x86_64.

"rpm -q --requires rhevm-iso-uploader" looks like this:
/usr/bin/python  
config(rhevm-iso-uploader) = 3.1.0-8.el6ev
kernel >= 2.6.32-279.1.1
rhevm-sdk  
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PartialHardlinkSets) <= 4.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

Failure looks like this:
[root@aqua-rhel ~]# rhevm-image-uploader -v upload -e Export-Gluster /tmp/tailed_engine_log.log 
Please provide the REST API password for the admin@internal RHEV-M user (CTRL+D to abort): 
DEBUG: API Vendor(Red Hat)      API Version(3.1.0)
DEBUG: id=12b61cf3-73d9-4c2e-bf74-44b7ab582958 address=filer01.qa.lab.tlv.redhat.com path=/paikov1
DEBUG: local NFS mount point is /tmp/tmpibxfW8
DEBUG: NFS mount command (/bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/tmpibxfW8)
DEBUG: /bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/tmpibxfW8
DEBUG: _cmds(['/bin/mount', '-t', 'nfs', '-o', 'rw,sync,soft', 'filer01.qa.lab.tlv.redhat.com:/paikov1', '/tmp/tmpibxfW8'])

DEBUG: returncode(32)
DEBUG: STDOUT()
DEBUG: STDERR(mount.nfs: Connection timed out
)
ERROR: mount.nfs: Connection timed out

DEBUG: /bin/umount -t nfs -f  /tmp/tmpibxfW8
DEBUG: /bin/umount -t nfs -f  /tmp/tmpibxfW8
DEBUG: _cmds(['/bin/umount', '-t', 'nfs', '-f', '/tmp/tmpibxfW8'])
DEBUG: returncode(1)
DEBUG: STDOUT()
DEBUG: STDERR(umount2: Invalid argument
umount: /tmp/tmpibxfW8: not mounted
)
DEBUG: umount2: Invalid argument
umount: /tmp/tmpibxfW8: not mounted

Comment 6 Keith Robertson 2012-11-28 15:47:02 UTC
(In reply to comment #5)
> Still fails on si24.4.
> 
> The build comes with rhevm-image-uploader-3.1.0-7.el6ev.noarch. I'm running
> kernel 2.6.32-279.14.1.el6.x86_64.

I'm not sure why but Kiril changed the minimum kernel version from 2.6.32-280 to 2.6.32-279.1.1 in comment 3.  According to bug 796352, the kernel didn't get the NFS autonegotiate fix until 2.6.32-280.  Kiril?

Comment 7 Moran Goldboim 2012-11-28 21:26:52 UTC
2.6.32-280 is flagged as 6.4+ 6.3-.
The fix was back-ported to 6.3 in https://bugzilla.redhat.com/show_bug.cgi?id=832365 and was released with kernel-2.6.32-279.1.1.el6.

Anyhow, I don't see this bug as a blocker. Yaniv?
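
The version question in comments 6 and 7 comes down to comparing kernel release strings field by field. A rough Python sketch of such a check follows. It is a simplification and not RPM's own version comparison, and the function names kernel_tuple and meets_minimum are hypothetical.

```python
import re


def kernel_tuple(release):
    """Turn '2.6.32-279.14.1.el6.x86_64' into (2, 6, 32, 279, 14, 1)."""
    numeric = []
    for part in re.split(r"[.-]", release):
        if not part.isdigit():
            break  # stop at the first non-numeric field such as 'el6'
        numeric.append(int(part))
    return tuple(numeric)


def meets_minimum(running, minimum):
    """Return True if the running kernel release is at least the minimum."""
    return kernel_tuple(running) >= kernel_tuple(minimum)
```

With this ordering, the kernel from comment 5 (2.6.32-279.14.1.el6.x86_64) satisfies the packaged requirement of 2.6.32-279.1.1, while 2.6.32-280 sorts above both.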

Comment 9 Keith Robertson 2012-11-29 03:07:41 UTC
(In reply to comment #7)
> 2.6.32-280 is flagged as 6.4+ 6.3-.
> this bug was back-ported to 6.3 on
> https://bugzilla.redhat.com/show_bug.cgi?id=832365, and was released with
> kernel-2.6.32-279.1.1.el6 .
> 
Ah. Makes sense.
> anyhow, i don't see this bug as blocker, Yaniv?

OK, the fix from that BZ has clearly been backported to this kernel, and yet autonegotiation still doesn't work.  It would be helpful to see verbose output from a mount command on aqua-rhel to filer01 in which the version isn't supplied [1].

I really didn't want to pin the NFS version in the code because it seemed clunky and there is a kernel fix.  However, if we need an urgent tactical solution, it is an easy fix, as the NFS mount options are a global variable right at the top of the program.

[1] /bin/mount -v -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/somedir

Comment 10 Daniel Paikov 2012-11-29 08:19:24 UTC
(In reply to comment #9)

Output looks like this:
[root@aqua-rhel ~]# /bin/mount -v -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/paikov1 /tmp/somedir
mount.nfs: timeout set for Thu Nov 29 10:14:30 2012
mount.nfs: trying text-based options 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'soft,vers=4,addr=10.35.64.21,clientaddr=10.35.113.33'
mount.nfs: mount(2): Connection refused
[...many retries...]
mount.nfs: Connection timed out

Comment 11 Keith Robertson 2012-11-29 14:14:00 UTC
(In reply to comment #10)

OK, I'm stumped. According to the man pages and the NFS protocol, this should work.  We can clearly see above that mount.nfs keeps retrying with vers=4.

I will augment the tool so that you can override the "vers" value, with the default being 3.
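
A minimal sketch of what such an override could look like, assuming the mount options live in a module-level variable as described in comment 9. The names NFS_MOUNT_OPTS, DEFAULT_NFS_VERS and build_mount_cmd are hypothetical and are not taken from the tool's source.

```python
# Hypothetical module-level defaults; the real tool's variable names may differ.
NFS_MOUNT_OPTS = ["rw", "sync", "soft"]
DEFAULT_NFS_VERS = "3"


def build_mount_cmd(address, export_path, mount_point, nfs_vers=DEFAULT_NFS_VERS):
    """Build the argv list for mounting an NFS export.

    Passing nfs_vers=None leaves the version out, so the client and
    server are left to negotiate it themselves.
    """
    opts = list(NFS_MOUNT_OPTS)
    if nfs_vers is not None:
        opts.append("vers=%s" % nfs_vers)
    return [
        "/bin/mount", "-t", "nfs",
        "-o", ",".join(opts),
        "%s:%s" % (address, export_path),
        mount_point,
    ]
```

Called with nfs_vers=None, this produces the same argument list that appears in the DEBUG output of comment 5; with the default it appends vers=3 to the option string.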

Comment 12 Keith Robertson 2012-11-30 03:15:54 UTC
Tactical fix is here: https://gerrit.eng.lab.tlv.redhat.com/3518

Comment 16 Leonid Natapov 2013-03-17 16:18:44 UTC
Still doesn't work. 
Tested with rhevm-image-uploader-3.2.0-0.1.beta.el6ev.noarch.

[root@purple-vds2 tmp]# rhevm-image-uploader -v upload -e export /tmp/rhev-hypervisor-advanced-3.2-20130307.0.el6ev.noarch.rpm 
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to abort): 
ERROR: Unable to connect to REST API.  Reason: Unauthorized
ERROR: 'NoneType' object is not iterable
INFO: Use the -h option to see usage.
DEBUG: Configuration:
DEBUG: command: upload
DEBUG: Traceback (most recent call last):
DEBUG:   File "/usr/bin/rhevm-image-uploader", line 1403, in <module>
DEBUG:     imageup = ImageUploader(conf)
DEBUG:   File "/usr/bin/rhevm-image-uploader", line 329, in __init__
DEBUG:     self.upload_to_storage_domain()
DEBUG:   File "/usr/bin/rhevm-image-uploader", line 1186, in upload_to_storage_domain
DEBUG:     (id, address, path) = self.get_host_and_path_from_export_domain(self.configuration.get('export_domain'))
DEBUG: TypeError: 'NoneType' object is not iterable
[root@purple-vds2 tmp]# rhevm-image-uploader -v upload -e export /tmp/rhev-hypervisor-advanced-3.2-20130307.0.el6ev.noarch.rpm 
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to abort): 
DEBUG: API Vendor(Red Hat)      API Version(3.2.0)
DEBUG: id=2c141113-3e4d-4b5c-89a4-e504a7aa93be address=filer01.qa.lab.tlv.redhat.com path=/DORON1
DEBUG: local NFS mount point is /tmp/tmpiMCR_l
DEBUG: NFS mount command (/bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/DORON1 /tmp/tmpiMCR_l)
DEBUG: /bin/mount -t nfs -o rw,sync,soft filer01.qa.lab.tlv.redhat.com:/DORON1 /tmp/tmpiMCR_l
DEBUG: _cmds(['/bin/mount', '-t', 'nfs', '-o', 'rw,sync,soft', 'filer01.qa.lab.tlv.redhat.com:/DORON1', '/tmp/tmpiMCR_l'])
/tmp/rhev-hypervisor-advanced-3.2-20130307.0.el6ev.noarch.rpm

DEBUG: returncode(32)
DEBUG: STDOUT()
DEBUG: STDERR(mount.nfs: Connection timed out
)
ERROR: mount.nfs: Connection timed out

DEBUG: /bin/umount -t nfs -f  /tmp/tmpiMCR_l
DEBUG: /bin/umount -t nfs -f  /tmp/tmpiMCR_l
DEBUG: _cmds(['/bin/umount', '-t', 'nfs', '-f', '/tmp/tmpiMCR_l'])
DEBUG: returncode(1)
DEBUG: STDOUT()
DEBUG: STDERR(umount2: Invalid argument
umount: /tmp/tmpiMCR_l: not mounted
)
DEBUG: umount2: Invalid argument
umount: /tmp/tmpiMCR_l: not mounted
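
The first attempt above fails for a different reason than the mount: after the "Unauthorized" error, get_host_and_path_from_export_domain evidently returned None, and unpacking None into (id, address, path) raised the TypeError. A hedged sketch of a guard that would report this more clearly follows. The helper name unpack_export_domain and the error wording are hypothetical, and this is not the tool's actual code.

```python
def unpack_export_domain(lookup_result, domain_name):
    """Unpack an (id, address, path) lookup result, failing loudly on None."""
    if lookup_result is None:
        # The lookup yields nothing when, for example, the REST API login failed.
        raise RuntimeError(
            "export domain %r could not be resolved; "
            "check the REST API credentials" % domain_name
        )
    domain_id, address, path = lookup_result
    return domain_id, address, path
```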

Comment 17 Kiril Nesenko 2013-03-17 18:08:19 UTC
1. There is a newer version - rhevm-iso-uploader-3.2.0-1.el6ev
2. Which RHEL version did you test it on?

Comment 18 Leonid Natapov 2013-03-17 22:09:20 UTC
RHEL 6.4. It is not clear what should be tested: iso-uploader or image-uploader?
The bug title says iso-uploader, but from the bug itself and the thread it looks like it should be image-uploader.

Comment 19 Moran Goldboim 2013-03-18 08:45:14 UTC
(In reply to comment #18)
> rhel 6.4. It is not clear what should be tested. iso-uploader or
> image-uploader ? 
> Bug title says iso-uploader  but from the bug itself and in the thread it
> looks like it should be image-uploader.

It should work on both. Please make sure you are working with kernel-2.6.32-280.el6 or higher.

Comment 20 Kiril Nesenko 2013-03-18 09:02:06 UTC
I just verified it on our servers and it works. Seems like you have some storage configuration issues. 
Leonid, please retest it on another storage domain.

Comment 21 Leonid Natapov 2013-03-18 13:46:14 UTC
The bug explicitly says that the problem occurred when working against a Gluster NFS server.

From the thread above I understood that the problem was not with the tool but with the kernel, and that it was fixed in kernel-2.6.32-280 and above.

The RHEVM machine is RHEL 6.4 with kernel 2.6.32-358.el6.x86_64.
The Gluster machine is RHEL 6.2 with kernel 2.6.32-220.23.1.

Any thoughts?
Putting needinfo on Keith.

Comment 22 Keith Robertson 2013-03-27 01:00:57 UTC
Leonid,

The problem really isn't with the tools; it is with NFS.  What I discovered (comment 1) was that the NFS server wasn't auto-negotiating the NFS version.  As such, it really doesn't matter whether you test on a Gluster-enabled RHEL system or not (though you can for posterity's sake).  You really just need to...

1) Confirm that the ISO/Image uploader RPMs force kernel >= kernel-2.6.32-280
2) Confirm that they will upload a file on kernel-2.6.32-280

If 2 fails, then I would like to see the output from the following command...
- /bin/mount -v -t nfs -o rw,sync,soft <your server here>:/path/to/iso/domain /some/directory/here

FYI: Notice that the mount command above does *not* specify the NFS version. This should force the client and the server to auto-negotiate an appropriate NFS version (i.e. 3 or 4).  If they cannot negotiate, then there is still a bug in the kernel.  You will see the negotiation sequence in the output.
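
To make the negotiation sequence easier to read off, the verbose output can be summarized with a few lines of Python. This is only a sketch that assumes the "trying text-based options" line format shown in comment 10; the function name attempted_versions is hypothetical.

```python
import re
from collections import Counter

# Matches the quoted option string on lines such as:
# mount.nfs: trying text-based options 'soft,vers=4,addr=...,clientaddr=...'
_ATTEMPT = re.compile(r"trying text-based options '([^']*)'")


def attempted_versions(verbose_output):
    """Count how often each vers= value appears in `mount -v` output."""
    counts = Counter()
    for options in _ATTEMPT.findall(verbose_output):
        for opt in options.split(","):
            if opt.startswith("vers="):
                counts[opt[len("vers="):]] += 1
    return counts
```

If the only key in the result is "4", as in the output from comment 10, that would suggest the client never attempted version 3.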




Comment 23 Moran Goldboim 2013-04-02 09:05:55 UTC
Leonid, can you check the flow Keith indicated?

Comment 26 Moran Goldboim 2013-04-09 12:46:12 UTC
Please reopen if relevant.