Bug 1111210

Summary: sanlock.get_hosts() off-by-one error when specifying the hostId argument
Product: Red Hat Enterprise Linux 6
Reporter: Nir Soffer <nsoffer>
Component: sanlock
Assignee: David Teigland <teigland>
Status: CLOSED CURRENTRELEASE
QA Contact: Aharon Canan <acanan>
Severity: high
Docs Contact:
Priority: high
Version: 6.5
CC: acanan, agk, amureini, benl, cluster-maint, ebenahar, jharriga, jherrman, jkurik, nsoffer, salmy, scohen, stirabos, teigland, tlavigne
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: sanlock-2.8-2.el6
Doc Type: Bug Fix
Doc Text:
Prior to this update, an off-by-one error caused the sanlock.get_hosts() function to return incorrect information about the host when it was used with the host_id argument. This error has been fixed and sanlock.get_hosts() returns the correct information when host_id is used.
Story Points: ---
Clone Of:
: 1131192 1139373 (view as bug list) Environment:
Last Closed: 2015-02-18 10:03:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1131192, 1131194, 1139373    
Attachments:
Description Flags
[PATCH] Fix off-by-one error in get_hosts none

Description Nir Soffer 2014-06-19 13:21:23 UTC
Description of problem:

sanlock.get_hosts cannot be used with a host_id parameter, because asking for host_id=1 returns the info for host_id=2.

[root@voodoo2 nsoffer]# python
Python 2.6.6 (r266:84292, Nov 21 2013, 10:50:32) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sanlock
>>> sanlock.get_hosts('01b7eb33-e2eb-43bf-a3dd-0f0f09023dc5')
[{'generation': 25, 'host_id': 1, 'flags': 3, 'io_timeout': 10, 'timestamp': 852460}, {'generation': 27, 'host_id': 2, 'flags': 3, 'io_timeout': 10, 'timestamp': 854412}]
>>> sanlock.get_hosts('01b7eb33-e2eb-43bf-a3dd-0f0f09023dc5', 1)
[{'generation': 27, 'host_id': 2, 'flags': 3, 'io_timeout': 10, 'timestamp': 854432}]
>>> sanlock.get_hosts('01b7eb33-e2eb-43bf-a3dd-0f0f09023dc5', 2)
[{'generation': 1, 'host_id': 3, 'flags': 2, 'io_timeout': 10, 'timestamp': 0}]

You cannot work around this by passing host_id - 1: to get host_id=1 you would have to pass 0, which the bindings use as the default value meaning "give me the list of all hosts":

>>> sanlock.get_hosts('01b7eb33-e2eb-43bf-a3dd-0f0f09023dc5', 0)
[{'generation': 25, 'host_id': 1, 'flags': 3, 'io_timeout': 10, 'timestamp': 852871}, {'generation': 27, 'host_id': 2, 'flags': 3, 'io_timeout': 10, 'timestamp': 854843}]
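A possible client-side workaround on an unfixed build, sketched here as an assumption and not something proposed in this bug: since the call without a host id returns correct host_id values, fetch the full list and filter it. The helper name get_host is hypothetical; get_hosts is passed in as an argument (normally sanlock.get_hosts) so the sketch does not depend on the fix being installed.

```python
def get_host(get_hosts, lockspace, host_id):
    """Return the status dict for host_id, or None if it is not listed.

    get_hosts is the function to call, normally sanlock.get_hosts.
    It is only called without a host id, which is the form that
    behaves correctly in the output shown in this report.
    """
    for host in get_hosts(lockspace):
        if host['host_id'] == host_id:
            return host
    return None
```

Usage would look like get_host(sanlock.get_hosts, '01b7eb33-e2eb-43bf-a3dd-0f0f09023dc5', 1). This reads the whole host list on every call, so it is a stopgap until the fixed package is available.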

Version-Release number of selected component (if applicable):
sanlock-2.8-1.el6.x86_64

How reproducible:
Always

Additional info:
Broken also on Fedora using sanlock-2.8-1.fc19.x86_64.

Comment 2 Nir Soffer 2014-06-19 15:43:25 UTC
Created attachment 910461
[PATCH] Fix off-by-one error in get_hosts

Comment 3 David Teigland 2014-06-19 15:44:21 UTC
Nir, where do you need this fix applied and released?

Comment 4 Nir Soffer 2014-06-19 16:07:46 UTC
I need it for ovirt-3.5, which must run on RHEL-6.5.

Comment 5 Nir Soffer 2014-08-05 22:05:55 UTC
David, this is fixed upstream:
https://git.fedorahosted.org/cgit/sanlock.git/commit/?id=2e5150be0ad662f218a5442bd1c40f12c825022d

We need this trivial fix for ovirt-3.5, which runs on EL 6.5 and Fedora 19 and 20.

Comment 6 David Teigland 2014-08-06 16:12:02 UTC
f19: http://koji.fedoraproject.org/koji/buildinfo?buildID=551025
f20: http://koji.fedoraproject.org/koji/buildinfo?buildID=551029

I'll attempt a rhel6.5 build once there's a bz I can use.

Comment 8 David Teigland 2014-08-06 16:23:59 UTC
Nir, what sanlock version will you want in rhel6.6?  A rebase to pending version 3.2.0 (which is going into rhel7.1)?  Or just selected fixes like this one?

Comment 9 Nir Soffer 2014-08-06 16:36:48 UTC
(In reply to David Teigland from comment #8)
> Nir, what sanlock version will you want in rhel6.6?  A rebase to pending
> version 3.2.0 (which is going into rhel7.1)?  Or just selected fixes like
> this one?

I don't know of any issue affecting rhev-3.5 other than this bug, so I will be happy with a backport of this trivial fix.

If there are any other important bug fixes, I guess they should also be backported. But I would like to depend on a stable version.

Comment 10 David Teigland 2014-08-06 16:53:19 UTC
OK, just this patch then for 6.6 and 6.5.z.

Comment 14 David Teigland 2014-08-18 15:42:41 UTC
In the past, I believe that RHEV QA has verified sanlock bugs by either exercising the relevant part of RHEV that had a problem, or when that's not possible, I have sometimes provided direct sanlock commands that can be run to demonstrate the correct behavior.

In this case I can't provide a sanlock command to do this directly because there is no option to select a single host id (it always prints all).  Perhaps the python api could be used to do this, though?

Comment 15 Allon Mureinik 2014-08-18 16:16:13 UTC
(In reply to David Teigland from comment #14)
> In the past, I believe that RHEV QA has verified sanlock bugs by either
> exercising the relevant part of RHEV that had a problem, or when that's not
> possible, I have sometimes provided direct sanlock commands that can be run
> to demonstrate the correct behavior.
> 
> In this case I can't provide a sanlock command to do this directly because
> there is no option to select a single host id (it always prints all). 
> Perhaps the python api could be used to do this, though?

Sounds good to me, but this seems like a procedural decision - which QA group "owns" sanlock.

Comment 16 Nir Soffer 2014-08-18 16:26:26 UTC
(In reply to David Teigland from comment #14)
> Perhaps the python api could be used to do this, though?

Yes - this is very simple:

1. Get a setup with some hosts
2. Note the host id of each host (grep HostId in vdsm.log)
3. Run (xxxxxx should be the domain uuid)

$ python
>>> import sanlock
>>> sanlock.get_hosts('xxxxxx')

This will return a list of host dicts.

>>> sanlock.get_hosts('xxxxxx', 1)

This should return a list with one dict for host 1, which must match the first item in the list returned by the first call (the one without the host id).
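The manual check above could also be scripted. The following is a minimal sketch, not an official test: the helper name find_mismatches is hypothetical, and get_hosts is passed in (normally sanlock.get_hosts) so the logic can be tried without a running sanlock daemon. It compares only host_id, because timestamps change between calls.

```python
def find_mismatches(get_hosts, lockspace):
    """Query each listed host by id and report ids that come back wrong.

    Returns a list of (requested_id, returned_ids) tuples; an empty
    list means every single-host query returned exactly the host
    that was asked for.
    """
    mismatches = []
    for expected in get_hosts(lockspace):
        wanted = expected['host_id']
        # Ask for one host and collect the ids actually returned.
        returned = [h['host_id'] for h in get_hosts(lockspace, wanted)]
        if returned != [wanted]:
            mismatches.append((wanted, returned))
    return mismatches
```

On a fixed build, find_mismatches(sanlock.get_hosts, 'xxxxxx') is expected to return an empty list. With the behavior shown in the description, it would report that asking for host 1 returned host 2, and asking for host 2 returned host 3.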

Comment 18 Allon Mureinik 2014-09-08 09:57:52 UTC
David, doesn't sanlock 3.2.1 contain a fix for this already?

Comment 19 Nir Soffer 2014-09-08 10:42:48 UTC
(In reply to Allon Mureinik from comment #18)
sanlock 3.2.1 contains this fix of course, but it is not available on el6.5. We asked for a backport of this tiny fix which is required for 3.5.

Comment 21 Nir Soffer 2015-02-02 16:33:48 UTC
Elad, can you verify this bug? You verified the same thing in bug 1139373.

Comment 22 Nir Soffer 2015-02-04 08:09:35 UTC
Based on acks and the cloned bug being verified, moving to ON_QA so Elad can verify this.

Comment 23 Elad 2015-02-05 09:37:06 UTC
Hi David,
On which OS should this bug be tested?
1139373 was already tested and verified on RHEL6.6.

Comment 24 David Teigland 2015-02-05 11:46:18 UTC
I'm not sure how zstream bugs are tested, but I imagine that 6.5.z bugs would be tested on 6.5.

Comment 25 Aharon Canan 2015-02-18 10:03:31 UTC
After consulting Gil and Allon:
This issue is fixed in rhev 3.5, which must run with rhel 6.6, so there is no need to verify using rhel 6.5.

We already verified it using rhel 6.6, so we are done.
https://bugzilla.redhat.com/show_bug.cgi?id=1139373