Bug 1508098 - [RFE] RHV should configure sanlock host name
Summary: [RFE] RHV should configure sanlock host name
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.1.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.4.4
Target Release: 4.4.4
Assignee: Nir Soffer
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On:
Blocks: 902971 1520566 1547336 1609029 1845909
 
Reported: 2017-10-31 20:10 UTC by Marina Kalinin
Modified: 2021-11-04 19:28 UTC (History)
CC: 13 users

Fixed In Version: vdsm-4.40.40
Doc Type: Bug Fix
Doc Text:
Cause: The sanlock host name was not configured, so sanlock picked a random host name on each run. Consequence: When activating a host after an unclean shutdown, sanlock had to wait a long time to ensure that another host was not using the lockspace. Fix: Configure the sanlock host name to the host hardware id. Result: Activating a host after an unclean shutdown is about 3 times faster.
Clone Of:
Environment:
Last Closed: 2021-02-02 13:59:36 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
ratamir: testing_plan_complete-




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:0382 0 None None None 2021-02-02 13:59:55 UTC
oVirt gerrit 112498 0 master MERGED tool: sanlock: Speed up add_lockspace 2021-02-17 01:17:45 UTC
oVirt gerrit 112502 0 master MERGED packaging: Configure vdsm after creating vdsm.id 2021-02-17 01:17:45 UTC

Description Marina Kalinin 2017-10-31 20:10:16 UTC
Today RHV does not specify our_host_name in sanlock.conf, so sanlock generates a random uuid each time it starts instead.

This makes it much harder to debug RHV sanlock issues, as in the example below. Sanlock simply does not have enough space to log the full host name together with the randomly generated uuid. This makes it impossible to correlate a sanlock uuid to the actual host, especially in environments with multiple hosts.

Example:
2017-10-31 13:07:20-0400 329 [exampo]: s1 host 1 233691 3552954 fdcadd1d-6505-44f6-9e3f-5bcabd02e8e5.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 2 6 4225585 4b2e5d32-7612-4a1c-a8dc-85290d7b80a9.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 3 10 4249851 e4330bde-1b88-4c78-af0a-aa4c526c43df.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 4 21 0 3934a00b-6a6d-47b4-a6e4-30fe9016259b.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 5 13 3620639 60706565-e7ab-4e70-9c34-480c3e616eea.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 6 12 4071186 b8897179-8739-45eb-a225-42c2e01cc2ea.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 7 20 4139635 99891d9c-3633-4741-aa55-20d3a86d60fc.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 8 8 3565781 50d01723-f71a-47f0-bdc0-2824fe26bae9.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 9 7 3990028 fb41dc71-2207-47e9-bab3-fcaf37807242.rhevhost
2017-10-31 13:07:20-0400 329 [1085]: s1 host 10 18 308 962addbd-158a-4888-bded-cfe5cfe1f4ca.localhost. 

Those are 10 different hosts trying to acquire a lease on the same sanlock lockspace s1. But there is no way to tell which host is which: they all look like the same "rhevhost", because the part of the name that would identify each host is truncated.

Comment 1 Yaniv Kaul 2017-10-31 22:05:38 UTC
The sanlock log is not produced by VDSM. Can you set it in the conf file so it will be displayed in the log?

Comment 2 Marina Kalinin 2017-11-01 13:04:40 UTC
(In reply to Yaniv Kaul from comment #1)
> sanlock log is not produced by VDSM. Can you set it in the conf and it'll
> display it in the log?

Yaniv, that was the actual advice from David, and that's why I opened this bug.
If this value is not set in the conf file, sanlock will generate a random value each time it starts.

Comment 3 David Teigland 2017-11-01 15:18:54 UTC
If sanlock is not configured with a host name (our_host_name in sanlock.conf), it will generate a new uuid to identify the host each time sanlock is started.  It is critical that the host name be unique, so sanlock doesn't just grab uname/hostname.  Using a uuid to identify the host has the downside of making it more difficult to debug, because you have to hunt down which host was using which uuid (by looking in sanlock.log).

If rhev/vdsm has a way of assigning globally unique names to every host, then it could put that name into sanlock.conf/our_host_name.  This could make things simpler to debug if those unique names were more readable than uuids.  Even if they were uuids, it could still be advantageous to make these uuid names persistent so that a host would use the same name/uuid each time it restarted.

I think that Nir had some ideas in this area, but I have forgotten the details.

Comment 5 Yaniv Kaul 2017-11-01 15:54:31 UTC
We have a host ID which is unique in RHV. Can't we use it?

Comment 6 David Teigland 2017-11-01 16:04:04 UTC
Is this host ID in RHV the same number that is given to sanlock to use as the sanlock host_id?  (For sanlock it must be a number between 1-2000.)  If we're talking about the same host_id, then no, this will not work, because these host_id numbers are not unique; they can be used for different hosts (even if only by mistake).  The issue prompting this bz is one case where two hosts are both trying to use the same host_id number (10).  A unique host name is how we detect the mistake of two hosts trying to use the same host_id.

Comment 7 Yaniv Kaul 2017-11-01 16:15:38 UTC
(In reply to David Teigland from comment #6)
> Is this host ID in RHV is the same number that is given to sanlock to use as
> the sanlock host_id?  (For sanlock it must be a number between 1-2000.)  If
> we're talking about the same host_id, then no this will not work because
> these host_id numbers are not unique, they can be used for different hosts
> (even if only by mistake).  The issue prompting this bz is one case where
> two hosts are both trying to use the same host_id number (10).  A unique
> host name is how we detect this mistake of two hosts trying to use the same
> host_id.

No, I was referring to:
[root@lago-basic-suite-master-host-0 vdsm]# cat /etc/vdsm/vdsm.id 
55B25364-93F2-43F9-B562-D46A69831E81

Comment 8 Allon Mureinik 2017-11-19 10:38:33 UTC
(In reply to Yaniv Kaul from comment #5)
> We have a host ID which is unique in RHV. Can't we use it?

I think so, if we build some mechanism to pipe it through.
David - any objection?

Comment 9 David Teigland 2017-11-20 16:23:38 UTC
Sounds good to me.  In sanlock terms, that would be the "host name", i.e. a unique string ID, whereas the "host ID" is a unique integer ID (between 1-2000).

sanlock will currently accept a host name from sanlock.conf or the command line (-e), but the command line method is not accessible when starting sanlock from the systemd unit file.  A future version could add a library interface for setting this.

Comment 10 Allon Mureinik 2017-11-21 10:31:46 UTC
(In reply to David Teigland from comment #9)
> sanlock will currently accept a host name from sanlock.conf or the command
> line (-e), but the command line method is not accessible when starting
> sanlock from the systemd unit file.  A future version could add a library
> interface for setting this.

Is this targeted for any upcoming RHEL version, or just a future RFE in the backlog?

Comment 12 Nir Soffer 2017-12-01 18:15:40 UTC
I discussed this with David today; we think the right way is to configure sanlock
with the vdsm host id and an optional host name (the same name used in engine?).

The format can be:

    vdsm-host-uuid[/truncated...name]

Vdsm will have to verify the sanlock configuration during startup. This can all be
done using the current sanlock configurator (vdsm-tool configure --module sanlock).
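
A hypothetical sketch (not the implemented format) of how such a value could be
composed, assuming a 48-character limit as suggested by the truncated names in
the log excerpt in the description; the helper name and constant are illustrative
only:

    # Hypothetical: compose "vdsm-host-uuid[/truncated...name]" for our_host_name.
    SANLOCK_NAME_LEN = 48  # assumption based on the truncated names in the log

    def make_sanlock_host_name(vdsm_host_uuid, host_name=None):
        if host_name:
            return ("%s/%s" % (vdsm_host_uuid, host_name))[:SANLOCK_NAME_LEN]
        return vdsm_host_uuid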

Comment 13 Tal Nisan 2018-08-14 14:38:46 UTC
(In reply to Nir Soffer from comment #12)
> I discussed this with David today, we think the right way is to configure
> sanlock
> with the vdsm host id and optional host name (same name used in engine?).
> 
> The format can be:
> 
>     vdsm-host-uuid[/truncated...name]
> 
> Vdsm will have to verify sanlock configuration during startup. All can be
> done
> using the current sanlock configurator (vdsm-tool configure --module sanlock)

So what should be done in order to close this bug?

Comment 14 Nir Soffer 2018-09-07 21:06:57 UTC
(In reply to Tal Nisan from comment #13)

This is not only for debugging, it is needed to avoid delays after unclean
shutdown. We must configure sanlock with a unique uuid that cannot be used on any
other host, see bug 1609029.

The host hardware id should be fine for this:
# cat /etc/vdsm/vdsm.id 
55B25364-93F2-43F9-B562-D46A69831E81

I would not add a truncated host name, since the host name can change on the engine side,
but we want a stable value that never changes in the sanlock configuration and on storage.

To locate host details using the host hardware id seen in storage, one can use
the SDK search feature:

    hosts_service = system_service.hosts_service()
    hosts = hosts_service.list(
        search="hw_id=%s" % hardware_id),
        case_sensitive=False,
    )
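
A more complete, self-contained sketch using the oVirt Python SDK (ovirtsdk4);
the engine URL and credentials below are placeholders, and the hardware id is the
example value from comment 7:

    import ovirtsdk4 as sdk

    # Placeholder engine URL and credentials.
    connection = sdk.Connection(
        url="https://engine.example.com/ovirt-engine/api",
        username="admin@internal",
        password="secret",
        ca_file="ca.pem",
    )
    try:
        system_service = connection.system_service()
        hosts_service = system_service.hosts_service()
        # Search hosts by hardware id (case-insensitive).
        hosts = hosts_service.list(
            search="hw_id=%s" % "55B25364-93F2-43F9-B562-D46A69831E81",
            case_sensitive=False,
        )
        for host in hosts:
            print(host.id, host.name, host.address)
    finally:
        connection.close()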

We need to change the sanlock configurator to configure sanlock host name.
https://github.com/oVirt/vdsm/blob/master/lib/vdsm/tool/configurators/sanlock.py

On a new install, this must be done before sanlock accesses any storage, during
"vdsm-tool configure --force".

On upgrades, the host must be moved to maintenance, then "vdsm-tool configure
--force" will stop sanlock, configure it, and start it again.

On upgrades, if we need to configure sanlock and the host is not in maintenance,
fail the upgrade. Trying to stop sanlock when it is connected to storage will
either fail after a long time or, if systemd kills sanlock, cause a reboot.

When starting vdsm, the value of our_host_name must be checked. If the value
does not match the hardware id, vdsm should refuse to start.
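
A minimal sketch of such a startup check (hypothetical, not the actual vdsm code),
comparing sanlock's our_host_name with the host hardware id:

    import re

    def sanlock_host_name_is_correct(vdsm_id_path="/etc/vdsm/vdsm.id",
                                     sanlock_conf_path="/etc/sanlock/sanlock.conf"):
        # Read the host hardware id written by vdsm.
        with open(vdsm_id_path) as f:
            hw_id = f.read().strip()
        # Look for our_host_name in sanlock.conf and compare.
        try:
            with open(sanlock_conf_path) as f:
                for line in f:
                    match = re.match(r"\s*our_host_name\s*=\s*(\S+)\s*$", line)
                    if match:
                        return match.group(1) == hw_id
        except FileNotFoundError:
            return False
        return False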

Comment 16 Sandro Bonazzola 2019-01-28 09:41:29 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 25 Marina Kalinin 2019-12-17 16:46:03 UTC
Hi David,

Is it possible to print the host uuid and hostname in sanlock.log at host boot? The full fix requires too many changes on the RHV side, and introducing such a change now carries some risk.

Comment 26 David Teigland 2019-12-17 17:06:47 UTC
The uuid (and truncated name) are already logged, e.g.

sanlock daemon started 3.7.0 host 789788eb-2a22-4cc9-bd54-b2786c704ddf.null-04.la

so it sounds like you are interested in seeing the entire hostname also?  I could add that; would this look ok:

sanlock daemon started 3.7.0 host 789788eb-2a22-4cc9-bd54-b2786c704ddf.null-04.la (null-04.lab.msp.redhat.com)

Comment 27 Nir Soffer 2019-12-17 17:11:45 UTC
(In reply to David Teigland from comment #26)
> sanlock daemon started 3.7.0 host
> 789788eb-2a22-4cc9-bd54-b2786c704ddf.null-04.la (null-04.lab.msp.redhat.com)

Looks good. The host uuid is not very useful today because we use a random uuid
for every boot, but once we configure sanlock properly with /etc/vdsm/vdsm.id
(easy) or the host uuid from engine (harder), it will be very useful.

Comment 28 David Teigland 2019-12-17 17:30:04 UTC
pushed to master
https://pagure.io/sanlock/c/cff348800722f7dadf030ffe7494c2df714996e3?branch=master

If this is the only code change for this bug, then the bz should probably be reassigned to sanlock and a future RHEL8 release.

Comment 29 Nir Soffer 2019-12-17 17:50:36 UTC
I think a separate sanlock bug would be better, since we really need to configure
sanlock host uuid properly. This can avoid delays when resuming from unclean
shutdown, and will make it easier to debug issues in the cluster.

For example, if we use the uuid from /etc/vdsm/vdsm.id, then when we find a host
uuid owning a lease, we can locate the host in engine:

$ cat /etc/vdsm/vdsm.id
324e9b61-49aa-4220-9e52-f7ee3c88c3f3

https://engine/ovirt-engine/api/hosts/?search=hw_id=324e9b61-49aa-4220-9e52-f7ee3c88c3f3

With the random uuid changing on every boot, we don't have a good way to find the host.
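
The same lookup can also be done directly against the REST API; a minimal sketch
with the Python requests library, assuming placeholder credentials and that the
JSON response lists hosts under a "host" key:

    import requests

    # Placeholder engine URL and credentials; hw_id is the example from above.
    resp = requests.get(
        "https://engine.example.com/ovirt-engine/api/hosts",
        params={"search": "hw_id=324e9b61-49aa-4220-9e52-f7ee3c88c3f3"},
        auth=("admin@internal", "secret"),
        verify="ca.pem",
        headers={"Accept": "application/json"},
    )
    resp.raise_for_status()
    for host in resp.json().get("host", []):
        print(host["id"], host["name"])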

Comment 30 Marina Kalinin 2020-01-03 20:21:18 UTC
(In reply to Nir Soffer from comment #29)
> I think a separate sanlock bug would be better, since we really need to
> configure
> sanlock host uuid properly. This can avoid delays when resuming from unclean
> shutdown, and will make it easier to debug issues in the cluster.
> 
> For example, if we use the uuid from /etc/vdsm/vdsm.id, when we find that
> host
> uuid owning a lease, we can locate the host in engine:
> 
> $ cat /etc/vdsm/vdsm.id
> 324e9b61-49aa-4220-9e52-f7ee3c88c3f3
> 
> https://engine/ovirt-engine/api/hosts/?search=hw_id=324e9b61-49aa-4220-9e52-
> f7ee3c88c3f3
> 
> With the random uuid changed on every boot, we don't have a good way to find
> the host.

Hi Nir,

What are the implications of getting this change in?
I do not think it should be high risk, since this lock is renewed regularly and a new uuid can be used when needed. So if it is easy to implement, we would like to see it in 4.4.z.

Comment 31 Marina Kalinin 2020-03-21 14:49:44 UTC
(In reply to David Teigland from comment #28)
> pushed to master
> https://pagure.io/sanlock/c/
> cff348800722f7dadf030ffe7494c2df714996e3?branch=master
> 
> If this is the only code change for this bug, then the bz should probably be
> reassigned to sanlock and a future RHEL8 release.

Hi David,

Sorry for not replying earlier, though I read this before.
Is this in RHEL8? Or waiting on some action from me?
I can file a bug, if needed, please let me know.

Comment 32 Nir Soffer 2020-03-21 15:43:07 UTC
(In reply to Marina Kalinin from comment #31)
> Sorry for not replying earlier. Though I read this before.
> Is this in RHEL8? Or waiting on some action from me?
> I can file a bug, if needed, please let me know.

See comment 29. The sanlock change does not fix the issue, which is vdsm not
configuring the sanlock uuid.

Comment 14 explains what should be done.

Comment 33 David Teigland 2020-03-23 14:21:31 UTC
There was one small sanlock change in comment 28 which logs the full hostname.  It hadn't been worth doing an update with just that change, but it will be in 8.3 along with other sanlock fixes (related to the client connection issues.)

Comment 34 Nir Soffer 2020-11-30 20:11:56 UTC
While testing recovery from unclean shutdown related to bug 1902468,
I configured sanlock host name to see how it affects the timing.

I measured the time to activate a host connected to 20 storage
domains after an unclean shutdown.

Without host name: 5m:32s
With host name: 1m:47s

For reference, after clean shutdown activation takes 1m:05s.

Here is the tested configuration:

$ cat /etc/sanlock/sanlock.conf
# Setting for vdsm
our_host_name = d8635601-ea8b-4c5c-a624-41bd72b862d6
max_worker_threads = 40

(max_worker_threads is not available in RHEL 8.3).

Comment 38 Evelina Shames 2020-12-28 12:58:58 UTC
Verified on rhv-4.4.4-6: 
> 'our_host_name' is specified in sanlock.log (on spm host) instead of a random uuid.

Nir, is there something else I need to check?

Comment 39 Nir Soffer 2021-01-03 17:02:31 UTC
(In reply to Evelina Shames from comment #38)
> Verified on rhv-4.4.4-6: 
> > 'our_host_name' is specified in sanlock.log (on spm host) instead of a random uuid.

Sanlock must be configured on all hosts, not only on the SPM.

> Nir, is there something else I need to check?

Here is how to verify this change:

## After adding a host to engine 4.4.4 or upgrading a host, sanlock is configured

/etc/sanlock/sanlock.conf contains the our_host_name option,
set to the host hardware id:

$ cat /etc/sanlock/sanlock.conf 
# Configuration for vdsm
our_host_name = 095bce26-9d5b-4ae2-b3bf-dddfc9cdf541
...

$ cat /etc/vdsm/vdsm.id 
095bce26-9d5b-4ae2-b3bf-dddfc9cdf541

Sanlock reports this value in sanlock log:

2021-01-03 11:40:54 9 [964]: sanlock daemon started 3.8.2 host 095bce26-9d5b-4ae2-b3bf-dddfc9cdf541 (host3)


## vdsm-tool is-configured succeeds

$ sudo vdsm-tool is-configured
Current revision of multipath.conf detected, preserving
sanlock is configured for vdsm
abrt is already configured for vdsm
Managed volume database is already configured
lvm is configured for vdsm
libvirt is already configured for vdsm


## If our_host_name does not match the host hardware id:

$ cat /etc/sanlock/sanlock.conf 
# Configuration for vdsm
our_host_name = invalid-host-id
max_worker_threads = 50

vdsm-tool is-configured will fail:

$ sudo vdsm-tool is-configured --module sanlock
sanlock config file needs options: {'our_host_name': '095bce26-9d5b-4ae2-b3bf-dddfc9cdf541'}
Modules sanlock are not configured
 Error:  

One of the modules is not configured to work with VDSM.
To configure the module use the following:
'vdsm-tool configure [--module module-name]'.

If all modules are not configured try to use:
'vdsm-tool configure --force'
(The force flag will stop the module's service and start it
afterwards automatically to load the new configuration.)

Changing the configuration back to the right value will
make vdsm-tool happy again.

## When sanlock is not configured, vdsmd service will fail to start

$ sudo systemctl stop vdsmd
$ sudo systemctl start vdsmd
Job for vdsmd.service failed because the control process exited with error code.
See "systemctl status vdsmd.service" and "journalctl -xe" for details.

Looking in journalctl we should see the error about configuring sanlock:

Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: vdsm: Running check_is_configured
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: Error:
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: One of the modules is not configured to work with VDSM.
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: To configure the module use the following:
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: 'vdsm-tool configure [--module module-name]'.
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: If all modules are not configured try to use:
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: 'vdsm-tool configure --force'
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: (The force flag will stop the module's service and start it
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: afterwards automatically to load the new configuration.)
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]:  
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: lvm is configured for vdsm
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: Managed volume database is already configured
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: Current revision of multipath.conf detected, preserving
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: abrt is already configured for vdsm
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: libvirt is already configured for vdsm
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: sanlock config file needs options: {'our_host_name': '095bce26-9d5b-4ae2-b>
Jan 03 18:54:25 host3 vdsmd_init_common.sh[32952]: Modules sanlock are not configured
Jan 03 18:54:26 host3 vdsmd_init_common.sh[32952]:  vdsm: stopped during execute check_is_configured task (task returned with>


## To configure a host manually

1. Move host to maintenance
2. Configure vdsm using vdsm-tool

$ sudo vdsm-tool configure --force --module sanlock

Checking configuration status...

sanlock config file needs options: {'our_host_name': '095bce26-9d5b-4ae2-b3bf-dddfc9cdf541'}

Running configure...
Configuring sanlock config file
Previous sanlock.conf copied to /etc/sanlock/sanlock.conf.20210103184638
Reconfiguration of sanlock is done.

Done configuring modules to VDSM.


## Configuring sanlock creates a backup of /etc/sanlock/sanlock.conf

$ ls -lh /etc/sanlock/sanlock.conf*
-rw-r--r--. 1 root root 102 Jan  3 18:46 /etc/sanlock/sanlock.conf
-rw-r--r--. 1 root root  81 Jan  3 18:46 /etc/sanlock/sanlock.conf.20210103184638

$ cat /etc/sanlock/sanlock.conf.20210103184638 
# Configuration for vdsm
our_host_name = invalid-host-id
max_worker_threads = 50


## sanlock.conf SELinux labels are kept during upgrade

$ ls -lhZ /etc/sanlock/sanlock.conf
-rw-r--r--. 1 root root system_u:object_r:sanlock_conf_t:s0 102 Jan  3 18:46 /etc/sanlock/sanlock.conf

Comment 40 Evelina Shames 2021-01-03 18:45:42 UTC
(In reply to Nir Soffer from comment #39)
> Here is how to verify this change:
> [...]
Verified on rhv-4.4.4-6 with the above steps on both SPM and HSM hosts.

Comment 41 Tzahi Ashkenazi 2021-01-05 08:55:13 UTC
Tested on:
         Engine : https://rhev-red-03.rdu2.scalelab.redhat.com
          40 Storage domains > iscsi connectivity 

         Hosts : 
           F02-h25-000-r620.rdu2.scalelab.redhat.com
       Version :
         vdsm-4.40.40-1.el8ev.x86_64
         Rhv-release-4.4.4-7-001.noarch
         sanlock-3.8.2-1.el8.x86_64
         max_worker_threads = 50
         +---------------------------------+-----------------+-----------+
         | F02-h25-000-r620 - vdsm-4.40.40 |                 | Duration  |          
         +---------------------------------+-----------------+-----------+
         | Scenario                        |                 |           |
         | Clean shutdown                  | deactivate host | 17.76 s   |
         |                                 | activate host   | 139 s     |
         | unclean shutdown                | activate host   | 140 s     |
         +---------------------------------+-----------------+-----------+


        Host: 
          F01-h08-000-1029u.rdu2.scalelab.redhat.com
          Version : 
            vdsm-4.40.37-1.el8ev.x86_64
            Rhv-release-4.4.4-2-001.noarch
            sanlock-3.8.2-1.el8.x86_64
         +----------------------------------+-----------------+-----------+
         | F01-h08-000-1029u - vdsm-4.40.37 |                 | Duration  |            
         +----------------------------------+-----------------+-----------+
         | Scenario                         |                 |           |
         | Clean shutdown                   | deactivate host | 19.51 s   |
         |                                  | activate host   | 136 s     |
         | unclean shutdown                 | activate host   | 975 s     |
         +----------------------------------+-----------------+-----------+

   P.S.
   Once we get the official sanlock build we will be able to test it.

Comment 45 errata-xmlrpc 2021-02-02 13:59:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) 4.4.z [ovirt-4.4.4]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0382

