Bug 1912923

Summary: Support 50 concurrent add_lockspace operations [rhel-8.3.0.z]
Product: Red Hat Enterprise Linux 8 Reporter: RHEL Program Management Team <pgm-rhel-tools>
Component: sanlock Assignee: David Teigland <teigland>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: high Docs Contact:
Priority: high    
Version: 8.3 CC: aefrat, agk, cluster-maint, cmarthal, jbrassow, mtessun, nsoffer, rhandlin, tashkena, teigland, vjuranek
Target Milestone: rc Keywords: Triaged, ZStream
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: sanlock-3.8.2-3.el8_3 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1902468 Environment:
Last Closed: 2021-04-06 14:15:39 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1902468    
Bug Blocks: 1903358    

Comment 4 David Teigland 2021-01-06 19:20:49 UTC
built here https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=34105561

Comment 7 Tzahi Ashkenazi 2021-01-11 14:30:10 UTC
Comparison of sanlock 3.8.2-1 and sanlock 3.8.2-2 on the following setup:

* 40 storage domains, iSCSI connectivity
Versions:
 vdsm-4.40.40-1.el8ev.x86_64
 Rhv-release-4.4.4-7-001.noarch
 Red Hat Enterprise Linux release 8.3 (Ootpa)

Tested configuration:

$ cat /etc/sanlock/sanlock.conf
# Configuration for vdsm
our_host_name = 4c4c4544-0048-4e10-804e-b3c04f515731
max_worker_threads = 50

# sanlock client status -D
daemon 4c4c4544-0048-4e10-804e-b3c04f515731
    our_host_name=4c4c4544-0048-4e10-804e-b3c04f515731
    use_watchdog=1
    high_priority=0
    mlock_level=1
    quiet_fail=1
    debug_renew=0
    debug_clients=0
    debug_cmds=0xfffffffffe06ffff
    renewal_history_size=180
    gid=179
    uid=179
    sh_retries=8
    max_sectors_kb_ignore=0
    max_sectors_kb_align=0
    max_sectors_kb_num=1024
    max_worker_threads=50
    write_init_io_timeout=60
    use_aio=1
    kill_grace_seconds=40
    helper_pid=1363
    helper_kill_fd=7
    helper_full_count=0
    helper_last_status=78116
    monotime=78140
    version_str=3.8.2
    version_num=3.8.2
    version_hex=03080200
    smproto_hex=00000001
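
To confirm that the running daemon picked up the values from sanlock.conf, the two outputs above can be compared programmatically. This is a minimal Python sketch, assuming only the "key = value" layout of the config file and the "key=value" lines of "sanlock client status -D" shown above. The helper names parse_key_values and config_mismatches are hypothetical and not part of sanlock.

```python
def parse_key_values(text):
    """Parse 'key = value' or 'key=value' lines, skipping comments and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines without '=' (for example the leading 'daemon <name>' line) are ignored.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def config_mismatches(conf_text, status_text):
    """Return {key: (configured, running)} for configured settings that differ."""
    conf = parse_key_values(conf_text)
    running = parse_key_values(status_text)
    return {k: (v, running.get(k)) for k, v in conf.items() if running.get(k) != v}


if __name__ == "__main__":
    # Shortened copies of the outputs quoted in this comment.
    conf_text = (
        "# Configuration for vdsm\n"
        "our_host_name = 4c4c4544-0048-4e10-804e-b3c04f515731\n"
        "max_worker_threads = 50\n"
    )
    status_text = (
        "daemon 4c4c4544-0048-4e10-804e-b3c04f515731\n"
        "    our_host_name=4c4c4544-0048-4e10-804e-b3c04f515731\n"
        "    max_worker_threads=50\n"
    )
    print(config_mismatches(conf_text, status_text) or "daemon matches sanlock.conf")
```

An empty result means every setting in the config file is reflected in the daemon status, as is the case for max_worker_threads=50 here.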


Test procedure :
================

Clean shutdown:
 A. Deactivate host:
   1. Deactivate the host in the DC (move the host to maintenance from the UI > host view).
   2. Start measuring the time and wait until all lockspaces are removed, monitoring with "watch sanlock client status".
 B. Activate host:
   1. Activate the host in the DC (move the host to active from the UI > host view).
   2. Start measuring the time and wait until all lockspaces are added, monitoring with "watch sanlock client status".

Unclean shutdown:
  1. While the host is active, perform a cold power-off (using iDRAC/IPMI).
  2. Power the server on again using iDRAC/IPMI.
  3. Start measuring the time from the moment the SSH service is available again, and wait until all lockspaces are added, monitoring with "watch sanlock client status".

Notes :

When activating a host, you will see all lockspaces with the ADD marker. Wait until
none of the lockspaces has the ADD marker.

When deactivating a host, you will see all lockspaces with the REM marker. Wait until
the lockspaces are removed.
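
The manual "watch" and stopwatch steps above could be automated along these lines. This is only a sketch and not the method used for the measurements below. It assumes that lockspace lines in "sanlock client status" start with "s " and that the ADD/REM marker appears as a separate word on the line; the function names are hypothetical.

```python
import subprocess
import time


def lockspace_lines(status_text):
    """Return the lines describing lockspaces (assumed to start with 's ')."""
    stripped = (line.strip() for line in status_text.splitlines())
    return [line for line in stripped if line.startswith("s ")]


def pending(status_text, marker):
    """Count lockspace lines that still carry the given marker (ADD or REM)."""
    return sum(1 for line in lockspace_lines(status_text) if marker in line.split())


def read_status():
    """Run the same command that the procedure watches and return its output."""
    result = subprocess.run(
        ["sanlock", "client", "status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_until(done, read=read_status, interval=0.5, timeout=600.0):
    """Poll until done(status_text) is true and return the elapsed seconds."""
    start = time.monotonic()
    while True:
        if done(read()):
            return time.monotonic() - start
        if time.monotonic() - start > timeout:
            raise TimeoutError("lockspaces did not reach the expected state")
        time.sleep(interval)


if __name__ == "__main__":
    expected = 40  # number of storage domains in this test setup
    # Activation: all lockspaces present and none still marked ADD.
    elapsed = wait_until(
        lambda s: len(lockspace_lines(s)) >= expected and pending(s, "ADD") == 0
    )
    print(f"all lockspaces added after {elapsed:.2f} s")
```

For deactivation, the condition would instead be that no lockspace lines remain, for example wait_until(lambda s: not lockspace_lines(s)).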

Results :
Comparison of sanlock 3.8.2-1 and sanlock 3.8.2-2:

+----------------------------------+-----------------+-----------------+-----------------+
| F02-h25-000-r620 - VDSM-4.40.40  |                 |             Duration              |
+----------------------------------+-----------------+-----------------+-----------------+
| Scenario                         | Operation       | sanlock 3.8.2-1 | sanlock 3.8.2-2 |
+----------------------------------+-----------------+-----------------+-----------------+
| Clean shutdown                   | deactivate host | 17.76 s         | 6.56 s          |
|                                  | activate host   | 139 s           | 56.74 s         |
| Unclean shutdown                 | activate host   | 140 s           | 70.72 s         |
+----------------------------------+-----------------+-----------------+-----------------+


The results above show a large improvement in duration from sanlock 3.8.2-1 to 3.8.2-2: every scenario completed roughly 2x to 2.7x faster.
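
To quantify the improvement, the durations from the table can be turned into speedup factors and percentage reductions. A short Python sketch using only the measured values above; the function names are illustrative.

```python
# Durations in seconds from the table: (sanlock 3.8.2-1, sanlock 3.8.2-2).
durations = {
    ("clean shutdown", "deactivate host"): (17.76, 6.56),
    ("clean shutdown", "activate host"): (139.0, 56.74),
    ("unclean shutdown", "activate host"): (140.0, 70.72),
}


def speedup(before, after):
    """How many times faster the new version completed the operation."""
    return before / after


def reduction_percent(before, after):
    """Percentage of the original duration that was saved."""
    return 100.0 * (before - after) / before


for (scenario, operation), (old, new) in durations.items():
    print(
        f"{scenario:17} {operation:16} "
        f"{speedup(old, new):.2f}x faster, "
        f"{reduction_percent(old, new):.1f}% less time"
    )
```

Rounded to two decimals, the speedups are 2.71x for deactivation, 2.45x for activation after a clean shutdown, and 1.98x for activation after an unclean shutdown.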

Comment 8 Nir Soffer 2021-01-12 10:51:39 UTC
Corey, based on comment 7 I think we can mark the bug as verified.

Comment 9 Corey Marthaler 2021-01-12 14:10:19 UTC
Indeed. Moving to Verified based on results in comment #7.

Thanks.

Comment 10 David Teigland 2021-01-12 17:19:34 UTC
New rebuild of the same thing for a new build target: "rhpkg build --target rhel-8.3.0-z-candidate".
Hopefully the errata tool likes this one.

Comment 11 David Teigland 2021-01-12 23:03:47 UTC
Fixed the RPM name in the Fixed In Version field to sanlock-3.8.2-3.el8_3, as shown in
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1456071
The errata tool accepted that one.

Comment 23 errata-xmlrpc 2021-04-06 14:15:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sanlock bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1090