Bug 854151 - 3.1 - vdsm: iscsi connections are not disconnected when host is put in maintenance and backend is sending disconnectStorageServer
Summary: 3.1 - vdsm: iscsi connections are not disconnected when host is put in maintenance and backend is sending disconnectStorageServer
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: beta
Assignee: Eduardo Warszawski
QA Contact: Dafna Ron
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2012-09-04 08:19 UTC by Dafna Ron
Modified: 2022-07-09 05:39 UTC
CC List: 8 users

Fixed In Version: vdsm-4.9.6-38.0
Doc Type: Bug Fix
Doc Text:
Previously, when a host was put into maintenance mode, the back end would send disconnectStorageServer, but VDSM failed to actually disconnect the iSCSI connections. If there was a problem with the target, the error persisted even after the host was reactivated, because the disconnection had never occurred. Now, the disconnectStorageServer call works correctly, allowing storage issues to be repaired as necessary and the host to be reactivated without errors.
Clone Of:
Environment:
Last Closed: 2012-12-04 19:09:33 UTC
Target Upstream Version:
Embargoed:


Attachments
logs (1.23 MB, application/x-gzip), uploaded 2012-09-04 08:19 UTC by Dafna Ron


Links
Red Hat Product Errata RHSA-2012:1508 (normal, SHIPPED_LIVE): Important: rhev-3.1.0 vdsm security, bug fix, and enhancement update (last updated 2012-12-04 23:48:05 UTC)

Description Dafna Ron 2012-09-04 08:19:42 UTC
Created attachment 609580 [details]
logs

Description of problem:

When we put a host in maintenance, the backend sends disconnectStorageServer, but vdsm does not disconnect the connections. 
If the system is functioning correctly we would not notice that, but when there is a problem with the targets and the connections are not disconnected, we keep detecting a problem in the storage even after the issue is fixed (devices remain unknown). 

Version-Release number of selected component (if applicable):

si16

How reproducible:

100%

Steps to Reproduce:
1. run iscsiadm -m session on the host
2. put host in maintenance
3. run iscsiadm -m session again
4. check the vdsm log for disconnectStorageServer (a small session-check sketch follows below)
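
A small helper for comparing the output of steps 1 and 3 (an illustrative sketch only, not part of vdsm; it simply wraps iscsiadm with subprocess, and iscsiadm exits non-zero when there are no active sessions):

#!/usr/bin/env python
# Illustrative check only, not part of vdsm: snapshot the active iSCSI
# sessions so the output before and after maintenance can be compared.
import subprocess

def list_sessions():
    # iscsiadm exits non-zero when there are no active sessions.
    try:
        out = subprocess.check_output(["iscsiadm", "-m", "session"])
    except subprocess.CalledProcessError:
        return []
    return out.decode("utf-8", "replace").splitlines()

if __name__ == "__main__":
    sessions = list_sessions()
    if sessions:
        print("Active iSCSI sessions (expected: none after maintenance):")
        for line in sessions:
            print("  " + line)
    else:
        print("No active iSCSI sessions.")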
  
Actual results:

Although disconnectStorageServer was sent, vdsm does not disconnect the connections. As a result, if there was a problem in the storage (in my case I removed my hosts from the access list), vdsm will keep detecting a problem because it is reading from cache.
Expected results:

We should disconnect the iSCSI connections when putting a host in maintenance. 
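
In other words, disconnecting on maintenance should amount to logging out of each target the host logged in to. A minimal sketch of that idea (illustrative only; this is not vdsm's actual implementation, and the example target/portal values are taken from this report):

import subprocess

def logout_target(iqn, portal, port="3260"):
    # Equivalent to: iscsiadm -m node -T <iqn> -p <portal>:<port> -u
    subprocess.check_call([
        "iscsiadm", "-m", "node",
        "-T", iqn,
        "-p", "%s:%s" % (portal, port),
        "-u",
    ])

# Example using one of the targets from this report:
# logout_target("dafna-si16-01", "10.35.64.10")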

Additional info: logs

In the attached logs, I had a problem with the storage, and even after a vdsm restart I still cannot detect all the devices. 

When I start the host, we see right away that one of the domains cannot be detected: 

Thread-21::ERROR::2012-09-04 11:04:15,795::sp::1566::Storage.StoragePool::(updateMonitoringThreads) Unexpected error while trying to monitor domain `0dc1433f-72e6-4b62-9845-dc022a191f4f`
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 1561, in updateMonitoringThreads
    .startMonitoring(sdCache.produce(sdUUID), self.id)
  File "/usr/share/vdsm/storage/sdc.py", line 110, in produce
    dom.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 123, in _realProduce
    dom = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 147, in _findDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'0dc1433f-72e6-4b62-9845-dc022a191f4f',)
Thread-22::INFO::2012-09-04 11:04:15,796::safelease::160::SANLock::(acquireHostId) Acquiring host id for domain 6a1f9f02-2f53-4d67-8f6a-567c0d01777c (id: 1)


I put the host in maintenance again and I can see that disconnectStorageServer is sent: 

Thread-212::INFO::2012-09-04 11:06:01,165::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=3, spUUID='f570527f-004a-4cab-8bee-129fa589bec5', conList=[{'connection': '10.35.64.10', 'iqn': 'dafna-si16-02', 'portal': '1', 'user': '', 'password': '******', 'id': 'bc1efcff-3f14-4502-9728-89fbc29e0100', 'port': '3260'}, {'connection': '10.35.64.11', 'iqn': 'iqn.1992-08.com.netapp:sn.151709391', 'portal': '2000', 'user': '', 'password': '******', 'id': 'c5a034c9-db6e-4741-ac6e-b84495e7cc37', 'port': '3260'}, {'connection': '10.35.160.7', 'iqn': 'iqn.2008-05.com.xtremio:001b21b545c0', 'portal': '1', 'user': '', 'password': '******', 'id': 'ff6680e9-2550-4a30-b947-1648617c62cc', 'port': '3260'}, {'connection': '10.35.64.10', 'iqn': 'dafna-si16-03', 'portal': '1', 'user': '', 'password': '******', 'id': '6a41fd87-0726-43a3-9306-32fb3e241ed5', 'port': '3260'}, {'connection': '10.35.64.10', 'iqn': 'dafna-si16-01', 'portal': '1', 'user': '', 'password': '******', 'id': 'bed3f09f-d7d4-4f8f-bcd6-e7f55df5be31', 'port': '3260'}], options=None)
Thread-212::DEBUG::2012-09-04 11:06:01,176::misc::1080::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage)
Thread-212::DEBUG::2012-09-04 11:06:01,176::misc::1082::SamplingMethod::(__call__) Got in to sampling method
Thread-212::DEBUG::2012-09-04 11:06:01,177::misc::1080::SamplingMethod::(__call__) Trying to enter sampling method (storage.iscsi.rescan)
Thread-212::DEBUG::2012-09-04 11:06:01,177::misc::1082::SamplingMethod::(__call__) Got in to sampling method
Thread-212::DEBUG::2012-09-04 11:06:01,177::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/iscsiadm -m session -R' (cwd None)
Thread-212::DEBUG::2012-09-04 11:06:01,370::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
Thread-212::DEBUG::2012-09-04 11:06:01,371::misc::1090::SamplingMethod::(__call__) Returning last result
Thread-212::DEBUG::2012-09-04 11:06:01,378::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/multipath' (cwd None)
Thread-212::DEBUG::2012-09-04 11:06:01,816::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
Thread-212::DEBUG::2012-09-04 11:06:01,817::lvm::460::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-212::DEBUG::2012-09-04 11:06:01,817::lvm::462::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-212::DEBUG::2012-09-04 11:06:01,818::lvm::472::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-212::DEBUG::2012-09-04 11:06:01,818::lvm::474::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-212::DEBUG::2012-09-04 11:06:01,819::lvm::493::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-212::DEBUG::2012-09-04 11:06:01,819::lvm::495::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-212::DEBUG::2012-09-04 11:06:01,819::misc::1090::SamplingMethod::(__call__) Returning last result
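
Note that each conList entry in the call above already carries the values needed to identify the session ('connection', 'port', 'iqn'). Purely as an illustration (not vdsm code), the sketch below maps two of those logged entries to the iscsiadm logout commands one would expect the disconnect to perform:

# Illustrative only: the entries below are copied from the
# disconnectStorageServer log line above.
con_list = [
    {"connection": "10.35.64.10", "iqn": "dafna-si16-02", "port": "3260"},
    {"connection": "10.35.64.11", "iqn": "iqn.1992-08.com.netapp:sn.151709391", "port": "3260"},
]

for con in con_list:
    portal = "%s:%s" % (con["connection"], con["port"])
    print("iscsiadm -m node -T %s -p %s -u" % (con["iqn"], portal))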


But running iscsiadm -m session shows that the connections are still alive: 

[root@localhost ~]# iscsiadm -m session
tcp: [10] 10.35.64.10:3260,1 dafna-si16-01
tcp: [11] 10.35.64.10:3260,1 dafna-si16-02
tcp: [12] 10.35.64.10:3260,1 dafna-si16-03
tcp: [8] 10.35.64.11:3260,2000 iqn.1992-08.com.netapp:sn.151709391
tcp: [9] 10.35.160.7:3260,1 iqn.2008-05.com.xtremio:001b21b545c0
[root@localhost ~]# date
Tue Sep  4 11:07:17 IDT 2012
[root@localhost ~]# 


Activating the host again shows that it still cannot detect all the domains:

StorageDomainDoesNotExist: Storage domain does not exist: (u'0dc1433f-72e6-4b62-9845-dc022a191f4f',)

After I manually disconnect the iSCSI connections and activate the host, it detects the domain and everything is fine. 


[root@localhost ~]# iscsiadm -m session
tcp: [10] 10.35.64.10:3260,1 dafna-si16-01
tcp: [11] 10.35.64.10:3260,1 dafna-si16-02
tcp: [12] 10.35.64.10:3260,1 dafna-si16-03
tcp: [8] 10.35.64.11:3260,2000 iqn.1992-08.com.netapp:sn.151709391
tcp: [9] 10.35.160.7:3260,1 iqn.2008-05.com.xtremio:001b21b545c0
[root@localhost ~]# iscsiadm -m node -u
Logging out of session [sid: 10, target: dafna-si16-01, portal: 10.35.64.10,3260]
Logging out of session [sid: 11, target: dafna-si16-02, portal: 10.35.64.10,3260]
Logging out of session [sid: 12, target: dafna-si16-03, portal: 10.35.64.10,3260]
Logging out of session [sid: 8, target: iqn.1992-08.com.netapp:sn.151709391, portal: 10.35.64.11,3260]
Logging out of session [sid: 9, target: iqn.2008-05.com.xtremio:001b21b545c0, portal: 10.35.160.7,3260]
Logout of [sid: 10, target: dafna-si16-01, portal: 10.35.64.10,3260] successful.
Logout of [sid: 11, target: dafna-si16-02, portal: 10.35.64.10,3260] successful.
Logout of [sid: 12, target: dafna-si16-03, portal: 10.35.64.10,3260] successful.
Logout of [sid: 8, target: iqn.1992-08.com.netapp:sn.151709391, portal: 10.35.64.11,3260] successful.
Logout of [sid: 9, target: iqn.2008-05.com.xtremio:001b21b545c0, portal: 10.35.160.7,3260] successful.
[root@localhost ~]# iscsiadm -m session
iscsiadm: No active sessions.
[root@localhost ~]# date
Tue Sep  4 11:10:06 IDT 2012


As you can see, vgs now returns all the devices: 

[root@localhost ~]# vgs -o +pv_name
  VG                                   #PV #LV #SN Attr   VSize   VFree   PV                                           
  0dc1433f-72e6-4b62-9845-dc022a191f4f   7  59   0 wz--n- 352.38g 241.50g /dev/mapper/1Dafna-si16-011346574            
  0dc1433f-72e6-4b62-9845-dc022a191f4f   7  59   0 wz--n- 352.38g 241.50g /dev/mapper/3514f0c56958002de                
  0dc1433f-72e6-4b62-9845-dc022a191f4f   7  59   0 wz--n- 352.38g 241.50g /dev/mapper/3514f0c5695800328                
  0dc1433f-72e6-4b62-9845-dc022a191f4f   7  59   0 wz--n- 352.38g 241.50g /dev/mapper/3514f0c5695800329                
  0dc1433f-72e6-4b62-9845-dc022a191f4f   7  59   0 wz--n- 352.38g 241.50g /dev/mapper/3514f0c569580032a                
  0dc1433f-72e6-4b62-9845-dc022a191f4f   7  59   0 wz--n- 352.38g 241.50g /dev/mapper/3514f0c569580032b                
  0dc1433f-72e6-4b62-9845-dc022a191f4f   7  59   0 wz--n- 352.38g 241.50g /dev/mapper/3514f0c569580032c                
  6a1f9f02-2f53-4d67-8f6a-567c0d01777c   7   6   0 wz--n- 347.38g 343.50g /dev/mapper/360a98000572d45366b4a6d4156565377
  6a1f9f02-2f53-4d67-8f6a-567c0d01777c   7   6   0 wz--n- 347.38g 343.50g /dev/mapper/3514f0c56958002ec                
  6a1f9f02-2f53-4d67-8f6a-567c0d01777c   7   6   0 wz--n- 347.38g 343.50g /dev/mapper/3514f0c56958002eb                
  6a1f9f02-2f53-4d67-8f6a-567c0d01777c   7   6   0 wz--n- 347.38g 343.50g /dev/mapper/3514f0c56958002e9                
  6a1f9f02-2f53-4d67-8f6a-567c0d01777c   7   6   0 wz--n- 347.38g 343.50g /dev/mapper/3514f0c56958002ed                
  6a1f9f02-2f53-4d67-8f6a-567c0d01777c   7   6   0 wz--n- 347.38g 343.50g /dev/mapper/3514f0c5695800320                
  6a1f9f02-2f53-4d67-8f6a-567c0d01777c   7   6   0 wz--n- 347.38g 343.50g /dev/mapper/3514f0c5695800322                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/1Dafna-si16-021346574            
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c56958002e3                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c56958002e6                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c56958002e4                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c56958002e5                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c5695800326                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c5695800327                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c5695800331                
  7d1dd321-16a3-4557-99c2-7d524b60f33f   9   6   0 wz--n- 416.62g 412.75g /dev/mapper/3514f0c5695800333                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c56958002ee                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c56958002ef                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c56958002f3                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c56958002f2                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c56958002f0                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c56958002f1                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c569580031d                
  b1f82e00-4647-4df0-8190-6ba4e811f5a5   8  23   0 wz--n- 397.00g 350.75g /dev/mapper/3514f0c569580031e                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/1Dafna-si16-031346574            
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c56958002e2                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c56958002e7                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c56958002e8                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c56958002ea                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c5695800323                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c5695800324                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c569580032e                
  bc7fde7a-4d43-4dd4-874a-bff5ca517bae   9   7   0 wz--n- 416.62g 410.75g /dev/mapper/3514f0c569580032f                
  vg0                                    1   3   0 wz--n- 136.24g      0  /dev/sda2                                    
[root@localhost ~]# date
Tue Sep  4 11:10:49 IDT 2012

Comment 3 Eduardo Warszawski 2012-09-30 07:49:45 UTC
http://gerrit.ovirt.org/#/c/8275/

Comment 5 Dafna Ron 2012-10-21 13:33:42 UTC
verified on si21.1

Comment 8 errata-xmlrpc 2012-12-04 19:09:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1508.html

