Bug 1063871 - [TestOnly][LOG] host lvm IO error after switching datacenter
Summary: [TestOnly][LOG] host lvm IO error after switching datacenter
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ovirt-4.0.6
Target Release: 4.18.15
Assignee: Nir Soffer
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On: deactivate_lv_on_domain_deactivation 1331978
Blocks:
 
Reported: 2014-02-11 15:09 UTC by Petr Beňas
Modified: 2017-01-18 07:24 UTC (History)
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-18 07:24:47 UTC
oVirt Team: Storage
Embargoed:
ykaul: ovirt-4.0.z?
rule-engine: planning_ack?
rule-engine: devel_ack+
rule-engine: testing_ack+



Description Petr Beňas 2014-02-11 15:09:56 UTC
Description of problem:
After reassigning a host from one datacenter to another, its LVM state is messed up.

Version-Release number of selected component (if applicable):
vdsm-4.13.2-0.8.el6ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Add host to the Default datacenter using NFS storage and run a VM on it. 
2. Create new datacenter using iSCSI. 
3. Stop the VM, put the host into maintenance and reassign it to the other datacenter.
4. Run another VM (from the new datacenter) on the host. 
5. SSH to the host and run lvs.

Actual results:
# lvs
  /dev/mapper/1IET_00010001: read failed after 0 of 4096 at 21474770944: Input/output error
  /dev/mapper/1IET_00010001: read failed after 0 of 4096 at 21474828288: Input/output error
  /dev/mapper/1IET_00010001: read failed after 0 of 4096 at 0: Input/output error
  /dev/mapper/1IET_00010001: read failed after 0 of 4096 at 4096: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/metadata: read failed after 0 of 4096 at 536805376: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/metadata: read failed after 0 of 4096 at 536862720: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/metadata: read failed after 0 of 4096 at 0: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/metadata: read failed after 0 of 4096 at 4096: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/leases: read failed after 0 of 4096 at 2147418112: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/leases: read failed after 0 of 4096 at 2147475456: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/leases: read failed after 0 of 4096 at 0: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/leases: read failed after 0 of 4096 at 4096: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/ids: read failed after 0 of 4096 at 134152192: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/ids: read failed after 0 of 4096 at 134209536: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/ids: read failed after 0 of 4096 at 0: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/ids: read failed after 0 of 4096 at 4096: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/inbox: read failed after 0 of 4096 at 134152192: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/inbox: read failed after 0 of 4096 at 134209536: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/inbox: read failed after 0 of 4096 at 0: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/inbox: read failed after 0 of 4096 at 4096: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/outbox: read failed after 0 of 4096 at 134152192: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/outbox: read failed after 0 of 4096 at 134209536: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/outbox: read failed after 0 of 4096 at 0: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/outbox: read failed after 0 of 4096 at 4096: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/master: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/master: read failed after 0 of 4096 at 1073733632: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/master: read failed after 0 of 4096 at 0: Input/output error
  /dev/4ae49b1b-550b-4a17-b545-dcbfaef98303/master: read failed after 0 of 4096 at 4096: Input/output error
  LV                                   VG                                   Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  b3274055-9e41-488f-954a-625a3389a27a 4ae49b1b-550b-4a17-b545-dcbfaef98303 -wi-------   6.00g                                             
  ids                                  4ae49b1b-550b-4a17-b545-dcbfaef98303 -wi-a----- 128.00m                                             
  inbox                                4ae49b1b-550b-4a17-b545-dcbfaef98303 -wi-a----- 128.00m                                             
  leases                               4ae49b1b-550b-4a17-b545-dcbfaef98303 -wi-a-----   2.00g                                             
  master                               4ae49b1b-550b-4a17-b545-dcbfaef98303 -wi-a-----   1.00g                                             
  metadata                             4ae49b1b-550b-4a17-b545-dcbfaef98303 -wi-a----- 512.00m                                             
  outbox                               4ae49b1b-550b-4a17-b545-dcbfaef98303 -wi-a----- 128.00m                                             
  lv_home                              vg_slot7                             -wi-ao----  20.00g                                             
  lv_root                              vg_slot7                             -wi-ao----  50.00g                                             
  lv_shit                              vg_slot7                             -wi-ao----  10.00g                                             
  lv_swap                              vg_slot7                             -wi-ao----   9.82g                                  

Expected results:
Proper clean-up when the host is removed from the datacenter.

Additional info:
Not sure whether one datacenter has to use NFS and the other iSCSI in order to reproduce this. It might also happen with two separate storage instances of the same type.

Comment 1 Ayal Baron 2014-02-11 18:43:55 UTC
Nir, is this resolved with http://gerrit.ovirt.org/#/c/24088/ ?

Petr, a possible source of problems is that your tgtd server is not configured to create LUNs with unique IDs.
You need to edit your targets.conf and add the scsi_id and scsi_sn fields.

Example:
<target MasterBackup>
    allow-in-use yes
    <backing-store /dev/vg0/MasterBackup>
        lun 1
        scsi_id MasterBackup
        scsi_sn 444444444401
    </backing-store>
</target>
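
To double-check from the host (initiator) side that each LUN really exposes a unique serial, scsi_id can be queried. This is only an illustrative sketch: the tool's path differs by release (/sbin/scsi_id on EL6, /usr/lib/udev/scsi_id on EL7), and /dev/sdb is a placeholder for one of the iSCSI disks.

# Print the SCSI identifier udev derives for the LUN; repeat for each iSCSI disk.
/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/sdb

# Every LUN exported by tgtd should report a different identifier here;
# identical identifiers will make multipath collapse them into a single device.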

Comment 2 Nir Soffer 2014-02-11 21:33:16 UTC
(In reply to Ayal Baron from comment #1)
> Nir, is this resolved with http://gerrit.ovirt.org/#/c/24088/ ?

I don't see any connection.

This is what happens when you move a host to maintenance: we disconnect from storage but leave junk devices behind.
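
(Illustration only, not vdsm code: the leftover devices can be inspected on the host with standard tools after maintenance; the VG name below is the one from the lvs output above.)

# Device-mapper entries still present after the disconnect:
dmsetup ls

# Multipath maps whose underlying iSCSI paths are gone show up as faulty:
multipath -ll

# LVs still marked active on the stale VG (the 5th lv_attr character is 'a'):
lvs -o vg_name,lv_name,lv_attr 4ae49b1b-550b-4a17-b545-dcbfaef98303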

Comment 3 Ayal Baron 2014-02-11 22:09:21 UTC
(In reply to Nir Soffer from comment #2)
> (In reply to Ayal Baron from comment #1)
> > Nir, is this resolved with http://gerrit.ovirt.org/#/c/24088/ ?
> 
> I don't see any connection.
> 
> This is what happens when you move a host to maintenance: we disconnect from
> storage but leave junk devices behind.

ok, so we need to clean it up...

Comment 4 Nir Soffer 2014-02-14 22:10:30 UTC
These are just warnings from lvm commands, and they have no effect on the functionality of the system. No reason for high priority.

Comment 5 Federico Simoncelli 2014-05-05 08:32:36 UTC
(In reply to Ayal Baron from comment #3)
> (In reply to Nir Soffer from comment #2)
> > (In reply to Ayal Baron from comment #1)
> > > Nir, is this resolved with http://gerrit.ovirt.org/#/c/24088/ ?
> > 
> > I don't see any connection.
> > 
> > This is what happens when you move a host to maintenance: we disconnect from
> > storage but leave junk devices behind.
> 
> ok, so we need to clean it up...

The solution is to deactivate all the LVs of the VG on the iSCSI connection that we are about to disconnect.
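
A rough manual sketch of that idea (placeholders only, not the actual vdsm change): deactivate the domain's VG before tearing down the session, so no active LVs are left pointing at a device that is about to disappear.

# Deactivate every LV in the storage domain VG (name taken from the lvs output above):
vgchange -an 4ae49b1b-550b-4a17-b545-dcbfaef98303

# Log out of the iSCSI session (target IQN and portal are placeholders):
iscsiadm -m node -T iqn.2014-02.example:target1 -p 10.0.0.1 -u

# Flush the now-unused multipath map (map name taken from the output above):
multipath -f 1IET_00010001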

Comment 7 Sandro Bonazzola 2015-10-26 12:30:08 UTC
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted for next week, Nov 4th 2015.
Please review this bug and, if it is not a blocker, please postpone it to a later release.
All bugs not postponed by the GA release will be automatically re-targeted to:

- 3.6.1 if severity >= high
- 4.0 if severity < high

Comment 8 Yaniv Kaul 2016-03-10 10:39:33 UTC
Closing old medium/low severity tickets. If you believe this should be re-opened, please do so and add a justification.

(Also, this will probably be solved when the dependent RFEs are implemented.)

Comment 9 Nir Soffer 2016-05-01 12:35:45 UTC
Fixed in bug 1331978

Comment 10 Allon Mureinik 2016-05-05 12:47:01 UTC
(In reply to Nir Soffer from comment #9)
> Fixed in bug 1331978
Reopening based on that comment, so QE can verify this scenario when bug 1331978 is fixed.

Comment 11 Yaniv Lavi 2016-05-09 10:57:53 UTC
oVirt 4.0 Alpha has been released, moving to oVirt 4.0 Beta target.

Comment 12 Yaniv Lavi 2016-05-23 13:13:18 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 13 Allon Mureinik 2016-08-29 11:21:40 UTC
There is no engineering item here - moving to ON_QA after talking to Aharon, but note that bug 1331978 needs to be fixed in order to verify this.

Comment 14 Kevin Alon Goldblatt 2016-11-21 12:38:25 UTC
Tested with the following code:
----------------------------------------
vdsm-4.18.999-759.git435a852.el7.centos.x86_64
rhevm-4.0.6-0.1.el7ev.noarch


Tested with the following scenario:
----------------------------------------
Steps to Reproduce:
1. Add host to the Default datacenter using NFS storage and run a VM on it. 
2. Create new datacenter using iSCSI. 
3. Stop the VM, put the host into maintenance and reassign it to the other datacenter.
4. Run another VM (from the new datacenter) on the host. 
5. SSH to the host and run lvs.



VM RUNS FINE AND NO ERRORS ARE REPORTED BY LVS ON THE HOST!

MOVING TO VERIFIED!

