Description of problem:

What is AVT in the case of the Engenio storage array type?
AVT is a feature of the controller firmware that helps to manage each volume in a storage array. When AVT is used with a multi-path driver, it helps to make sure that an I/O data path is always available for the volumes in the storage array.

Procedure to duplicate the issue:
1) It is a 1x1 setup with 1 host connected to 1 array through a Fabric.
2) Install Veritas Storage Foundation version 4.1.
3) Apply all the DMP settings on the array, which are as below:
   set controller [a] HostNVSRAMByte[6,0x24]=0x01,0x01; // enable AVT
   set controller [a] HostNVSRAMByte[6,0x27]=0x18,0x18; // AVT_EXCLUSION_EXTENT
   set controller [b] HostNVSRAMByte[6,0x24]=0x01,0x01; // enable AVT
   set controller [b] HostNVSRAMByte[6,0x27]=0x18,0x18; // AVT_EXCLUSION_EXTENT
4) Map 32 LUNs to the host from the array.
5) Reboot the host.
6) The host takes more than 40 minutes to boot back up.
7) The long boot occurs because lvm issues READs to all the discovered SCSI devices at sector/LBA 2088832 (0x1fdf80), which triggers AVT. When AVT is triggered, it moves the LUN to the path on which the READ arrived. For instance:
   a) The host has a dual-channel (QLogic/Emulex) Fibre HBA.
   b) The first port of the Fibre HBA is connected to ControllerA and the second port is connected to ControllerB through the Fabric.
   c) ControllerA owns LUN0/sdb (preferred pathA) and ControllerB owns LUN1/sdc (preferred pathB). The host has an internal SCSI hard disk, so sda is assigned to it.
   d) During boot, lvm issues READs to sdb and sdc. Since the READ to sdc goes out on pathA, it triggers AVT, and AVT moves LUN1 to pathA in order to return a good status to the initiator.
   e) The same happens when lvm issues a READ to sdb on pathB: AVT is triggered again.
   f) Because of this thrashing during boot, the boot takes 40+ minutes. The delay depends entirely on the number of LUNs mapped to the host: the more LUNs, the longer the boot time.
8) The Engenio storage has a feature called "AVT Exclusion extent", which specifies that a READ from the host/initiator to the first or last 8192 sectors of the disk does not trigger AVT.
9) The array is configured with a volume size of 1GB.
10) With a volume size of 1GB, a READ avoids AVT only if it falls below sector 8192 at the beginning of the disk or at or beyond sector 2088960 (0x1FE000) at the end of the disk.
11) Since the host issues READs at sector 2088832 (0x1FDF80), which does not fall in either range, AVT is triggered, which in turn leads to the long boot time of the host.
12) The LBA offset mentioned above was also verified from the FC trace.
13) The issue happens on RHEL4u2.

Version-Release number of selected component (if applicable):
1) RHEL4u2 is being used in the configuration
2) QLogic HBA driver version: 8.01.03

How reproducible:
Always

Steps to Reproduce:
1. Connect a host (Dell PowerEdge 2650 server, or any other) to an Engenio storage array through a Fabric (a Fibre switch in the path is fine).
2. Install Veritas Storage Foundation version 4.1 with any Maintenance Packs.
3. The issue should still happen even without Veritas installed.
4. Apply all the DMP settings on the array, which are as below:
   set controller [a] HostNVSRAMByte[6,0x1a]=0x01,0x01; // return 12 bytes of WWN in inquiry
   set controller [a] HostNVSRAMByte[6,0x23]=0x01,0x01; // enable "report preferred path"
   set controller [a] HostNVSRAMByte[6,0x24]=0x01,0x01; // enable AVT
   set controller [a] HostNVSRAMByte[6,0x25]=0x80,0x80; // enable DMP support
   set controller [a] HostNVSRAMByte[6,0x27]=0x18,0x18; // AVT_EXCLUSION_EXTENT
   set controller [b] HostNVSRAMByte[6,0x1a]=0x01,0x01; // return 12 bytes of WWN in inquiry
   set controller [b] HostNVSRAMByte[6,0x23]=0x01,0x01; // enable "report preferred path"
   set controller [b] HostNVSRAMByte[6,0x24]=0x01,0x01; // enable AVT
   set controller [b] HostNVSRAMByte[6,0x25]=0x80,0x80; // enable DMP support
   set controller [b] HostNVSRAMByte[6,0x27]=0x18,0x18; // AVT_EXCLUSION_EXTENT
5. Map 32 LUNs to the host from the array.
6. Reboot the host.
7. The host takes more than 40 minutes to boot back up.

Actual results:
LVM triggers AVT, which leads to thrashing of LUNs, which in turn delays the boot of the host.

Expected results:
A small fix could avoid issuing READs to SCSI devices that come from an Engenio storage array, which report their vendor as LSI.

For an Engenio storage array, the output of '/proc/scsi/scsi' is:
Host: scsi2 Channel: 00 Id: 10 Lun: 16
  Vendor: LSI Model: INF-01-00 Rev: 9617
  Type: Direct-Access ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 10 Lun: 17
  Vendor: LSI Model: INF-01-00 Rev: 9617
  Type: Direct-Access ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 10 Lun: 18
  Vendor: LSI Model: INF-01-00 Rev: 9617
  Type: Direct-Access ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 10 Lun: 19
  Vendor: LSI Model: INF-01-00 Rev: 9617
  Type: Direct-Access ANSI SCSI revision: 03

Additional info:
Messages from the '/var/log/messages' file. An internally developed failover (RDAC) driver was used to debug which OS modules are issuing READs. Its output is below:
Apr 19 22:36:56 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x0:TranLen:0x80
Apr 19 22:36:56 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x1fff80:TranLen:0x8
Apr 19 22:36:56 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x0:TranLen:0x8
Apr 19 22:36:56 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x1fdf80:TranLen:0x8
Apr 19 22:36:57 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x0:TranLen:0x8
Apr 19 22:36:57 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x1fdf80:TranLen:0x8
Apr 19 22:36:57 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x0:TranLen:0x8

I am also attaching the 'messages' file to this bugzilla. The lines to look at in the 'messages' file are:
1. line#3358 and onwards
2. line#3553, a command being issued from lvm.static
3. line#3590, a command being issued from 'mount'

OS modules with the issue:
1. lvm
2. lvm.static
3. mount
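
The sector arithmetic behind the exclusion extent can be checked with a short Python sketch. It assumes 512-byte sectors and assumes that a READ is exempt only when the whole transfer lies inside the first or last 8192 sectors; the report does not say how the firmware treats a transfer that straddles the boundary. The function name is illustrative and not part of any Engenio or LVM tool.

```python
SECTOR_SIZE = 512          # bytes per sector (assumption, not stated in the report)
EXCLUSION_SECTORS = 8192   # from the report: first and last 8192 sectors are exempt


def in_exclusion_extent(lba, length, total_sectors, extent=EXCLUSION_SECTORS):
    """Return True if the transfer [lba, lba + length) lies entirely inside
    the leading or trailing exclusion extent (assumed firmware behaviour)."""
    end = lba + length
    in_head = end <= extent
    in_tail = lba >= total_sectors - extent and end <= total_sectors
    return in_head or in_tail


total = (1 << 30) // SECTOR_SIZE   # 1GB volume expressed in sectors

# (LBA, transfer length) pairs taken from the RDAC debug output in the report
reads = [(0x0, 0x80), (0x1FFF80, 0x8), (0x0, 0x8), (0x1FDF80, 0x8)]

print(f"tail extent starts at sector {total - EXCLUSION_SECTORS}")
for lba, length in reads:
    verdict = "inside extent" if in_exclusion_extent(lba, length, total) else "outside extent"
    print(f"LBA {lba:#x} ({lba}) len {length:#x}: {verdict}")
```

Under these assumptions, only the READ at 0x1fdf80 falls outside both extents. It starts 128 sectors before the trailing extent, so it can trigger AVT when it arrives on the non-owning path.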
Created attachment 128017
'messages' file from the RHEL4u2 box
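
To see which processes issue READs in a 'messages' file like the attached one, the debug lines can be tallied with a small Python sketch. The regular expression is an assumption based only on the sample lines quoted in the description; the internal RDAC debug driver may emit other variants that it does not match.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in the report (an assumption).
LINE_RE = re.compile(
    r"disk:(?P<disk>\S+)\s+process:\s*(?P<process>\S+)\s+"
    r"opcode\s+(?P<opcode>\S+)\s+"
    r"LBA:(?P<lba>0x[0-9a-fA-F]+):TranLen:(?P<length>0x[0-9a-fA-F]+)"
)


def parse_reads(lines):
    """Yield (disk, process, opcode, lba, length) for each matching line."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield (m["disk"], m["process"], m["opcode"],
                   int(m["lba"], 16), int(m["length"], 16))


sample = [
    "Apr 19 22:36:56 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x0:TranLen:0x80",
    "Apr 19 22:36:56 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x1fff80:TranLen:0x8",
    "Apr 19 22:36:56 deer kernel: disk:sdb process: lvm opcode READ_10 LBA:0x1fdf80:TranLen:0x8",
]

# Count how often each (disk, process, LBA) combination appears.
counts = Counter((disk, proc, lba) for disk, proc, _, lba, _ in parse_reads(sample))
for (disk, proc, lba), n in sorted(counts.items()):
    print(f"{disk} {proc} LBA {lba:#x}: {n} READ(s)")
```

To run it against the real log, replace `sample` with an open file handle for the 'messages' file.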
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.