Bug 853672

Summary: SAN controller reboot or service relocate causes one node to be fenced
Product: Red Hat Enterprise Linux 6
Version: 6.3
Component: cluster
Severity: urgent
Priority: unspecified
Hardware: x86_64
OS: Linux
Status: CLOSED INSUFFICIENT_DATA
Reporter: feiwang
Assignee: Fabio Massimo Di Nitto <fdinitto>
QA Contact: Cluster QE <mspqa-list>
CC: ccaulfie, cluster-maint, lhh, rpeterso, teigland
Target Milestone: rc
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Regression: ---
Story Points: ---
Last Closed: 2012-10-05 13:28:07 UTC

Description feiwang 2012-09-02 03:25:45 UTC
Description of problem:
SAN controller reboot or service relocate causes one node to be fenced 

Version-Release number of selected component (if applicable):
RHEL 6.3, kernel 2.6.32-279.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Configure two services, rg1 (GFS2 file systems directly on multipath devices) and rg2 (GFS2 file systems on CLVM devices), each exporting its GFS2 file systems over NFS.
2. Run I/O against the exported NFS file systems from a client.
3. Run a script that alternately reboots each SAN controller twice, or manually relocate the services between the two nodes (a sketch of such a relocation loop is shown below).
  
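For reference, a minimal sketch of the service-relocation half of step 3, using clusvcadm and clustat from rgmanager. The node and service names are taken from cluster.conf below; the iteration count and sleep interval are assumptions, and the SAN controller reboot commands are vendor-specific and not shown:

#!/bin/bash
# Hypothetical relocation loop: bounce rg1 and rg2 between the two cluster
# nodes while NFS I/O keeps running on the client.
NODES=(arcx3650chmap arcx3650chmar)
for i in 1 2 3 4; do
    target=${NODES[$((i % 2))]}
    clusvcadm -r rg1 -m "$target"   # relocate service rg1 to $target
    clusvcadm -r rg2 -m "$target"   # relocate service rg2 to $target
    clustat                         # verify both services are started on $target
    sleep 300                       # let I/O settle before the next switch
done
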
Actual results:
arcx3650chmar is fenced (fence_brocade disables its FC ports on the SAN switch) after the alternating controller-reboot script has run for roughly 2 hours, or after the services have been switched three or four times.

Expected results:
no node is expected to be fenced

Additional info:
1. Both nodes boot their OS from the SAN (SAN boot).
2. /etc/cluster/cluster.conf is as follows:

<?xml version="1.0"?>
<cluster config_version="15" name="iotl0130">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="arcx3650chmap" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="arcsw325106x" port="8"/>
                                        <device name="arcsw325106x" port="9"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="arcx3650chmar" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="arcsw325207h" port="20"/>
                                        <device name="arcsw325207h" port="21"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" quorum_dev_poll="250000" two_node="1">
                <multicast addr="227.0.0.66"/>
        </cman>
        <fencedevices>
                <fencedevice agent="fence_brocade" ipaddr="9.11.215.250" login="osl" name="arcsw325106x" passwd="open1sys"/>
                <fencedevice agent="fence_brocade" ipaddr="9.11.215.174" login="osl" name="arcsw325207h" passwd="open1sys"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="wangfei" nofailback="0" ordered="0" restricted="1">
                                <failoverdomainnode name="arcx3650chmap" priority="1"/>
                                <failoverdomainnode name="arcx3650chmar" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <nfsclient name="nfs_client" options="no_root_squash,rw,sync" target="*"/>
                        <ip address="9.11.111.150" sleeptime="10"/>
                        <ip address="9.11.111.156" sleeptime="10"/>
                        <nfsexport name="nfs_export"/>
                        <clusterfs device="/dev/mapper/mpathb" force_unmount="0" fsid="110" fstype="gfs2" mountpoint="/mnt/d1" name="gfs_d1" options=""/>
                        <clusterfs device="/dev/mapper/mpathc" force_unmount="0" fsid="111" fstype="gfs2" mountpoint="/mnt/d2" name="gfs_d2" options=""/>
                        <clusterfs device="/dev/mapper/mpathd" force_unmount="0" fsid="112" fstype="gfs2" mountpoint="/mnt/d3" name="gfs_d3" options=""/>
                        <clusterfs device="/dev/mapper/gfs-gfs_lv1" force_unmount="0" fsid="107" fstype="gfs2" mountpoint="/mnt/lv1" name="gfs_lv1" options=""/>
                        <clusterfs device="/dev/mapper/gfs-gfs_lv2" force_unmount="0" fsid="108" fstype="gfs2" mountpoint="/mnt/lv2" name="gfs_lv2" options=""/>
                        <clusterfs device="/dev/mapper/gfs-gfs_lv3" force_unmount="0" fsid="109" fstype="gfs2" mountpoint="/mnt/lv3" name="gfs_lv3" options=""/>
                </resources>
                <service autostart="1" domain="wangfei" exclusive="0" name="rg1" recovery="relocate">
                        <ip ref="9.11.111.150"/>
                        <clusterfs ref="gfs_d1">
                                <nfsexport ref="nfs_export">
                                        <nfsclient ref="nfs_client"/>
                                </nfsexport>
                        </clusterfs>
                        <clusterfs ref="gfs_d2">
                                <nfsexport ref="nfs_export">
                                        <nfsclient ref="nfs_client"/>
                                </nfsexport>
                        </clusterfs>
                        <clusterfs ref="gfs_d3">
                                <nfsexport ref="nfs_export">
                                        <nfsclient ref="nfs_client"/>
                                </nfsexport>
                        </clusterfs>
                </service>
                <service autostart="1" domain="wangfei" exclusive="0" name="rg2" recovery="relocate">
                        <ip ref="9.11.111.156"/>
                        <clusterfs ref="gfs_lv1">
                                <nfsexport ref="nfs_export">
                                        <nfsclient ref="nfs_client"/>
                                </nfsexport>
                        </clusterfs>
                        <clusterfs ref="gfs_lv2">
                                <nfsexport ref="nfs_export">
                                        <nfsclient ref="nfs_client"/>
                                </nfsexport>
                        </clusterfs>
                        <clusterfs ref="gfs_lv3">
                                <nfsexport ref="nfs_export">
                                        <nfsclient ref="nfs_client"/>
                                </nfsexport>
                        </clusterfs>
                </service>
        </rm>
        <totem token="250000"/>
        <quorumd interval="12" label="iotl0130" min_score="1" tko="10" votes="1">
                <heuristic interval="10" program="ping 9.11.110.1 -c1 -t1" score="1" tko="3"/>
        </quorumd>
</cluster>
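
As a rough sanity check of the timeouts above (a sketch, assuming the usual qdiskd behaviour of evicting a node after roughly interval * tko seconds, and noting that the totem token timeout is given in milliseconds):

# Back-of-the-envelope timing from the cluster.conf values above.
interval=12; tko=10; token_ms=250000
echo "qdiskd eviction window : $((interval * tko)) s"    # 120 s
echo "totem token timeout    : $((token_ms / 1000)) s"   # 250 s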

3. multipath.conf is as follows:

# multipath.conf written by anaconda
defaults {
    polling_interval        30
    user_friendly_names yes
}

multipaths {
}

blacklist {

    device {
        vendor "*"
        product "Universal Xport"
    }
    device {
        vendor "IBM-ESXS"
    }
    device {
        vendor "LSILOGIC"
    }
    device {
        vendor "ATA"
    }
    device {
        vendor "VMware"
    }
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
    devnode "^cciss!c[0-9]d[0-9]*"
}

# Make sure our multipath devices are enabled.

blacklist_exceptions {
}

devices {
      device {
              vendor                  "IBM"
              product                 "2145"
              path_grouping_policy    group_by_prio
              getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
              features                "1 queue_if_no_path"
              prio                    alua
              path_checker            tur
              failback                immediate
              no_path_retry           "5"
              rr_min_io               1
              dev_loss_tmo            120
      }
}
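
A note on the path-failure timing in this device section (a sketch, assuming no_path_retry is counted in units of polling_interval; "1 queue_if_no_path" can instead keep I/O queued indefinitely while all paths are down):

# Approximate all-paths-down queueing window from the multipath.conf values above.
polling_interval=30; no_path_retry=5
echo "approx. queueing before I/O errors: $((polling_interval * no_path_retry)) s"   # ~150 s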

4. /var/log/messages output when the node is fenced:
Sep  1 07:44:17 arcx3650chmap rgmanager[5061]: Stopping service service:rg1
Sep  1 07:44:18 arcx3650chmap rgmanager[27775]: [ip] Removing IPv4 address 9.11.111.150/23 from eth0
Sep  1 07:44:18 arcx3650chmap avahi-daemon[4471]: Withdrawing address record for 9.11.111.150 on eth0.
Sep  1 07:44:19 arcx3650chmap ntpd[4872]: Deleting interface #7 eth0, 9.11.111.150#123, interface stats: received=0, sent=0, dropped=0, active_time=316582 secs
Sep  1 07:44:28 arcx3650chmap rgmanager[28147]: [nfsclient] Removing export: *:/mnt/d3
Sep  1 07:44:28 arcx3650chmap rgmanager[28238]: [nfsclient] Removing export: *:/mnt/d2
Sep  1 07:44:28 arcx3650chmap rgmanager[28329]: [nfsclient] Removing export: *:/mnt/d1
Sep  1 07:44:29 arcx3650chmap rgmanager[5061]: Service service:rg1 is stopped
Sep  1 07:45:54 arcx3650chmap rgmanager[5061]: Stopping service service:rg2
Sep  1 07:45:54 arcx3650chmap rgmanager[29509]: [ip] Removing IPv4 address 9.11.111.156/23 from eth0
Sep  1 07:45:54 arcx3650chmap avahi-daemon[4471]: Withdrawing address record for 9.11.111.156 on eth0.
Sep  1 07:45:55 arcx3650chmap ntpd[4872]: Deleting interface #8 eth0, 9.11.111.156#123, interface stats: received=0, sent=0, dropped=0, active_time=316678 secs
Sep  1 07:46:04 arcx3650chmap rgmanager[29554]: [nfsclient] Removing export: *:/mnt/lv3
Sep  1 07:46:04 arcx3650chmap rgmanager[29645]: [nfsclient] Removing export: *:/mnt/lv2
Sep  1 07:46:04 arcx3650chmap rgmanager[29736]: [nfsclient] Removing export: *:/mnt/lv1
Sep  1 07:46:05 arcx3650chmap rgmanager[5061]: Service service:rg2 is stopped
Sep  1 07:52:13 arcx3650chmap qdiskd[4126]: Writing eviction notice for node 2
Sep  1 07:52:25 arcx3650chmap qdiskd[4126]: Node 2 evicted
Sep  1 07:53:02 arcx3650chmap kernel: INFO: task delete_workqueu:7314 blocked for more than 120 seconds.
Sep  1 07:53:02 arcx3650chmap kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  1 07:53:02 arcx3650chmap kernel: delete_workqu D 0000000000000000     0  7314      2 0x00000080
Sep  1 07:53:02 arcx3650chmap kernel: ffff8801b7d5b9f0 0000000000000046 ffff8801b7d5b960 ffffffffa043ef4d
Sep  1 07:53:02 arcx3650chmap kernel: 0000000000000000 ffff8801985fc800 ffff8801b7d5ba20 ffffffffa043d708
Sep  1 07:53:02 arcx3650chmap kernel: ffff8801b49165f8 ffff8801b7d5bfd8 000000000000fb88 ffff8801b49165f8
Sep  1 07:53:02 arcx3650chmap kernel: Call Trace:
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa043ef4d>] ? dlm_put_lockspace+0x1d/0x40 [dlm]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa043d708>] ? dlm_lock+0x98/0x1e0 [dlm]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ea570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ea57e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ea570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ec4f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ed8f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05edba4>] gfs2_glock_nq_m+0x114/0x180 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05eaf12>] ? gfs2_holder_init+0x52/0x60 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05e287c>] gfs2_dir_exhash_dealloc+0x2ac/0x470 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa0606fbd>] gfs2_delete_inode+0x2bd/0x2f0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa0606d8d>] ? gfs2_delete_inode+0x8d/0x2f0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa0606d00>] ? gfs2_delete_inode+0x0/0x2f0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff811961de>] generic_delete_inode+0xde/0x1d0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81196335>] generic_drop_inode+0x65/0x80
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa06057ae>] gfs2_drop_inode+0x2e/0x30 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81195182>] iput+0x62/0x70
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81191ce0>] dentry_iput+0x90/0x100
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81191e41>] d_kill+0x31/0x60
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8119386c>] dput+0x7c/0x150
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81193a35>] d_prune_aliases+0xf5/0x120
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05eb5b0>] ? delete_work_func+0x0/0x90 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05eb601>] delete_work_func+0x51/0x90 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8108c760>] worker_thread+0x170/0x2a0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81091d66>] kthread+0x96/0xa0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
Sep  1 07:53:02 arcx3650chmap kernel: INFO: task delete_workqueu:7315 blocked for more than 120 seconds.
Sep  1 07:53:02 arcx3650chmap kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  1 07:53:02 arcx3650chmap kernel: delete_workqu D 0000000000000001     0  7315      2 0x00000080
Sep  1 07:53:02 arcx3650chmap kernel: ffff8801b7d5db10 0000000000000046 ffff8801b7d5da80 ffffffffa043ef4d
Sep  1 07:53:02 arcx3650chmap kernel: 0000000000000000 ffff8801985fc800 ffff8801b7d5db40 ffffffffa043d708
Sep  1 07:53:02 arcx3650chmap kernel: ffff8801b4233058 ffff8801b7d5dfd8 000000000000fb88 ffff8801b4233058
Sep  1 07:53:02 arcx3650chmap kernel: Call Trace:
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa043ef4d>] ? dlm_put_lockspace+0x1d/0x40 [dlm]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa043d708>] ? dlm_lock+0x98/0x1e0 [dlm]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ea570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ea57e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ea570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ec4f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05ed8f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8107e00c>] ? lock_timer_base+0x3c/0x70
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa0606d95>] gfs2_delete_inode+0x95/0x2f0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa0606d8d>] ? gfs2_delete_inode+0x8d/0x2f0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa0606d00>] ? gfs2_delete_inode+0x0/0x2f0 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff811961de>] generic_delete_inode+0xde/0x1d0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81196335>] generic_drop_inode+0x65/0x80
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa06057ae>] gfs2_drop_inode+0x2e/0x30 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81195182>] iput+0x62/0x70
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81191ce0>] dentry_iput+0x90/0x100
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81191e41>] d_kill+0x31/0x60
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8119386c>] dput+0x7c/0x150
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81193a35>] d_prune_aliases+0xf5/0x120
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05eb5b0>] ? delete_work_func+0x0/0x90 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffffa05eb601>] delete_work_func+0x51/0x90 [gfs2]
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8108c760>] worker_thread+0x170/0x2a0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81091d66>] kthread+0x96/0xa0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
Sep  1 07:53:02 arcx3650chmap kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
Sep  1 07:53:31 arcx3650chmap corosync[4075]:   [TOTEM ] A processor failed, forming new configuration.
Sep  1 07:53:33 arcx3650chmap corosync[4075]:   [QUORUM] Members[1]: 1
Sep  1 07:53:33 arcx3650chmap corosync[4075]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Sep  1 07:53:33 arcx3650chmap corosync[4075]:   [CPG   ] chosen downlist: sender r(0) ip(9.11.111.201) ; members(old:2 left:1)
Sep  1 07:53:33 arcx3650chmap corosync[4075]:   [MAIN  ] Completed service synchronization, ready to provide service.
Sep  1 07:53:33 arcx3650chmap rgmanager[5061]: State change: arcx3650chmar DOWN
Sep  1 07:53:33 arcx3650chmap fenced[4255]: fencing node arcx3650chmar
Sep  1 07:53:33 arcx3650chmap kernel: dlm: closing connection to node 2
Sep  1 07:53:33 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Trying to acquire journal lock...
Sep  1 07:53:33 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv1.0: jid=1: Trying to acquire journal lock...
Sep  1 07:53:43 arcx3650chmap fenced[4255]: fence arcx3650chmar success
Sep  1 07:53:43 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv1.0: jid=1: Looking at journal...
Sep  1 07:53:43 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Looking at journal...
Sep  1 07:53:43 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv1.0: jid=1: Acquiring the transaction lock...
Sep  1 07:53:43 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv1.0: jid=1: Replaying journal...
Sep  1 07:53:43 arcx3650chmap rgmanager[5061]: Taking over service service:rg1 from down member arcx3650chmar
Sep  1 07:53:43 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Acquiring the transaction lock...
Sep  1 07:53:43 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Replaying journal...
Sep  1 07:53:43 arcx3650chmap rgmanager[5061]: Taking over service service:rg2 from down member arcx3650chmar
Sep  1 07:53:44 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Replayed 14190 of 17804 blocks
Sep  1 07:53:44 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Found 1806 revoke tags
Sep  1 07:53:44 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Journal replayed in 2s
Sep  1 07:53:44 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv3.0: jid=1: Done
Sep  1 07:53:44 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_lv2.0: jid=1: Trying to acquire journal lock...
Sep  1 07:53:44 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_d2.0: jid=1: Trying to acquire journal lock...
Sep  1 07:53:44 arcx3650chmap kernel: GFS2: fsid=iotl0130:gfs_d2.0: jid=1: Looking at journal...

5. The following messages appear repeatedly in /var/log/messages:
Sep  1 08:09:56 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0713 SCSI layer issued Device Reset (2, 0) return x2002
Sep  1 08:09:56 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002
Sep  1 08:33:44 arcx3650chmap kernel: rport-3:0-4: blocked FC remote port time out: removing target and saving binding
Sep  1 08:33:44 arcx3650chmap kernel: rport-4:0-3: blocked FC remote port time out: removing target and saving binding
Sep  1 08:33:44 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0203 Devloss timeout on WWPN 50:05:07:68:01:30:5a:36 NPort xb0f880 Data: x0 x7 x0
Sep  1 08:33:44 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0203 Devloss timeout on WWPN 50:05:07:68:01:40:5a:36 NPort xb0e880 Data: x0 x7 x0
Sep  1 08:49:56 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002
Sep  1 08:49:56 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0713 SCSI layer issued Device Reset (2, 0) return x2002
Sep  1 09:24:13 arcx3650chmap kernel: rport-3:0-5: blocked FC remote port time out: removing target and saving binding
Sep  1 09:24:13 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0203 Devloss timeout on WWPN 50:05:07:68:01:40:5d:81 NPort xb0ea80 Data: x0 x7 x0
Sep  1 09:24:13 arcx3650chmap kernel: rport-4:0-4: blocked FC remote port time out: removing target and saving binding
Sep  1 09:24:13 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0203 Devloss timeout on WWPN 50:05:07:68:01:30:5d:81 NPort xb0fa80 Data: x0 x7 x0
Sep  1 10:36:49 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0713 SCSI layer issued Device Reset (2, 0) return x2002
Sep  1 10:36:49 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0713 SCSI layer issued Device Reset (3, 0) return x2002
Sep  1 10:36:49 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0713 SCSI layer issued Device Reset (3, 0) return x2002
Sep  1 10:36:49 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0723 SCSI layer issued Target Reset (3, 0) return x2002
Sep  1 10:36:49 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0713 SCSI layer issued Device Reset (2, 0) return x2002
Sep  1 10:36:49 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0723 SCSI layer issued Target Reset (2, 0) return x2002
Sep  1 10:36:49 arcx3650chmap qdiskd[4126]: qdisk cycle took more than 12 seconds to complete (14.410000)
Sep  1 16:52:02 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0713 SCSI layer issued Device Reset (3, 0) return x2002
Sep  1 16:52:02 arcx3650chmap kernel: lpfc 0000:1c:00.1: 1:(0):0713 SCSI layer issued Device Reset (2, 0) return x2002
Sep  1 16:52:02 arcx3650chmap kernel: lpfc 0000:1c:00.0: 0:(0):0713 SCSI layer issued Device Reset (3, 0) return x2002
6. FC HBA information:
arcx3650chmap with Emulex LPe12002 FV2.00A4 DV8.3.5.68.5p
arcx3650chmar with QLE2562 FW:v5.06.05 DVR:v8.04.00.04.06.3-k

Comment 2 Fabio Massimo Di Nitto 2012-09-02 11:42:24 UTC
Thank you for contacting Red Hat. 

I'd like to make sure you get the resources and attention you need, and to help you reach the right contacts for this issue. Red Hat has a Global Support organization and a wealth of online resources dedicated to addressing technical questions like yours from start to finish.

Please start here for knowledge base articles, videos, group discussions, and many more technical resources:

http://access.redhat.com

Additionally, you can find Red Hat Support phone numbers and case management information here:

https://access.redhat.com/support/

If you have any difficulties or questions, please contact our customer service team, so they may help you further.

https://access.redhat.com/support/contact/customerService.html

Thank you and best regards,
Fabio

Comment 3 Fabio Massimo Di Nitto 2012-09-18 06:49:37 UTC
Hi,

Can you please confirm that a ticket has been filed with our Global Support Services?