Bug 158365
| Summary: | Breaking ethernet bonding causes node to reset. | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | David Milburn <dmilburn> |
| Component: | clumanager | Assignee: | Lon Hohberger <lhh> |
| Status: | CLOSED NOTABUG | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3 | CC: | cluster-maint, tao |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Target Release: | --- | | |
| Hardware: | i686 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Enhancement |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2006-02-02 17:22:34 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
David Milburn
2005-05-20 21:51:37 UTC
Based on the description, one of two things is happening:

(1) Unplugging the ethernet cables from a bonded interface behaves differently from using a non-bonded interface, and the following code is getting run after all paths are lost:

```c
/*
 * Reboot if we didn't send a heartbeat in interval*TKO_COUNT
 */
if (!debug && __cmp_tv(&maxtime, &diff) == 1) {
    clulog(LOG_EMERG, "Failed to send a heartbeat within "
           "failover time - REBOOTING\n");
    sync();
    reboot(RB_AUTOBOOT);
}
```

The membership daemon doesn't know about fencing, so it can't make a judgement based on that, which brings us to the other possibility:

(2) There are no power switches configured, so the quorum daemon (which handles fencing) is extremely paranoid. You *can't* gracefully shut down in this case. After the failover time, if we try to shut down, the other node will be trying to mount the file systems we still have mounted, resulting in file system (and probably data) corruption. A reboot-as-fast-as-possible doesn't guarantee data integrity, but it's certainly better than a slow shutdown.

It's not a bug either way. However:

(a) Behavior (2) won't change unless power switches are installed and configured; data integrity trumps the nicety of a clean shutdown.

(b) We could alter the behavior of (1) to just log a nasty message at EMERG level and not reboot. If there are no power switches configured, the behavior won't visibly change at all.

Ah, after rereading this and parts of the related issue, there is more to it than this. Upon further investigation (this is a really old issue), this isn't related to bonding at all.

dmesg from one of the sysreports:

```
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.10-RH1
        <Adaptec 29320A Ultra320 SCSI adapter>
        aic7901: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
blk: queue c4da7418, I/O limit 4095Mb (mask 0xffffffff)
(scsi1:A:5:0): refuses WIDE negotiation.  Using 8bit transfers
(scsi1:A:5:0): refuses synchronous negotiation.  Using asynchronous transfers
(scsi1:A:8): 160.000MB/s transfers (80.000MHz DT|IU|QAS, 16bit)
(scsi1:A:9): 160.000MB/s transfers (80.000MHz DT|IU|QAS, 16bit)
  Vendor: SUN       Model: StorEdge 3120  D   Rev: 1159
  Type:   Processor                           ANSI SCSI revision: 02
blk: queue c4da7218, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: FUJITSU   Model: MAP3735N SUN72G    Rev: 0401
  Type:   Direct-Access                       ANSI SCSI revision: 04
blk: queue c4da7618, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: FUJITSU   Model: MAP3735N SUN72G    Rev: 0401
  Type:   Direct-Access                       ANSI SCSI revision: 04
blk: queue c4da7818, I/O limit 4095Mb (mask 0xffffffff)
scsi1:A:8:0: Tagged Queuing enabled.  Depth 32
scsi1:A:9:0: Tagged Queuing enabled.  Depth 32
Attached scsi disk sdb at scsi1, channel 0, id 8, lun 0
Attached scsi disk sdc at scsi1, channel 0, id 9, lun 0
SCSI device sdb: 143374738 512-byte hdwr sectors (73408 MB)
 sdb: sdb1 sdb2 sdb3
```

/etc/sysconfig/rawdevices from the same node:

```
# raw device bindings
# format:  <rawdev> <major> <minor>
#          <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
#          /dev/raw/raw2 8 5
/dev/raw/raw1 /dev/sdb1
/dev/raw/raw2 /dev/sdb2
```

A couple of facts:

* The StorEdge 3120 is a SCSI JBOD with zero RAID capabilities. Here's more information about that array: http://www.sun.com/storage/workgroup/3000/3100/3120scsi/

* The Fujitsu MAP3735N is a 73GB hard disk drive. You can buy them here: http://froogle.google.com/froogle?q=MAP3735+FUJITSU

What does this mean? Dmesg shows that the controllers are seeing the disks individually, and that they are not in any sort of RAID configuration. The observed behavior (a bus reset followed by errors) is common on multi-initiator parallel SCSI configurations.
A quick look at the RHCS documentation says that a multi-initiator parallel SCSI configuration will not work: http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/cluster-suite/ch-hardware.html

"Testing has shown that it is difficult, if not impossible, to configure reliable multi-initiator parallel SCSI configurations at data rates above 80MB/sec using standard SCSI adapters. Further tests have shown that these configurations cannot support online repair because the bus does not work reliably when the HBA terminators are disabled, and external terminators are used. For these reasons, multi-initiator SCSI configurations using standard adapters are not supported. Either single-initiator SCSI bus adapters (connected to multi-ported storage) or Fibre Channel adapters are required."

The customer should disable SCSI bus resets on the host bus adapters; this is usually done in the BIOS for the HBA. This *might* prevent the reboot from occurring in the future (if it does, services will remain available!), but it is not guaranteed, because some SCSI device drivers perform SCSI bus resets while initializing. (I do not know whether the AIC7xxx driver performs a SCSI bus reset or not.)

The cluster cannot shut down gracefully if access to shared storage is unavailable, mostly because unmounting will not work. Unmounting a file system requires updating certain metadata in the superblock, which cannot be done if the storage is not accessible.

You can make clumanager take different actions on shared-storage access failures using the cludb command, but this is neither a widely-tested nor a supported thing to do (the ability is there primarily for testing, not production use). See the man page for 'cludb' for more details.

Good luck!