Bug 979437
| Summary: | rhevm-setup fails and drops network | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | James Laska <jlaska> |
| Component: | ovirt-engine-setup | Assignee: | Alex Lourie <alourie> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Pavel Stehlik <pstehlik> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.2.0 | CC: | aberezin, acathrow, alourie, aweiteka, bazulay, iheim, jkt, jturner, Rhev-m-bugs, sbonazzo |
| Target Milestone: | --- | Keywords: | Regression, Triaged |
| Target Release: | 3.3.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | integration | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-09-24 07:32:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
James Laska
2013-06-28 14:05:15 UTC
@James

1. The AIO failed to finish the installation because
2. The network goes through some problems:

Jun 28 08:55:42 ibm-x3250m4-06 kernel: Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
Jun 28 08:55:42 ibm-x3250m4-06 kernel: bonding: bond4 is being created...
Jun 28 08:55:42 ibm-x3250m4-06 kernel: bonding: bond1 is being created...
Jun 28 08:55:42 ibm-x3250m4-06 kernel: bonding: bond2 is being created...
Jun 28 08:55:42 ibm-x3250m4-06 kernel: bonding: bond3 is being created...
Jun 28 08:55:43 ibm-x3250m4-06 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Jun 28 08:55:43 ibm-x3250m4-06 kernel: 8021q: adding VLAN 0 to HW filter on device eth1
Jun 28 08:55:43 ibm-x3250m4-06 multipathd: force queue_without_daemon (operator)
Jun 28 08:55:43 ibm-x3250m4-06 multipathd: --------shut down-------
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: multipath round-robin: version 1.0.0 loaded
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: table: 253:2: multipath: error getting device
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: ioctl: error adding target to table
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: table: 253:2: multipath: error getting device
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: ioctl: error adding target to table
Jun 28 08:55:43 ibm-x3250m4-06 multipathd: 1ATA_ST500NM0011_39M4517_42C0468IBM_Z1M11JGD: ignoring map
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: table: 253:2: multipath: error getting device
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: ioctl: error adding target to table
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: table: 253:2: multipath: error getting device
Jun 28 08:55:43 ibm-x3250m4-06 kernel: device-mapper: ioctl: error adding target to table
Jun 28 08:55:43 ibm-x3250m4-06 multipathd: 1ATA_ST500NM0011_39M4517_42C0468IBM_Z1M12LMF: ignoring map
Jun 28 08:55:43 ibm-x3250m4-06 multipathd: path checkers start up
Jun 28 08:55:45 ibm-x3250m4-06 dhclient[15585]: DHCPDISCOVER on usb0 to 255.255.255.255 port 67 interval 8 (xid=0x1fe6e1e4)
Jun 28 08:55:45 ibm-x3250m4-06 dhclient[15585]: DHCPOFFER from 169.254.95.118

Looks like from that moment on, there's no network on the system, hence AIO fails and SSH connection drops.

Can you please provide more details on the system's hardware? Can you try running the setup on a different machine?

Thanks.

Thanks for your feedback!

> Can you please provide more details on the system's hardware?

Certainly, the hardware I've verified the failure on all has...

* RAM: 16G
* DISK: 2T
* CPU: 1 x Intel(R) Xeon(R) CPU E3-1220 V2 (quad-core)
* NIC: Ethernet 1 Intel 82574L Ethernet Controller

> Can you try running the setup on a different machine?

I've reproduced this problem on several different systems in beaker. The problem occurs while rhevm-setup is running, and before completion. No other workflow on the system is running. Perhaps the all-in-one configuration package may be a contributing cause?

Hi James
Thanks for the info. Would you mind please showing me the output of the 'ip a' command?

Thanks.

Created attachment 777051
ip a

(In reply to Alex Lourie from comment #3)
> Would you mind please showing me the output of the 'ip a' command?

Not at all, please see attached output. Interestingly enough, I believe I have 2 classes of hardware that I've tested rhevm-setup on. One class works, the other doesn't and triggered this bug report.

== Works ==

* https://beaker.engineering.redhat.com/view/qeblade40.rhq.lab.eng.bos.redhat.com

> # lspci | grep Ethernet
> 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
> 03:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
> 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
> 04:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
> 08:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
> 08:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
> 09:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
> 09:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)

== Doesn't work ==

* https://beaker.engineering.redhat.com/view/ibm-x3250m4-01.lab.bos.redhat.com

> # lspci | grep Ethernet
> 06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
> 0b:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

Created attachment 781135
ip a # From a working system
Attaching `ip a` output from qeblade40, a system that runs rhevm-setup with no difficulty.
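
For anyone comparing the two hardware classes, the `lspci | grep Ethernet` listings quoted above can be diffed mechanically. The following is only a rough sketch in Python, not part of rhevm-setup or any Red Hat tooling: the function name `nic_models` and the variables `WORKING` and `FAILING` are made up for illustration, and the input is the lspci text pasted earlier in this report. It shows which controller models appear only on the failing host; it does not prove the NIC is the cause.

```python
from collections import Counter

# lspci output copied from the comment above (working host: qeblade40).
WORKING = """\
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
03:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
04:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
08:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
08:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
09:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
09:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
"""

# lspci output copied from the comment above (failing host: ibm-x3250m4-01).
FAILING = """\
06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
0b:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
"""


def nic_models(lspci_output):
    """Count Ethernet controller models in `lspci` text (hypothetical helper)."""
    models = Counter()
    for line in lspci_output.splitlines():
        # Each line looks like "<slot> Ethernet controller: <model> (rev NN)".
        _slot, sep, rest = line.partition(" Ethernet controller: ")
        if not sep:
            continue  # not an Ethernet controller line
        # Drop the trailing "(rev NN)" suffix when present.
        model = rest.split(" (rev ")[0].strip()
        models[model] += 1
    return models


working = nic_models(WORKING)
failing = nic_models(FAILING)

# Models present on the failing host but absent from the working one.
only_failing = sorted(set(failing) - set(working))
for model in only_failing:
    print(f"only on failing host: {model} (x{failing[model]})")
```

With the listings above, the only model unique to the failing host is the Intel 82574L, which matches the NIC named in the hardware summary earlier in the thread.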
> 10:37:04 alourie: is there a way you add this problematic machine as a host to another engine?
> 10:37:21 alourie: I want to see whether it gets configured correctly

At the suggestion of alourie, I was able to deploy a RHEL host on the same hardware used to trigger this bug, and add it as a host to an existing RHEV-M instance. I didn't encounter any issues. Please note that the all-in-one setup didn't allow me to add the new host to the 'local_datacenter'. I therefore added the new host to the 'Default' datacenter.

The RHEV-M system is available for inspection at https://qeblade40.rhq.lab.eng.bos.redhat.com (admin/redhat)

@james
Was this the ibm-3250 machine that you added to the engine?

(In reply to Alex Lourie from comment #8)
> Was this the ibm-3250 machine that you added to the engine?

It was, yes.

Can you attach host-deploy, engine and server logs?

Sandro
The logs are attached as an archive file.

Any updates? We seem to be able to trigger this problem without difficulty on a specific class of hardware.

James
We are still investigating.

James
Could you please test it with the latest 3.3 build? We changed a lot of logic in the code for that version, and I want to know whether it works on this system.

Thanks.

Suggesting to close this issue due to missing information.

(In reply to Alex Lourie from comment #19)
> Suggesting to close this issue due to missing information.

I agree. Please reopen if you're able to reproduce the issue with the latest builds.
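
Should the issue be reproduced later, the pattern noted in the first reply (dhclient on usb0 accepting an offer from 169.254.95.118, an address in the IPv4 link-local range 169.254.0.0/16) can be searched for in /var/log/messages. The snippet below is a minimal sketch under stated assumptions: the helper name `link_local_offers` is hypothetical, the regular expression only matches the dhclient line format quoted in this report, and a hit is a hint to look closer, not a confirmed root cause (none was established in this bug).

```python
import ipaddress
import re

# Two dhclient lines copied from the log excerpt in this report.
LOG = """\
Jun 28 08:55:45 ibm-x3250m4-06 dhclient[15585]: DHCPDISCOVER on usb0 to 255.255.255.255 port 67 interval 8 (xid=0x1fe6e1e4)
Jun 28 08:55:45 ibm-x3250m4-06 dhclient[15585]: DHCPOFFER from 169.254.95.118
"""

# Assumption: offers are logged as "dhclient[PID]: DHCPOFFER from <address>".
OFFER_RE = re.compile(r"dhclient\[\d+\]: DHCPOFFER from (\S+)")


def link_local_offers(log_text):
    """Return DHCP offer source addresses that fall in a link-local range."""
    hits = []
    for line in log_text.splitlines():
        match = OFFER_RE.search(line)
        if match is None:
            continue
        try:
            addr = ipaddress.ip_address(match.group(1))
        except ValueError:
            continue  # not a parseable IP address; skip the line
        if addr.is_link_local:
            hits.append(str(addr))
    return hits


for address in link_local_offers(LOG):
    print(f"DHCP offer from link-local address: {address}")
```

To run it against a real log, replace `LOG` with the contents of the file, for example `open("/var/log/messages").read()`.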