| Summary: | Introspection on BM doesn't complete, the introspected node throws "http://192.0.2.1:8088/inspector.ipxe... Connection reset" | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Alexander Chuzhoy <sasha> | ||||||||||
| Component: | rhosp-director | Assignee: | Dmitry Tantsur <dtantsur> | ||||||||||
| Status: | CLOSED WORKSFORME | QA Contact: | Raviv Bar-Tal <rbartal> | ||||||||||
| Severity: | urgent | Docs Contact: | |||||||||||
| Priority: | urgent | ||||||||||||
| Version: | 10.0 (Newton) | CC: | atelang, bfournie, dbecker, dsneddon, fbaudin, kbasil, mburns, mcornea, morazi, nyechiel, oblaut, rbartal, rhel-osp-director-maint, sasha, vchundur, yrachman | ||||||||||
| Target Milestone: | --- | ||||||||||||
| Target Release: | 10.0 (Newton) | ||||||||||||
| Hardware: | Unspecified | ||||||||||||
| OS: | Unspecified | ||||||||||||
| Whiteboard: | |||||||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||||||
| Doc Text: | Story Points: | --- | |||||||||||
| Clone Of: | Environment: | ||||||||||||
| Last Closed: | 2016-10-14 09:05:25 UTC | Type: | Bug | ||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||
| Documentation: | --- | CRM: | |||||||||||
| Verified Versions: | Category: | --- | |||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
| Attachments: |
Description
Alexander Chuzhoy
2016-10-06 22:18:55 UTC
The issue reproduces on different BM setups.

Created attachment 1208060 [details]
sosreport from the undercloud node

Can you fetch http://192.0.2.1:8088/inspector.ipxe from undercloud? Is it the same issue where flushing iptables rules helped you?

Created attachment 1208154 [details]
inspector.ipxe

After flushing iptables on the undercloud, I didn't see this error, but the introspection didn't finish either - it was gathering data in a loop.

Sorry, I just realized my question was confusing. I meant, can you try fetching http://192.0.2.1:8088/inspector.ipxe using curl from the undercloud? Will it work? The problem after flushing iptables - do you see any errors in the machine's console? Does it fail to post data back?

Tried it on another BM setup (different IP):
[root@undercloud ~]# curl http://192.168.0.1:8088/inspector.ipxe
#!ipxe
:retry_dhcp
dhcp || goto retry_dhcp
:retry_boot
imgfree
kernel --timeout 60000 http://192.168.0.1:8088/agent.kernel ipa-inspection-callback-url=http://192.168.0.1:5050/v1/continue ipa-inspection-collectors=default,extra-hardware,logs systemd.journald.forward_to_console=yes BOOTIF=${mac} ipa-debug=1 ipa-inspection-dhcp-all-interfaces=1 ipa-collect-lldp=1 initrd=agent.ramdisk || goto retry_boot
initrd --timeout 60000 http://192.168.0.1:8088/agent.ramdisk || goto retry_boot
boot

After flushing the iptables (sudo iptables -F), no improvement. I also tried to replace /tftpboot/undionly.kpxe with http://boot.ipxe.org/undionly.kpxe

We have failed to reproduce this issue on the TLV systems, both the original system and the second system I have (Raviv). Sasha - can you reproduce the same problem on your system?

The problem still reproduces on my setup:
openstack-ironic-common-6.2.1-0.20160930163405.3f54fec.el7ost.noarch
puppet-ironic-9.4.0-1.el7ost.noarch
python-ironicclient-1.7.0-1.el7ost.noarch
instack-undercloud-5.0.0-0.20160930175750.9d2a655.el7ost.noarch
python-ironic-inspector-client-1.9.0-1.el7ost.noarch
openstack-ironic-conductor-6.2.1-0.20160930163405.3f54fec.el7ost.noarch
openstack-ironic-api-6.2.1-0.20160930163405.3f54fec.el7ost.noarch
openstack-ironic-inspector-4.2.1-0.20160922151040.36900fb.el7ost.noarch
python-ironic-lib-2.1.0-1.el7ost.noarch

Ok, can you see any signs of the HTTP request arriving? Please check both httpd logs and tcpdump.
If yes, do you see signs of a response (again, tcpdump or anything alike)?

Nothing got appended to:
/var/log/httpd/ipxe_vhost_access.log
/var/log/httpd/ipxe_vhost_error.log

snap from journalctl -f -u openstack-ironic-inspector-dnsmasq:
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.100 00:0a:f7:79:93:2a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.100 00:0a:f7:79:93:2a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.102 00:0a:f7:79:93:1a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.102 00:0a:f7:79:93:1a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.101 00:0a:f7:79:93:18
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.101 00:0a:f7:79:93:18
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.100 00:0a:f7:79:93:2a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.100 00:0a:f7:79:93:2a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.102 00:0a:f7:79:93:1a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.102 00:0a:f7:79:93:1a
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.101 00:0a:f7:79:93:18
Oct 11 12:43:04 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.101 00:0a:f7:79:93:18
Oct 11 12:43:38 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPDISCOVER(br-ctlplane) 00:0a:f7:7f:24:88
Oct 11 12:43:38 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPOFFER(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:41 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPDISCOVER(br-ctlplane) 00:0a:f7:7f:24:96
Oct 11 12:43:41 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPOFFER(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:44 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPDISCOVER(br-ctlplane) 00:0a:f7:7f:24:9e
Oct 11 12:43:44 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPOFFER(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPDISCOVER(br-ctlplane) 00:0a:f7:7f:24:5e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPOFFER(br-ctlplane) 192.168.0.106 00:0a:f7:7f:24:5e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.105 00:0a:f7:7f:24:9e
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPREQUEST(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96
Oct 11 12:43:47 undercloud.localdomain dnsmasq-dhcp[31036]: DHCPACK(br-ctlplane) 192.168.0.104 00:0a:f7:7f:24:96

Snap from "tcpdump -i any port 67 or port 68 or port 69":
12:39:32.902350 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:9e (oui Unknown), length 548
12:39:32.902350 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:9e (oui Unknown), length 548
12:39:32.902714 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 300
12:39:32.902712 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 300
12:39:32.929055 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 331
12:39:32.929055 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 331
12:39:32.930964 IP 192.168.0.18.ah-esp-encap > undercloud.localdomain.tftp: 30 RRQ "undionly.kpxe" octet tsize 0
12:39:32.931080 IP 192.168.0.18.ah-esp-encap > undercloud.localdomain.tftp: 30 RRQ "undionly.kpxe" octet tsize 0
12:39:32.945874 IP 192.168.0.18.acp-port > undercloud.localdomain.tftp: 35 RRQ "undionly.kpxe" octet blksize 1456
12:39:32.945874 IP 192.168.0.18.acp-port > undercloud.localdomain.tftp: 35 RRQ "undionly.kpxe" octet blksize 1456
12:39:33.219670 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 302
12:39:33.219706 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 302
12:39:34.243014 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:5e (oui Unknown), length 548
12:39:34.243014 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:5e (oui Unknown), length 548
12:39:34.243589 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 300
12:39:34.243587 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 300
12:39:34.278999 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 331
12:39:34.278999 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 331
12:39:36.225804 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 302
12:39:36.225826 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 302
12:39:37.973809 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:88 (oui Unknown), length 396
12:39:37.973809 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:88 (oui Unknown), length 396
12:39:37.974516 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365
12:39:37.974516 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365
12:39:37.974655 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:37.974664 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:39.163206 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:96 (oui Unknown), length 396
12:39:39.163206 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:96 (oui Unknown), length 396
12:39:39.163643 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:39.163658 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:39.164144 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365
12:39:39.164144 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365
12:39:39.612137 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:9e (oui Unknown), length 396
12:39:39.612137 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:9e (oui Unknown), length 396
12:39:39.612672 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365
12:39:39.612672 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365
12:39:39.612942 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:39.612953 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:40.897747 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:5e (oui Unknown), length 396
12:39:40.897747 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:0a:f7:7f:24:5e (oui Unknown), length 396
12:39:40.898260 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:40.898274 IP undercloud.localdomain.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 327
12:39:40.898420 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365
12:39:40.898420 IP 192.168.0.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 365

> Nothing got appended to:
> /var/log/httpd/ipxe_vhost_access.log
> /var/log/httpd/ipxe_vhost_error.log
Ok, could you please check iPXE port (8088) with tcpdump as well?
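
Alongside tcpdump, the fetch itself can be classified from the undercloud. The following is only a minimal sketch, not part of any Ironic tooling: the function name `probe_ipxe_script` and its result labels are made up for illustration. It requests the iPXE script over HTTP, bypasses any configured proxy, and reports whether the connection was reset, refused, timed out, or returned something that does not look like an iPXE script.

```python
import socket
import urllib.error
import urllib.request


def probe_ipxe_script(url: str, timeout: float = 5.0) -> str:
    """Fetch an iPXE script URL and return a short label describing the outcome.

    Hypothetical helper for illustration; the labels are not Ironic output.
    """
    # An empty ProxyHandler keeps the request off any http_proxy setting,
    # since the provisioning network should be reached directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (404, 403, ...).
        return f"http-error-{exc.code}"
    except urllib.error.URLError as exc:
        # Failures while connecting or sending are wrapped in URLError.
        reason = exc.reason
        if isinstance(reason, ConnectionResetError):
            return "connection-reset"
        if isinstance(reason, ConnectionRefusedError):
            return "connection-refused"
        if isinstance(reason, socket.timeout):
            return "timeout"
        return f"unreachable: {reason}"
    except ConnectionResetError:
        # A reset while reading the response is raised unwrapped.
        return "connection-reset"
    except socket.timeout:
        return "timeout"
    # A valid iPXE script starts with the "#!ipxe" shebang line.
    return "ok" if body.lstrip().startswith("#!ipxe") else "unexpected-body"


if __name__ == "__main__":
    print(probe_ipxe_script("http://192.168.0.1:8088/inspector.ipxe"))
```

A result of `ok` from the undercloud combined with a reset on the node would point away from httpd and toward something between the node and the server, which is what the tcpdump on port 8088 is meant to show.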
For the record, I've also done a successful introspection on our hardware lab with the latest puddle. I think it must be something subtle about your environment. CC'ing Dan in case he has any ideas.
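
When comparing environments, it can help to condense a dnsmasq journal excerpt like the one quoted earlier into per-node counts. This is a rough sketch under the assumption that the lines follow the `DHCPxxx(interface) [ip] mac` layout seen in this thread; `summarize_dhcp` is a hypothetical name, not an existing tool.

```python
import re
from collections import Counter, defaultdict

# Matches e.g. "DHCPACK(br-ctlplane) 192.168.0.103 00:0a:f7:7f:24:88"
# and "DHCPDISCOVER(br-ctlplane) 00:0a:f7:7f:24:88" (no IP on DISCOVER).
_DHCP_RE = re.compile(
    r"(?P<msg>DHCP[A-Z]+)\((?P<iface>[^)]+)\)\s+"
    r"(?:(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+)?"
    r"(?P<mac>(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})"
)


def summarize_dhcp(lines):
    """Return {mac: {message_type: count}} for dnsmasq-dhcp log lines."""
    summary = defaultdict(Counter)
    for line in lines:
        match = _DHCP_RE.search(line)
        if match:
            summary[match.group("mac").lower()][match.group("msg")] += 1
    return {mac: dict(counts) for mac, counts in summary.items()}


if __name__ == "__main__":
    import sys

    # Example: journalctl -u openstack-ironic-inspector-dnsmasq | python this_script.py
    for mac, counts in sorted(summarize_dhcp(sys.stdin).items()):
        print(mac, counts)
```

A node that shows many DHCPREQUEST/DHCPACK pairs in a short window, as several MACs do in the excerpt, is repeating DHCP, which would be consistent with the `retry_dhcp` loop in the iPXE script; the summary alone does not say why.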

Alexander - could you please add undercloud.conf to bug? Thank you. Also, in looking at the attached sosreport it looks like the undercloud is using 192.168.0.1 (see below) on provisioning net, yet the nodes are trying to access 192.0.2.1? Perhaps the sosreport is out of sync?

sosreport-undercloud.localdomain-20161006192524 bfournie$ cat ip_addr
1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever
1: lo inet6 ::1/128 scope host \ valid_lft forever preferred_lft forever
2: eth0 inet 10.19.184.221/24 brd 10.19.184.255 scope global dynamic eth0\ valid_lft 81936sec preferred_lft 81936sec
2: eth0 inet6 2620:52:0:13b8:5054:ff:fe3e:c994/64 scope global noprefixroute dynamic \ valid_lft 2591945sec preferred_lft 604745sec
2: eth0 inet6 fe80::5054:ff:fe3e:c994/64 scope link \ valid_lft forever preferred_lft forever
3: eth1 inet6 fe80::5054:ff:fe18:6679/64 scope link \ valid_lft forever preferred_lft forever
5: br-ctlplane inet 192.168.0.1/24 brd 192.168.0.255 scope global br-ctlplane\ valid_lft forever preferred_lft forever
5: br-ctlplane inet6 fe80::5054:ff:fe18:6679/64 scope link \ valid_lft forever preferred_lft forever

Created attachment 1209621 [details]
undercloud.conf from the undercloud node
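
The mismatch check described above (the URL the nodes fetch versus the addresses actually configured on the undercloud) can be sketched as follows. This is an illustration only: `local_ipv4_interfaces` and `url_host_is_local` are hypothetical helpers, and the sketch assumes `ip -o addr`-style text as found in the sosreport and a URL that uses a literal IPv4 address.

```python
import ipaddress
import re
from urllib.parse import urlparse


def local_ipv4_interfaces(ip_addr_output: str):
    """Map interface name -> list of IPv4 interfaces from `ip -o addr` text."""
    found = {}
    # "inet" followed by whitespace skips the "inet6" lines.
    for match in re.finditer(r"^\d+:\s+(\S+)\s+inet\s+(\S+)", ip_addr_output, re.MULTILINE):
        name, cidr = match.group(1), match.group(2)
        found.setdefault(name, []).append(ipaddress.ip_interface(cidr))
    return found


def url_host_is_local(url: str, ip_addr_output: str) -> bool:
    """True if the URL's host is an address configured on this machine.

    Assumes the host part is a literal IP; a DNS name raises ValueError.
    """
    host = ipaddress.ip_address(urlparse(url).hostname)
    interfaces = local_ipv4_interfaces(ip_addr_output)
    return any(host == iface.ip for ifaces in interfaces.values() for iface in ifaces)


if __name__ == "__main__":
    sample = (
        "1: lo    inet 127.0.0.1/8 scope host lo\n"
        "5: br-ctlplane    inet 192.168.0.1/24 brd 192.168.0.255 scope global br-ctlplane\n"
    )
    print(url_host_is_local("http://192.168.0.1:8088/inspector.ipxe", sample))
    print(url_host_is_local("http://192.0.2.1:8088/inspector.ipxe", sample))
```

With the addresses from the attached sosreport, the 192.0.2.1 URL from the bug summary would not match any local address, which is the discrepancy being asked about here.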

Bob,
The details were taken from a setup that also hits the same bug.
This setup uses 192.168.0.0/24 (instead of 192.0.2.0) for provisioning net.

Created attachment 1209628 [details]
the requested tcpdump

(In reply to Alexander Chuzhoy from comment #21)
> Bob,
> The details were taken from a setup that also hits the same bug.
> This setup uses 192.168.0.0/24 (instead of 192.0.2.0) for provisioning net.

Where are the baremetal environments where this problem was reproduced? I'm wondering if they have something in common (same hardware, same Ethernet switches, etc.).

I see the issue on a setup consisting of Dell PowerEdge R320.
Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe

We also have PowerEdge R320, so the nodes are probably not to blame. Not sure about the switch though.

After switching to a newer version of RHEL 7.3 on the undercloud, I don't reproduce the issue.

Thanks Sasha, let us assume that it was some bug in RHEL. Please feel free to reopen if you hit it again.

A very slow undercloud can cause this issue.