Bug 859467
Summary: | All VM's lost their routing tables | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Jean-Tsung Hsiao <jhsiao> |
Component: | openvswitch | Assignee: | Thomas Graf <tgraf> |
Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 17 | CC: | chrisw, markmc, rkhan, tgraf |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2012-10-08 13:33:27 UTC | Type: | Bug |
Description
Jean-Tsung Hsiao
2012-09-21 15:52:24 UTC
Referring to the "Actual results" above, I just ran an experiment, and the time to reproduce can be as short as 30 minutes.

Experiment:

"ovs-vsctl show" indicated that vnet2 and vnet3 had no tags at all, and their corresponding VMs were pinging each other. As an experiment, I ran "ovs-vsctl del-port ovsbridge0 vnet2" and "ovs-vsctl del-port ovsbridge0 vnet3", then added them back with tag=20. Initially, both corresponding VMs were pinging each other. But after about 30 minutes, pings failed with "network unreachable".

(In reply to comment #1)
> Referring to the "Actual results" above, I just ran an experiment, and the time
> to reproduce can be as short as 30 minutes.
>
> Experiment:
>
> "ovs-vsctl show" indicated that vnet2 and vnet3 had no tags at all,
> and their corresponding VMs were pinging each other.

Just making sure I understand this correctly: after about 30 minutes, "ovs-vsctl show" no longer lists the tag? And this only happens for vnet2 and vnet3? Is this correct?

The original issue happened to all four of them. Then I used virt-manager to "shut off and then run" the two VMs corresponding to vnet2 and vnet3. After that, the two VMs were able to ping each other. NOTE: I left the other pair (vnet0 and vnet1) alone.

After I submitted the initial description, I ran "ovs-vsctl show" and found that vnet2 and vnet3 had no tags, while vnet0 and vnet1 still had tag=10. So I realized that if I deleted vnet2 and vnet3 and then added them back with tag=20, I could reproduce the issue. That's what I did: I deleted them, then added them back with tag=20. Initially, the VMs corresponding to the two interfaces were able to ping each other. But after about 30 minutes, pings failed and "netstat -r" showed an empty table.

Hopefully, you can get a better picture now.

Thanks!
Jean

Hi Thomas,

Both vnet2 and vnet3 still had tag=20 after 30 minutes, based on "ovs-vsctl show".

Thanks!
Jean

As I mentioned above, I left vnet0 and vnet1 alone last night. So their corresponding VMs had empty routing tables and pings failed. "ovs-vsctl show" showed that both had tag=10. As an experiment, I deleted them and added them back without tags. Then, on each corresponding VM, I ran ifdown and ifup. Bingo! Both route tables were back, and pings have been successful since then.

Another experiment: This experiment reproduced the missing route table issue right away.

* Originally, vnet0 was added to ovsbridge0 without VLAN tagging. Below is the IP route table of the corresponding VM:

    [root@test1232 ~]# ip route
    10.10.8.0/22 dev eth0 proto kernel scope link src 10.10.10.232
    169.254.0.0/16 dev eth0 scope link metric 1002
    default via 10.10.11.254 dev eth0

* Ran "ovs-vsctl del-port ovsbridge0 vnet0"
* Ran "ovs-vsctl add-port ovsbridge0 vnet0 tag=10"
* Ran "ifdown eth0"
* Ran "ifup eth0". This failed, as "ping 10.10.11.254" failed.
* "ip route" returned empty.

I found out that the loss of the IP routes was related to DHCPREQUEST failures; each VM was configured using DHCP. The log indicated that each VM sent out a DHCPREQUEST about every 11 minutes to renew its lease. Without tagging, the renewal succeeded every time. But once the VLAN tag was turned on, the next renewal would fail. This could be because the ACKs from the DHCP server were dropped with the tag on.

Note: The switch is likely not set up to handle tagging.

Configuring the VMs with the "--bootproto static" network option will eliminate the issue.

The problem is related to VLAN-tagged frames being dropped on the way to the DHCP server.
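
For anyone trying to reproduce this, the delete-and-re-add step from the experiments in this report can be wrapped in a small shell helper. This is only a sketch under assumptions: the function name `retag_port` and the `OVS_VSCTL` override variable are hypothetical conveniences and not part of Open vSwitch. The bridge and port names are the ones used in this report.

```shell
#!/bin/sh
# Sketch: move an existing OVS port onto a VLAN (or back to untagged) by
# deleting it and adding it again, as done by hand in this report.
# retag_port and OVS_VSCTL are hypothetical names used for illustration.

retag_port() {
    bridge=$1   # e.g. ovsbridge0
    port=$2     # e.g. vnet0
    tag=$3      # VLAN id, or an empty string to re-add the port untagged

    # Allow the command to be overridden (useful for a dry run).
    cmd=${OVS_VSCTL:-ovs-vsctl}

    "$cmd" del-port "$bridge" "$port" || return 1
    if [ -n "$tag" ]; then
        "$cmd" add-port "$bridge" "$port" "tag=$tag"
    else
        "$cmd" add-port "$bridge" "$port"
    fi
}
```

Running `retag_port ovsbridge0 vnet0 10` on the host, followed by "ifdown eth0", "ifup eth0" and "ip route" inside the guest, corresponds to the steps listed in the report. Setting `OVS_VSCTL=echo` first prints the commands instead of executing them.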