Description of problem:
Right now, DNS requests always get "hijacked" by OVN, and if a static record matches, OVN responds with the static entry. If no static record matches, instances without connectivity to the "8.8.8.8" DNS server that is the default in the OVN DHCP packet cannot resolve DNS [0]. It is not possible to directly use a DNS service other than the one provided by OVN (such as Designate in OpenStack). The goal of this BZ is to ask for a configuration parameter (or similar) that would allow DNS requests to bypass OVN's resolver and be forwarded instead. This is important in OpenStack, as it is the behavior Neutron is expected to provide [1].

Version-Release number of selected component (if applicable):

How reproducible:
100%

Actual results:
Instances using OVN networking cannot use the host networking for DNS resolution.

Expected results:
Instances using OVN networking have DNS resolution requests forwarded to the host.

Additional info:
[0] OpenStack upstream bug with example: https://bugs.launchpad.net/neutron/+bug/1902950
[1] DNS resolution for instances: https://docs.openstack.org/neutron/latest/ovn/gaps.html
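For context, the static records mentioned above live in the northbound DNS table and are attached to a logical switch through its dns_records column; a minimal illustration (the switch name, hostname, and address are made up):

  # Create a static DNS record set and attach it to a logical switch:
  dns_uuid=$(ovn-nbctl create DNS records='"vm1.example.org"="192.168.1.10"')
  ovn-nbctl add Logical_Switch ls1 dns_records $dns_uuid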
Hi Elvira, I'm a bit unclear on what the desired behavior from OVN is here. Reading the description, it sounds like the problems described may be fixed through configuration. First, the 8.8.8.8 DNS server that OVN offers via DHCP is based on a configuration option; OVN does not offer a DNS server in its DHCP replies by default. A northbound Logical_Switch_Port can have its "dhcp_options:dns_server" set to an alternative. This way, you can apply alternative servers for those ports that cannot reach 8.8.8.8. Second, if you don't want OVN hijacking the DNS requests, the Logical_Switch can have its "dns_records" column empty. This way, OVN will not try to answer the DNS requests from ports on this logical switch. Unfortunately, this is applied to the whole switch, so if you want different behaviors for different ports on the same logical switch, that would have to be an enhancement. Can you clarify if either of the configuration options I described would fix the issue?
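To make the two options concrete, here is a minimal ovn-nbctl sketch (the port name, switch name, and addresses are placeholders):

  # Option 1: offer an alternative DNS server via the port's DHCPv4 options.
  dhcp_uuid=$(ovn-nbctl create DHCP_Options cidr=192.168.1.0/24 \
      options='"server_id"="192.168.1.1" "server_mac"="c0:ff:ee:00:00:01" "lease_time"="3600" "dns_server"="192.168.1.53"')
  ovn-nbctl lsp-set-dhcpv4-options vm-port1 $dhcp_uuid

  # Option 2: stop OVN from answering DNS by emptying the switch's dns_records.
  ovn-nbctl clear Logical_Switch ls1 dns_records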
Hi Elvira, any idea what would be the proper way forward on this?
I think your second option, clearing the dns_records column, might be good for this situation. I have a meeting tomorrow where I will discuss this option with other team members to see whether modifying this per switch gives us enough granularity. Thanks a lot!
Hi Elvira. How did this end up going? Are there any actions required in OVN?
So we looked into what would be needed to make this work on the Neutron side, and although we can change the DNS server as you said, that doesn't solve the fact that isolated VMs won't be able to reach that DNS server. We still think this could be addressed by adding a feature to OVN that allows forwarding DNS queries to resolvers running on the same node as the VM (in OpenStack, known as compute nodes). If we fix this on the Neutron side, we will have to build an external agent that connects the VM to the host network, and it would have to be connected to the SB DB too. This would affect performance at scale, and therefore we are not sure we should follow that path. At the next upstream gathering we are asking the community whether this configuration support would be important to them even with that kind of impact. But if we allowed this directly from core OVN, performance wouldn't be affected like that. Furthermore, Daniel commented that ovn-k8s might also benefit from this configuration setting. Maybe it is worth checking!
Thanks for the update. From what I understand, the idea is that DNS queries (e.g. udp.dst == 53) should be forwarded to a specific port on the logical switch. This port, presumably a localnet port, would result in the DNS query being sent to a DNS cache on the local compute node. This can't be done with ACLs, since they only allow an "allow/drop" verdict rather than redirection to an arbitrary switch port. Would that satisfy the requirements?
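For reference, the closest an ACL gets today is matching the traffic; the verdict set has no redirect option (the switch name below is hypothetical):

  # ACLs can match DNS traffic, but the verdict is limited to
  # allow/allow-related/allow-stateless/drop/reject/pass:
  ovn-nbctl acl-add ls1 from-lport 1000 'udp.dst == 53' allow
  # There is no "forward to port X" verdict, which is what redirecting
  # queries to a local DNS cache would require.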
@Mark, you mean the localnet port would somehow be attached to the host network? I wonder if the following would work: introduce a controller() action that would be triggered on udp.dst == 53 for ports / switches tagged accordingly, where the controller action handler would use the standard libresolv mechanism from libc to resolve names, then form a reply and send it back into the queue.
@Mark, in your scheme, is the "localnet port" of some special type? (Perhaps it's localport, not localnet?)
Sorry for the delayed response. I was using "localnet" as an example, not necessarily to be prescriptive. If it were a localport instead, that's valid, too.
I believe this was discussed by the OVN core team, and the decision was to clarify why Designate cannot rely on an agent running on a localport on each chassis (similar to the metadata agent) that would poke a hole into the host networking realm (a unix socket?) to reply to requests using the host's DNS functionality. Since OVN (presumably) handles requests using its own DB first, then passes them on (to the localport in this scenario), the agent won't need access to the OVN DB and will only have to handle requests for hostnames unknown to OVN. I will let the OVN folks clarify if I'm missing something.
Looking at this bz again.

===

Elvira wrote,

> If we fix this on the Neutron side, we will have to make an external agent that connects the VM to the host network, and it would have to be connected to the SB DB too

I don't think this is true (but I'm very open to being corrected). Why would the agent require access to the SB DB? The agent would run on a localport, and for every DNS request it would, through a unix socket of some sort, access the hypervisor resolver. (Meaning we'd also need a component running on the hypervisor side to serve replies to the unix socket requests.) Where is the need to access the SB DB in this scenario?

===

Also of note: if we allow OVN controller to resolve the request using its own /etc/resolv.conf configuration, then another question to consider is whether you always want resolution to use OVN controller's resolver. Note that OVN controller may itself run in a separate network namespace (e.g. in a Kubernetes environment).
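For what it's worth, a rough sketch of how such an agent could be wired up on a chassis, following the same pattern the metadata agent uses to bind its localport (every name and address below, and the choice of dnsmasq, are hypothetical, not an agreed design):

  # Bind an OVS internal port to the DNS localport (iface-id must match
  # the Logical_Switch_Port name) and move it into its own namespace:
  ovs-vsctl -- add-port br-int dns0 -- set Interface dns0 type=internal \
      external_ids:iface-id=dns-localport-ls1
  ip netns add dns-agent
  ip link set dns0 netns dns-agent
  ip netns exec dns-agent ip addr add 169.254.169.53/24 dev dns0
  ip netns exec dns-agent ip link set dns0 up
  # Run a forwarder in the namespace that relays queries to a host-side
  # resolver (the unix socket suggested above would be an alternative
  # transport); dnsmasq is just one possible implementation:
  ip netns exec dns-agent dnsmasq --no-resolv --no-hosts \
      --server=192.0.2.53 --listen-address=169.254.169.53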
I already talked about it with Ihar, but just to have this on the BZ thread: I was probably wrong, since my only knowledge of this came from reading the documentation to understand what we would need. If we come up with a nice design for an agent, I think this core OVN BZ won't be needed.
I'm closing this since it seems like this will not require any changes in OVN.