Bug 2189428
Summary: | CentOS Stream 8 LXD and cloud images taking 2 minutes to get IP | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Lucas <lucasbaile14> | ||||
Component: | systemd | Assignee: | systemd-maint | ||||
Status: | CLOSED DUPLICATE | QA Contact: | Frantisek Sumsal <fsumsal> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | CentOS Stream | CC: | bstinson, jwboyer, mosvald, systemd-maint-list | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2023-04-25 09:15:03 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Thank you for reporting this problem. Checking the attached log, at first glance there is no apparent issue with dhclient itself or any dhcp server side issue (getting from DHCPDISCOVER to DHCPACK is very fast): ~~~ Apr 20 15:54:21 test-lxd network[467]: Bringing up interface eth0: Apr 20 15:54:21 test-lxd dhclient[630]: Created duid "\000\004\337\252\266\0058}C\201\204}L\255\204\034SK". Apr 20 15:54:21 test-lxd dhclient[630]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8 (xid=0xca1db832) Apr 20 15:54:21 test-lxd dhclient[630]: DHCPREQUEST on eth0 to 255.255.255.255 port 67 (xid=0xca1db832) Apr 20 15:54:21 test-lxd dhclient[630]: DHCPOFFER from 10.155.213.1 Apr 20 15:54:21 test-lxd dhclient[630]: DHCPACK from 10.155.213.1 (xid=0xca1db832) Apr 20 15:54:23 test-lxd NET[658]: /usr/sbin/dhclient-script : updated /etc/resolv.conf Apr 20 15:54:23 test-lxd dbus-daemon[468]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.8' (uid=0 pid=660 comm="hostnamectl --transient " label="unconfined") Apr 20 15:54:23 test-lxd systemd[1]: Starting Hostname Service... Apr 20 15:54:23 test-lxd dbus-daemon[468]: [system] Successfully activated service 'org.freedesktop.hostname1' Apr 20 15:54:23 test-lxd systemd[1]: Started Hostname Service. Apr 20 15:54:23 test-lxd dhclient[630]: bound to 10.155.213.114 -- renewal in 1754 seconds. Apr 20 15:54:23 test-lxd network[611]: Determining IP information for eth0... done. Apr 20 15:54:23 test-lxd network[467]: [ OK ] ~~~ On the other hand, I see the delay seems to arise earlier here: ~~~ Apr 20 15:52:51 test-lxd systemd-udevd[437]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. Apr 20 15:52:51 test-lxd systemd[1]: Started udev Coldplug all Devices. Apr 20 15:52:51 test-lxd systemd-udevd[438]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. ... note the delay happens here: Apr 20 15:54:21 test-lxd systemd[1]: systemd-journal-flush.service: start operation timed out. Terminating. Apr 20 15:54:21 test-lxd systemd[1]: systemd-journal-flush.service: Main process exited, code=killed, status=15/TERM Apr 20 15:54:21 test-lxd systemd[1]: systemd-journal-flush.service: Failed with result 'timeout'. ~~~ I found a similar problem on the net: https://unix.stackexchange.com/questions/740162/systemd-journal-flush-service-failed-with-result-timeout-failed-to-start-flush There are already opened BZs for this for RHEL7 (WONTFIX) & RHEL8: RHEL7 BZ: [Bug 1649857 - systemd-journal-flush.service fails in timeout causing boot to be delayed by 1min30] https://bugzilla.redhat.com/show_bug.cgi?id=1649857 RHEL8 BZ: [Bug 2174645 - Failed to start Flush Journal to Persistent Storage (systemd-journal-flush.service)] https://bugzilla.redhat.com/show_bug.cgi?id=2174645 When looking at the state of RHEL8 BZ I can see it is targeted to get fixed in RHEL8.8. There is also a comment asking about CentOS 8 fix first: https://bugzilla.redhat.com/show_bug.cgi?id=2174645#c26 Please, refer to the information shared in those links and BZs. Given the information above, I am going to change the component of this to systemd and close it as a duplicate of bug 2174645. *** This bug has been marked as a duplicate of bug 2174645 *** Thanks! I was totally barking at the wrong tree here due to this behavior affecting initially the networking, but I can see now that it's on an earlier point. I'll play around with the workaround from stackexchange to see if that's feasible for our flow, and will also keep an eye on this bug 2174645 Appreciate the help and fast reply! |
Created attachment 1959737 [details] Logs from the fresh LXD container Description of problem: We use centos stream 8 images on both LXD containers and on cloud VMs (Hetzner mostly in our case), and the base image currently seems to have an issue where its taking up to 2 minutes to finish the network startup. The issue seems related to the dhclient startup, but I might be wrong. I've attached all the logs from a fresh LXD container startup where the issue can be seen clearly. The container boots at 15:52:50 and the logs simply stop at 15:52:51 until 15:54:21 where they eventually resume, resulting in a startup finishing around 15:54:23, where it finally has an accessible IP, and thus the network is finally working. To add a bit more on that: We are also using podman centos stream 8 images (quay.io/centos/centos:stream8) and these are not experiencing the same issue. Version-Release number of selected component (if applicable): How reproducible: It happens 100% of times I create LXD containers. Steps to Reproduce: The behavior is the same for our LXD containers and our cloud VMs. Just start a fresh container using a Centos Stream 8 base image and it takes around 1~2 minutes to startup its network. For the LXD scenario, the image being used is `images:centos/8-Stream`, like this: ``` lxc launch images:centos/8-Stream test-lxd ``` Actual results: The network service takes 2 minutes on average to startup and get any IP Expected results: In the previous version (and other similar images) that usually happens withing a few short seconds