| Summary: | slow boot for firewalld | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Artem <artem.goncharov> |
| Component: | firewalld | Assignee: | Thomas Woerner <twoerner> |
| Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 23 | CC: | artem.goncharov, extras-qa, jpopelka, jsynacek, lionghostshop, oliver.henshaw, twoerner |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | firewalld-0.4.0-2.fc23 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1112924 | Environment: | |
| Last Closed: | 2016-12-20 17:30:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | |||
|
Description
Artem
2015-12-31 11:58:33 UTC
Please add the following lines to the firewalld.service file in /usr/lib/systemd/system/firewalld.service: After=syslog.target After=polkit.service After=abrtd.service These lines make sure that the start time of polkit, syslog and abrtd are not added to the start time of firewalld. Please report back if this is helping in your case and please attach the output of "systemd-analyze dump" There is also something strange in your log: NetworkManager is started at the same time as firewalld. But there is Before=NetworkManager.service in the firewalld.service file, which seems to be ignored. Please ignore the comment about NetworkManager.service. Seems to have a positive effect, at least now it's just 5s
[gtema@nuc ~]$ systemd-analyze blame
11.102s vncserver@:0.service
8.819s postgresql.service
7.508s packagekit.service
7.497s dev-sda2.device
7.445s systemd-udev-settle.service
5.233s chronyd.service
5.122s firewalld.service
5.101s systemd-logind.service
5.100s upower.service
5.048s avahi-daemon.service
4.991s rsyslog.service
4.973s rtkit-daemon.service
3.868s abrtd.service
3.547s wpa_supplicant.service
3.510s systemd-journal-flush.service
3.241s accounts-daemon.service
2.851s systemd-udevd.service
2.005s cups.service
1.876s polkit.service
1.570s abrt-ccpp.service
1.173s plymouth-start.service
1.069s systemd-fsck
1.003s dmraid-activation.service
Created attachment 1111490 [details]
systemd-analyze dump
firewalld, polkit and abrt are all dbus activated. It doesn't really matter if you make the ordering explicit (by using the After=), or leave it implicit. What you did is that you made the abrt and polkit services *explicitly* start *before* firewalld. Then, you just compare "boot times" of the firewalld itself. Because the original time line doesn't show polkit and abrt, it's not directly visible. Please compare the times of all three services summed up. That should be, in both cases, roughly the same. Maybe on old ha ... rdware, with less CPUs for example, the overall boot time might be larger. Booted today I have the following blame:
[gtema@nuc ~]$ systemd-analyze blame
11.265s firewalld.service
9.779s accounts-daemon.service
8.264s systemd-udev-settle.service
8.123s dev-sda2.device
7.553s postgresql.service
6.702s abrtd.service
4.463s packagekit.service
4.365s systemd-journal-flush.service
4.134s rsyslog.service
3.838s upower.service
3.091s systemd-udevd.service
2.399s polkit.service
1.599s systemd-tmpfiles-setup-dev.service
1.527s cups.service
1.281s colord.service
1.169s avahi-daemon.service
1.014s systemd-sysctl.service
914ms abrt-ccpp.service
...
So firewall+abrtd+polkit ~21s
firewalld.service still has
After=syslog.target
After=polkit.service
After=abrtd.service
It seems there is something else that is added to the start time of firewalld. Please add the output of "systemd-analyze dump" and "systemd-analyze plot" again It does not occur always, normally start time takes about 5s (what is still quite long). But finally I got again ~10s Created attachment 1114935 [details]
systemd-analyze dump
Created attachment 1114936 [details]
systemd-analyze plot
firewalld-0.4.0-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-fc0691e6a7 firewalld-0.4.0-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-fc0691e6a7 firewalld-0.4.0-2.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-fc0691e6a7 firewalld-0.4.0-2.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-fc0691e6a7 not sure it is better now:
[gtema@nuc system]$ systemd-analyze
Startup finished in 727ms (kernel) + 2.705s (initrd) + 44.387s (userspace) = 47.820s
[gtema@nuc system]$ systemd-analyze blame
12.633s firewalld.service
11.311s systemd-udev-settle.service
10.251s abrtd.service
9.977s dev-sda2.device
9.497s accounts-daemon.service
7.876s postgresql.service
6.650s systemd-journal-flush.service
5.164s packagekit.service
5.015s systemd-journald.service
4.575s avahi-daemon.service
4.527s rsyslog.service
4.226s rtkit-daemon.service
4.199s upower.service
3.763s systemd-udevd.service
3.476s cups.service
2.564s polkit.service
2.517s dnf-makecache.service
1.935s systemd-fsck
1.685s sshd.service
1.398s wpa_supplicant.service
1.387s systemd-tmpfiles-setup-dev.service
1.337s udisks2.service
1.280s abrt-ccpp.service
1.212s chronyd.service
1.173s systemd-rfkill
1.068s plymouth-start.service
1.049s systemd-sysctl.service
Created attachment 1123566 [details]
systemd-analyze dump
Created attachment 1123568 [details]
systemd-analyze plot
Firewalld is slower on F22 than it was in F21 on my hdd system. The way several services take more than ten seconds reminds me of the old journal flushing bugs, e.g. bug #1006386 and duplicates. Perhaps the lack of the readahead service is to blame for this? Either there's more random io swamping the system and slowing boot or something is getting delayed and other processes are blocking on it. A another ssd systemd, firewalld starts faster in a real boot and in VMS but still contributes more time than most other services. I expect the above update in F23 may help there, but haven't tested yet. Below is a bootchart from systemd-bootchart and accompanying systemd-analyze output. Note that systemd-bootchart is started as a service, as it seems to finish too early when started as init, but it still starts early enough to show activity during firewalld activation. There doesn't seem to be much iowait during firewalld activation afaict, though there is a lot of read activity all through boot. setroubleshootd looks suspicious, maybe? systemd-219-27.fc22.x86_64 firewalld-0.3.14.2-4.fc22.noarch Created attachment 1127976 [details]
output of systemd-bootchart showing process activity during boot
Created attachment 1127978 [details]
output of systemd-analyze plot showing corresponding service activity during boot
Created attachment 1128573 [details]
"strace -tt -T -C" of firewalld service launch during boot
Here's an strace of F22 firewalld startup during boot. Is there anything suspicious in there? I know it's not as useful as an strace from F23 but maybe there's still some unexpected waits?
As I observed in bug #1112924 comment #6, python startup is very slow. It takes around 12 seconds to reach stat("/usr/lib/python2.7/site-packages/firewall/config") when "from firewall import config" is the seventh line of /usr/sbin/firewalld. firewalld-0.4.0-2.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report. Not fixed according to comment #16. just confirm one more time, that the bug is NOT fixed Created attachment 1171998 [details] '/usr/bin/strace -tt -T' of firewalld service launch during boot (F23 SSD) So the firewalld statup time improved to about 2.3 seconds after fitting the machine with an SSD. This still seems a little on the high side. Upgrading to F23 shows no further improvement. Repeating the strace logging shows that firewalld still spends a high proportion of its startup in very early python imports. With # ls /etc/systemd/system/firewalld.service.d/ orderring.conf strace.conf # cat orderring.conf [Unit] Requires=polkit.service abrtd.service After=polkit.service abrtd.service Before=tuned.service # cat strace.conf [Service] ExecStart= ExecStart=/usr/bin/strace -tt -T -o /run/log/firewalld /usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS 'systemd-analyze blame' reports "3.653s firewalld.service" and the corresponding strace log is attached. I see 0.6 seconds between stat("/usr/bin/lib64/python3.4/os.py") and stat("/usr/lib64/python3.4/site-packages/dbus/...") and a further 0.4 seconds until stat("/usr/lib/python3.4/site-packages/firewall/config/...") So it takes about a second to reach the first firewalld-related import and "import os\n import sys" by itself takes over half a second. So based on this and comment #23, there's something in python startup that's heavily IO-bound on HDD, and possibly noticeably IO bound or CPU-bound on SSD. (In reply to Oliver Henshaw from comment #27) > So based on this and comment #23, there's something in python startup that's > heavily IO-bound on HDD, and possibly noticeably IO bound or CPU-bound on > SSD. Or perhaps it's not so simple. The CPU wait (overall or per-cpu) chart from bootchart is slightly misleading as it comes from /proc/schedstat and shows the total time spent waiting to run by all tasks - i.e. it includes times when another process is running. AIUI the iowait time from /proc/stat would be more useful here, probably the only things to worry about in bootchart are times when CPU utilisation is < 100% and wait-time is > 0%. So even though firewalld spends a lot of time waiting, other services like Xorg or abrt ancillary services are making progress in this time. If I order firewalld after polkit.service, abrtd.service, abrt-xorg.service, abrt-oops.service, abrt-ccpp.service and before sddm.service then some parts of early firewalld startup do happen much quicker but overall boot time doesn't change much. So I should relax all the ordering constraints (which I added to isolate and debug the problem) and let firewalld start in parallel with any other services and let the boot-time be as efficient as possible. A system with a faster CPU might become io-limited though, even with a SSD. This message is a reminder that Fedora 23 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 23. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '23'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 23 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed. |