Bug 1666612
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | Rules "uname -p" and "systemd-detect-virt" kill the system boot time on large systems | | |
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> |
| Component: | systemd | Assignee: | Jan Synacek <jsynacek> |
| Status: | CLOSED ERRATA | QA Contact: | Frantisek Sumsal <fsumsal> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.6 | CC: | aghadge, bbreard, bubrown, cgoncalves, cww, hannsj_uhl, hmatsumo, iwienand, jniu, jonathon.parrish, jsynacek, kwalker, msekleta, naresh.sukhija_ext, pdancak, pdwyer, pete.perfetti, qe-baseos-daemons, rmadhuso, samukawa-oxa, tatsu-ab1, toneata |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | systemd-219-64.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1701230 (view as bug list) | Environment: | |
| Last Closed: | 2019-08-06 12:43:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1655750, 1701230, 1707454 | | |
Description — Renaud Métrich, 2019-01-16 08:19:32 UTC
So, in theory, couldn't we write a udev rule which always runs first, runs "systemd-detect-virt" once, and caches the result into an environment variable?

---

Comment 6 (Michal Sekletar):

I guess we need to split the rule into two checks, because the current rule (where a logical AND applies among the conditions) can be triggered by all memory event rules and by add events on other subsystems. I think we need to change the rules like so:

```
SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
ACTION!="add", GOTO="memory_hotplug_end"
```

---

Good catch! Will give it a try this afternoon.

---

Comment 8 (Renaud Métrich):

Patch:

```diff
# diff -u /lib/udev/rules.d/40-redhat.rules.orig /lib/udev/rules.d/40-redhat.rules
--- /lib/udev/rules.d/40-redhat.rules.orig	2019-01-16 10:59:24.813058034 +0100
+++ /lib/udev/rules.d/40-redhat.rules	2019-01-16 13:12:20.082219918 +0100
@@ -4,7 +4,8 @@
 SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
 
 # Memory hotadd request
-SUBSYSTEM!="memory", ACTION!="add", GOTO="memory_hotplug_end"
+SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
+ACTION!="add", GOTO="memory_hotplug_end"
 PROGRAM="/bin/uname -p", RESULT=="s390*", GOTO="memory_hotplug_end"
 
 ENV{.state}="online"
```

Result:

```
# journalctl -b | grep -w PROGRAM | awk -F "'" '{ print $2 }' | sort | uniq -c
      1 /bin/sh -c
      8 /bin/systemd-detect-virt
      8 /bin/uname -p
```

This is far better, but not yet fully optimized (we should only have one call). Unfortunately, my udev knowledge is limited and I couldn't find any example of triggering the program only once, storing the result in a variable, and then relying on that variable.

---

Comment 9 (Jan Synacek):

It appears that I misunderstood how ENV{key} works. It is bound to a device, which means it cannot be used globally (across devices) as a cache.
And I'm not aware of anything like ENV that works globally.

---

(In reply to Jan Synacek from comment #9)
> It appears that I misunderstood how ENV{key} worked. It is bound to a
> device, which means it cannot be used globally (across devices) as a cache.
> And I'm not aware of anything like ENV, but which works globally.

This is correct. ENV is actually per-event: once the event handling is completed, the ENV values update the udev database entry of the event device.

---

Comment 11 (Renaud Métrich):

What about creating a udev builtin for "uname -p" and "systemd-detect-virt"? I guess we could execute the command once (or do the same as C code) and then cache the value.

---

(In reply to Renaud Métrich from comment #11)
> What about creating a udev builtin for "uname -p" and "systemd-detect-virt" ?
> I guess we could execute the command once (or do the same as C code) then
> cache the value.

A more general version would be to create a global "cache-and-env-like" variable that could be assigned a defined value, or the output of an external command, just once. There would probably have to be an additional operation that answers the question "does a cache for <foo> exist?". Something like this:

```
CACHE{virt}?, GOTO="end_label"
PROGRAM="systemd-detect-virt", DO_SOMETHING..., CACHE{virt}="$RESULT"
```

---

Comment 13 (Kyle Walker):

@Jan, I was wondering about that, but it looks like there is some precedent in the code for a global environment set around the "properties_list".
```c
static struct udev_ctrl_connection *handle_ctrl_msg(struct udev_ctrl *uctrl) {
        <snip>
        str = udev_ctrl_get_set_env(ctrl_msg);
        if (str != NULL) {
                char *key;

                key = strdup(str);
                if (key != NULL) {
                        char *val;

                        val = strchr(key, '=');
                        if (val != NULL) {
                                val[0] = '\0';
                                val = &val[1];
                                if (val[0] == '\0') {
                                        log_debug("udevd message (ENV) received, unset '%s'", key);
                                        udev_list_entry_add(&properties_list, key, NULL);
                                } else {
                                        log_debug("udevd message (ENV) received, set '%s=%s'", key, val);
                                        udev_list_entry_add(&properties_list, key, val);
                                }
                        } else {
                                log_error("wrong key format '%s'", key);
                        }
                        free(key);
                }
                worker_kill(udev);
        }
        <snip>
```

I haven't had time to put a proof-of-concept together, but I was thinking of implementing a global-rule pass. Something like the following paths:

```c
<snip>
static const char* const runonce_rules_dirs[] = {
        "/etc/udev/runonce.d",
        "/run/udev/runonce.d",
        UDEVLIBEXECDIR "/runonce.d",
        NULL};
<snip>
```

This would be where we could put a rule such as gathering virtualization and uname information, as well as just about any other single-check behaviour, and propagate the environment variables to the rest of udev via properties_list. You get to utilize the attributes in the individual workers, and it would be in keeping with the global attribute behaviour that is already there.

If the team agrees that this is a decent approach, I could work on the POC and post upstream. This would all be contingent on upstream buy-in, of course...

- Kyle Walker

---

Comment 14 (Jan Synacek):

(In reply to Kyle Walker from comment #13)
> <snip>
> static const char* const runonce_rules_dirs[] = {
>         "/etc/udev/runonce.d",
>         "/run/udev/runonce.d",
>         UDEVLIBEXECDIR "/runonce.d",
>         NULL};
> <snip>

I think that I understand the idea. The question is: isn't this too complicated for what you're trying to achieve? I'm not really sure how to answer that question. My guess is that if you want to do it in a special file (or files), one file should do it, as it would just state what gets cached and how.
That way, you wouldn't introduce an entire directory structure just for the cache. But I'm not sure. I guess we should first gather some data on where in the system this would actually be used and what improvements it would bring.

---

(In reply to Jan Synacek from comment #14)
> I think that I understand the idea. The question is: isn't this too
> complicated for what you're trying to achieve? I'm not really sure how to
> answer that question. My guess is that if you want to do it in a special
> file(s), one file should do it, as it would just state what gets cached and
> how. That way, you wouldn't introduce an entire directory structure just for
> the cache.

That is fair, and I think the safer path to go down. We can ship/maintain just a single file (UDEVLIBEXECDIR "runonce.rules").

> But, I'm not sure. I guess we should first gather some data of where in the
> system this would actually be used and what improvements would it bring.

After looking at the udev rules we have today, I think these two lines would be the largest consumers. It would be simple enough to just make architecture and virtualization information available in the context of rules: gather that information once during udevd initialization, and then add "VIRTUALIZATION" and "ARCHITECTURE" key names that can be used later. That only addresses the current situation, though, and doesn't really give us a path to address these types of issues in the future, short of adding more keys. This way we would give ourselves, and 3rd parties, a path to omit PROGRAM calls in match criteria for every device of that type where they aren't necessary.

- Kyle Walker

---

*** Bug 1668982 has been marked as a duplicate of this bug. ***

---

Does this bug need to be Red Hat only? It doesn't seem to include any particular customer information.
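---

If such keys existed, the s390 check in 40-redhat.rules could drop its per-event fork entirely. A purely hypothetical before/after — the ARCHITECTURE match key below is the proposal from this thread, not implemented syntax (upstream udev later gained a similar facility in the CONST{arch}/CONST{virt} match key):

```
# Today: forks /bin/uname for every matching memory add event
PROGRAM="/bin/uname -p", RESULT=="s390*", GOTO="memory_hotplug_end"

# Proposed: matches against a value gathered once at udevd startup
ARCHITECTURE=="s390*", GOTO="memory_hotplug_end"
```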
There is discussion in https://bugs.centos.org/view.php?id=15735, but people cannot see the progress on this bug.

---

(In reply to Michal Sekletar from comment #6)
> I think we need to change the rules like so,
>
> SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
> ACTION!="add", GOTO="memory_hotplug_end"

(pasting from the centos bug) I can confirm that splitting the rule works as intended (http://paste.openstack.org/show/743429/). http://paste.openstack.org/show/743427/ is an example of udev testing that previously triggered this path, but doesn't with the change. I also regenerated the initramfs with this, and the runtime is about what it was before; i.e. there is no longer a regression:

```
[00:00:06.613] Welcome to CentOS Linux 7 (Core) dracut-033-554.el7 (Initramfs)!
[00:00:35.658] Welcome to CentOS Linux 7 (Core)!
```

---

(In reply to Renaud Métrich from comment #8)
> # journalctl -b | grep -w PROGRAM | awk -F "'" '{ print $2 }' | sort | uniq -c
>       1 /bin/sh -c
>       8 /bin/systemd-detect-virt
>       8 /bin/uname -p
> This is far better, but not yet fully optimized (we should only have 1 call).

This about makes sense, though, as it triggers once for each of the memory devices (e.g. https://bugzilla.redhat.com/attachment.cgi?id=1522941)?

---

Fix merged to staging branch -> https://github.com/lnykryn/systemd-rhel/pull/280 -> POST

---

*** Bug 1658839 has been marked as a duplicate of this bug. ***

---

Just checking when the QA approvals are expected to be completed; I have observed the ON_QA status since 27th March 2019.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2091