Bug 846445
| Summary: | Updating rhc-node rpm takes over an hour on nodes with a lot of gears | | |
|---|---|---|---|
| Product: | OKD | Reporter: | Thomas Wiest <twiest> |
| Component: | Containers | Assignee: | Rob Millner <rmillner> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | libra bugs <libra-bugs> |
| Severity: | medium | Priority: | high |
| Version: | 2.x | CC: | jialiu, mfisher, mmcgrath |
| Keywords: | Triaged | Hardware: | Unspecified |
| OS: | Unspecified | Fixed In Version: | libra_ami #2027 |
| Doc Type: | Bug Fix | Type: | Bug |
| Last Closed: | 2012-09-17 21:29:42 UTC | | |
Description
Thomas Wiest
2012-08-07 20:07:39 UTC
Here's an example of a chcon that was run during the update:

chcon -t libra_var_lib_t -l s0:c2,c560 -R /var/lib/stickshift/088b2f0e4a764ee7a492254406d7f657/[^.]*

A patch went in today which should fix the issue where restorecon sets the wrong SELinux context.

The libra-tc script sets traffic control limits on the gears. It's unclear where or why it calls chcon; I'll have a look.

The rhc-node %post script runs rhc-restorecon to fix SELinux permissions in /var/lib/stickshift. libra-tc sets up traffic control. Both iterate over each gear and are likely to suffer when there is a large number of gears on the node. One thing worth looking at is whether these scripts even need to be run as a result of a simple update.

Created a node with 3000 gears and ran through the restarts in the node's %post by hand, with the following results:

cgconfig restart: fast, but wipes out libra-cgroups
libra-cgroups restart: 2021 seconds
libra-tc restart: 226 seconds
rhc-restorecon: 749 seconds
rhc-ip-prep: already controlled, didn't measure, takes a long time.

The conundrum is that changes to the cgroups and tc rules must take effect on upgrade, even on C9 nodes. Similarly, either rhc-restorecon or a fixfiles should be run if the SELinux file context policy (libra.fc) is updated.

Maybe the correct path is to touch /.autorelabel if libra.fc is new or has changed. rhc-restorecon does the wrong thing anyway; we should remove the automatic invocation.

Commit 34af28a removes rhc-restorecon, and only runs cgconfig/libra-cgroups and libra-tc if they are not already initialized.

Pull request: https://github.com/openshift/li/pull/265

Pull request merged.

Verified this bug with rhc-node-0.97.6-1.el6_3.x86_64.rpm, and PASS.

1. Start an old instance (devenv-stage_232)
2. Create an app
3. Run the following command to create a dummy testing environment in which about 2000 gears exist on this node.
$ for i in `seq 1 2000`; do useradd -b /var/lib/stickshift -c "libra guest" user$i; runuser -l user$i -s /bin/sh -c "cp -r /var/lib/stickshift/655c12fb14b14d9c820adb105e3c76e2/* /var/lib/stickshift/user${i}/"; done
4. Re-install rhc-node
# time yum -y reinstall rhc-node
<--snip-->
Running Transaction
Installing : rhc-node-0.96.14-1.el6_3.x86_64 1/1
Stopping system message bus: [ OK ]
Starting system message bus: [ OK ]
Shutting down oddjobd: [ OK ]
Starting oddjobd: [ OK ]
<--snip-->
chcon -t libra_var_lib_t -l s0:c2,c200 -R /var/lib/stickshift/user1742/[^.]*
chcon -t libra_tmp_t -l s0:c2,c200 -R /var/lib/stickshift/user1742/.tmp/*
<--snip-->
Stopping stickshift-proxy: [ OK ]
Starting stickshift-proxy: [ OK ]
Stopping stickshift-proxy: [ OK ]
Starting stickshift-proxy: [ OK ]
Verifying : rhc-node-0.96.14-1.el6_3.x86_64
<--snip-->
real 8m9.938s
user 0m47.861s
sys 5m6.053s
5. Download the latest rhc-node package, then re-install it.
# time yum install -y rhc-node-0.97.6-1.el6_3.x86_64.rpm
Stopping system message bus: [ OK ]
Starting system message bus: [ OK ]
Shutting down oddjobd: [ OK ]
Starting oddjobd: [ OK ]
Stopping stickshift-proxy: [ OK ]
Starting stickshift-proxy: [ OK ]
Stopping stickshift-proxy: [ OK ]
Starting stickshift-proxy: [ OK ]
Cleanup : rhc-node-0.96.14-1.el6_3.x86_64 2/2
Verifying : rhc-node-0.97.6-1.el6_3.x86_64 1/2
Verifying : rhc-node-0.96.14-1.el6_3.x86_64 2/2
<--snip-->
Updated: rhc-node.x86_64 0:0.97.6-1.el6_3
<--snip-->
real 0m27.186s
user 0m15.123s
sys 0m4.418s

The elapsed time is clearly much shorter than before (about 27 seconds versus more than 8 minutes).
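For readers who want to see the shape of the fix, here is a minimal sketch of the two ideas discussed in this bug: run a slow per-gear service only when it is not already initialized, and request a relabel only when libra.fc has changed. This is not the actual rhc-node %post scriptlet from commit 34af28a. The function names, the use of a check command's exit status as the "initialized" test, and the checksum stamp file are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch only; the real scriptlet may detect initialization differently.

# run_if_uninitialized CHECK_CMD INIT_CMD
# Runs INIT_CMD only when CHECK_CMD exits non-zero (service not set up yet).
run_if_uninitialized() {
    check_cmd=$1
    init_cmd=$2
    if sh -c "$check_cmd" >/dev/null 2>&1; then
        echo "skip: $init_cmd"
    else
        echo "run: $init_cmd"
        sh -c "$init_cmd"
    fi
}

# relabel_if_changed POLICY_FILE STAMP_FILE FLAG_FILE
# Touches FLAG_FILE (for example /.autorelabel) only when the checksum of
# POLICY_FILE differs from the checksum recorded in STAMP_FILE.
# STAMP_FILE is an assumed helper file, not something named in this bug.
relabel_if_changed() {
    policy=$1
    stamp=$2
    flag=$3
    new_sum=$(cksum < "$policy")
    old_sum=$(cat "$stamp" 2>/dev/null || true)
    if [ "$new_sum" != "$old_sum" ]; then
        touch "$flag"
        printf '%s\n' "$new_sum" > "$stamp"
    fi
}
```

In a scriptlet this might be called as `run_if_uninitialized "service libra-tc status" "service libra-tc restart"`, assuming the init script offers a `status` action, which this bug does not confirm. The point of the pattern is that a plain package update on an already configured node skips the per-gear loops entirely, which matches the drop from minutes to seconds seen in the verification above.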