Bug 259421
Summary: | yum: Slow transaction test/transaction execution | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Christian Mandery <rhbugzilla> | ||||||
Component: | rpm | Assignee: | Panu Matilainen <pmatilai> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | low | Docs Contact: | |||||||
Priority: | medium | ||||||||
Version: | 7 | CC: | helge.deller, james.antill, linux, pmatilai, pnasrat, tim.lauridsen | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | i686 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | 4.4.2.2-2.fc7 | Doc Type: | Bug Fix | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2007-10-24 07:14:00 UTC | Type: | --- | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Christian Mandery
2007-08-28 09:36:44 UTC
Try something for me shut down all processes related to yum or rpm. then as root run: rpm --rebuilddb then re-run yum and see if you can make it hang like that. I stopped yum-updatesd and did not see anything in ps aux | grep yum and executed rpm --rebuilddb but it did not solve the issue. It still hangs some minutes. okay. a couple more things. could you run yum -d 5 install something and annotate where it hangs? Here's to output: [root@ls3163 ~]# yum -d 5 install apachetop Yum Version: 3.2.2 COMMAND: yum -d 5 install apachetop Installroot: / Ext Commands: apachetop Setting up Install Process Parsing package install arguments Setting up Package Sacks fedora 100% |=========================| 2.1 kB 00:00 updates 100% |=========================| 1.9 kB 00:00 Reading Local RPMDB No other apachetop installed, adding to list for potential install reduced installs : apachetop.i386 0:0.12.6-2.fc6 Building updates object Resolving Dependencies 1188477705.93 --> Running transaction check ---> Package apachetop.i386 0:0.12.6-2.fc6 set to be updated Checking deps for apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libc.so.6(GLIBC_2.2)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libc.so.6(GLIBC_2.4)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libgcc_s.so.1', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libc.so.6(GLIBC_2.1)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'rtld(GNU_HASH)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libstdc++.so.6(CXXABI_1.3)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libncurses.so.5', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libc.so.6', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libgcc_s.so.1(GCC_3.0)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libc.so.6(GLIBC_2.0)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libstdc++.so.6', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libc.so.6(GLIBC_2.3)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libreadline.so.5', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libstdc++.so.6(GLIBCXX_3.4)', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u looking for (u'libm.so.6', None, (None, None, None)) as a requirement of apachetop.i386 0-0.12.6-2.fc6 - u Dependencies Resolved 1188477705.98 ============================================================================= Package Arch Version Repository Size ============================================================================= Installing: apachetop i386 0.12.6-2.fc6 fedora 32 k Transaction Summary ============================================================================= Install 1 Package(s) Update 0 Package(s) Remove 0 Package(s) Total download size: 32 k Is this ok [y/N]: y Downloading Packages: (1/1): apachetop-0.12.6-2 100% |=========================| 32 kB 00:00 Running rpm_check_debug --> Populating transaction set with selected packages. Please wait. Adding Package apachetop - 0.12.6-2.fc6.i386 in mode u ---> Package apachetop.i386 0:0.12.6-2.fc6 set to be updated Running Transaction Test Adding Package apachetop - 0.12.6-2.fc6.i386 in mode u *** hang *** Finished Transaction Test Transaction Test Succeeded Adding Package apachetop - 0.12.6-2.fc6.i386 in mode u Running Transaction *** hang *** Installing: apachetop ######################### [1/1] Installed: apachetop.i386 0:0.12.6-2.fc6 Complete! Additionally, I have taken a look into the execution with strace. As I already said, during the hangs, nothing is shown by strace or ltrace but this are the last syscalls made before the first hang (transaction test): read(10, "lashplayer\\.so.*\t--\tuser_u:objec"..., 4096) = 4096 brk(0x1030d000) = 0x1030d000 brk(0x1032e000) = 0x1032e000 read(10, "em_u:object_r:home_root_t:s0\n/us"..., 4096) = 4096 brk(0x1034f000) = 0x1034f000 brk(0x10370000) = 0x10370000 read(10, "ns/nprhapengine\\.so.*\t--\tuser_u:"..., 4096) = 4096 brk(0x10391000) = 0x10391000 brk(0x103b2000) = 0x103b2000 read(10, "\n/usr/sap/Z96/[^/]*/\\.mozilla(/."..., 4096) = 4096 brk(0x103d3000) = 0x103d3000 read(10, "ZAW/lost\\+found/.*\t<<none>>\n/usr"..., 4096) = 4096 brk(0x103f4000) = 0x103f4000 brk(0x10415000) = 0x10415000 read(10, "ect_r:user_home_t:s0\n/usr/sap/ZD"..., 4096) = 4096 brk(0x10436000) = 0x10436000 brk(0x10457000) = 0x10457000 read(10, "))(/.+)?\tuser_u:object_r:httpd_u"..., 4096) = 4096 brk(0x10478000) = 0x10478000 read(10, "\t-d\tuser_u:object_r:user_home_di"..., 4096) = 4096 brk(0x10499000) = 0x10499000 brk(0x104ba000) = 0x104ba000 read(10, " user user_u\n#\n\n/usr/sap/ZYB/[^/"..., 4096) = 4096 brk(0x104db000) = 0x104db000 brk(0x104fc000) = 0x104fc000 read(10, "/]*/((www)|(web)|(public_html))("..., 4096) = 3208 brk(0x1051d000) = 0x1051d000 mmap2(NULL, 2072576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb72c1000 munmap(0xb74bb000, 2072576) = 0 These system calls are exactly the same before the second "freeze". Another information: Yum does not respond to Strg + C when hanging but Strg + Z works. Do you have any extremely slow drives or network disks mounted on this system? right now it looks like all the hang time is when yum hands off things to rpm. also - ctrl-c not being able to help is correct when rpm is running the transaction - we don't want it aborting in the middle of the transaction and leaving you with a questionably useful system. I don't have slow drives but a really huge amount of NFS mounts that are not all mounted but available via automount. I just tested and stopped the automounter while all NFS dirs where unmounted but that didn't speed it up. Just b/c I'm out of other ideas: 1. go to /var/lib/rpm and remove any __db* files 2. what ver of rpm and what kernel ver is this? thanks Huge amount of NFS mounts (note that rpm will try to mount all configured mounts when it runs) are one possibility, also things like having user accounts in huge LDAP directory can cause significant "hangs". Oh btw.. can you attach the full strace up to the point it hangs, including forks and timestamps? I'm afraid deleting the __db* files didn't help. rpm-4.4.2.1-1.fc7 rpm-libs-4.4.2.1-1.fc7 Linux ls3163.wdf.sap.corp 2.6.22.1-41.fc7 #1 SMP Fri Jul 27 18:10:34 EDT 2007 i686 i686 i386 GNU/Linux And yes, we have a huge NIS with ~ 5k - 10k users but as I already stated above, the hangups in yum only happen since two or three weeks ago while the NIS setup didn't change over the whole server lifetime (around two month now). I'll attach a full strace with forks and timestamps. Created attachment 181321 [details]
Output from strace -fFtt (20 MB unpacked!)
Ok, thanks. The pauses with no syscall activity are clearly there, now we just need to figure out what it exactly is. I have my suspicions, but can you try one more thing: try installing it with rpm itself with extra verbosity (rpm -Uvvh apachetop*.rpm) and see if it hangs with that too and if so, where in the debug spew the pause is. When running rpm -Uvvh [package], the problem occurs before any output is shown. The duration of the pause seems the same one as one of those when running yum (around two-three minutes). After this initial freeze, the whole rest of the rpm installation (the complete output) works like a charm with normal speed in less than one second. I'll attach the strace output of the rpm run. Created attachment 182961 [details]
Strace output with -fFtt of a rpm run with the freeze in it
Thanks. That removes one suspect (fingerprinting) from the list. It's got something to do with SELinux file context initialisation, but what exactly... SELinux is disabled on this system as far as possible. If it helps, some weeks before we had problems with our setup here (because we have so many NIS users) with the SELinux tool genhomedircon. These problems are documented here: https://bugzilla.redhat.com/show_bug.cgi?id=250233 But it cannot be exactly the same problem here because if it would do anything with our NIS or NFS shares, we should see it in the strace... Hmm.. there are a few suspicious spots in the selinux-patch that fedora carries... I'll try to have a look next week, todays time is running out. As a temporary workaround, --nocontexts switch might help with rpm itself but there doesn't seem to be a way to set that from yum. In any case not a yum bug so reassigning. Yes, calling rpm with --nocontexts fixes the problem for me, so I'm glad we have at least identified where the error stems from. Perhaps you should consider that- independend from this bug- all SELinux-related code shouldn't be run in yum/rpm if SELinux is disabled on the system. Anyway, thanks for your help. Strange things are happening here. I made an identical setup of the system with the problem as a backup machine. With identical I mean really identical: Same packages installed, Winbind joined the same domain and same NIS server. Anyway, on this second machine, the problem described above does not occur. So I guess it cannot be NIS or the Active Directory. Maybe that helps you to track down the bug. My guess is that it's really just a missing SELinux enabled/disabled check in a spot or two. Should be fixed in rpm.org HEAD, just needs backporting to 4.4.2.x for fixing in Fedora. rpm-4.4.2.2-2.fc7 has been pushed to the Fedora 7 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update rpm' As suggested, I just updated to rpm-4.4.2.2-2.fc7. Installing this updated RPM packages took around 5 minutes (so the bug was existent at that stage). Directly aferwards I did a "yum update" and installed some more updates (incl. eclipse-cdt.i386 1:3.1.2-8.fc7 hplip.i386 0:1.7.4a-6.fc7 libsane-hpaio.i386 0:1.7.4a-6.fc7). This update process finished in a few seconds. So it seems this updated rpm package fixed this problem. I think you can close this bug report. THANKS! Helge rpm-4.4.2.2-2.fc7 has been pushed to the Fedora 7 stable repository. If problems still persist, please make note of it in this bug report. |