Bug 1245410
| Summary: | rpm command stops working on large systems | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | George Beshers <gbeshers> | ||||
| Component: | libdb | Assignee: | Jan Staněk <jstanek> | ||||
| Status: | CLOSED ERRATA | QA Contact: | qe-baseos-daemons | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | urgent | ||||||
| Version: | 7.2 | CC: | bugproxy, databases-maint, ffesti, framsay, gbeshers, gcase, hannsj_uhl, hhorak, jkachuck, jstanek, lkardos, lpol, mnavrati, peterm, prarit, rja, tee, vjancik | ||||
| Target Milestone: | rc | Keywords: | OtherQA | ||||
| Target Release: | 7.2 | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: |
Due to the way memory was managed in libdb, the rpm command did not work on systems with a large number of CPUs. This update changes the way memory management takes into account the number of CPUs. As a result, the rpm command works on large systems as expected.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2015-11-19 14:51:41 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1118366, 1185045, 1200716, 1252514 | ||||||
| Attachments: |
Some additional data:

This works up to 1024 cpus, but fails starting around 1200 cpus (depending on when the memory allocation fails).

RHEL6 does not have this problem because it does not use compat-db. Likewise, other vendors (at least the ones we checked) do not use compat-db and therefore do not hit this problem.

This is actually a problem with libdb and appears to be a problem with BerkeleyDB maintained by Oracle. The problem is that the number of cpus is used to calculate the desirable number of mutexes to allocate -- no one was expecting systems larger than 1024 cpus when the code was written. The mutexes are allocated in a region of memory, which is limited to 30588928 and this gets used up.

I tried fiddling with upping the __db_region::max field but something else went wrong --- I *think* there is an assumption about region size built into an existing database. However, when I tried changing the region size and using initdb that didn't work either and I dropped it.

The simplest workaround is simply to limit the number of cpus the database sees in the function __os_cpu_count().

Created attachment 1066630 [details]
Limit libdb's 'understanding' to 1024cpus in function __os_cpu_count()
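
For readers without access to the attachment, the idea of the workaround can be sketched roughly as follows. This is not the attached patch: clamp_cpu_count(), capped_online_cpus() and CPU_COUNT_CAP are hypothetical names, and using sysconf(_SC_NPROCESSORS_ONLN) as the source of the CPU count is an assumption made for illustration, not something stated in this report.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical cap, taken from the attachment description (1024 cpus). */
#define CPU_COUNT_CAP 1024L

/*
 * Clamp a raw CPU count to the range [1, cap].
 * Illustrative helper only; this is not libdb code.
 */
static long clamp_cpu_count(long raw, long cap)
{
    assert(cap >= 1);
    if (raw < 1)
        return 1;               /* sysconf() returns -1 on error */
    return raw > cap ? cap : raw;
}

/*
 * Stand-in for a patched __os_cpu_count(): the number of CPUs the
 * database would "see". Assumes the count comes from sysconf().
 */
static long capped_online_cpus(void)
{
    return clamp_cpu_count(sysconf(_SC_NPROCESSORS_ONLN), CPU_COUNT_CAP);
}
```

With a cap like this, a 2048-cpu machine would be reported to the database as 1024 cpus, while smaller machines would be unaffected.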
Good work George, do you have any feedback whether this fixes the issues in customer's case?

Hi Honza,
Did you ask a question in a hidden comment?
George

hannsj_uhl.com -- are you seeing this as well?
P.

Prarit,
I think there is a question that I can't see; otherwise I don't understand what the needinfo was reset for after I provided a workaround patch.
George

Hello George,
Do you have any feedback on whether this fixes the client issue?
Thank You
Joe Kachuck

The patch included works around the problem. All it does is limit the number of cpus seen by libdb to 1024, which I believe means just SGI systems are affected. This allows rpm & yum to work properly with existing rpm databases -- an important point as packages are often installed when the systems are booted to a single blade. A real fix is more involved and much riskier. Also, the problem exists in the latest BerkeleyDB from Oracle.

======

An interesting question going forward is what database rpm will use in rhel8.

This is a blocker for 7.2. SGI & Red Hat have long support contracts with customers for greater than 1024 processor boxes. We cannot break rpm.
P.

(In reply to George Beshers from comment #10)
> Did you ask a question in a hidden comment?

Sorry, my mistake. Please see comment #8, which simply asks whether we have any evidence it really solves the issue. We don't have a machine like this, so we won't be able to reproduce. Also, if some testing builds would help, we can provide some.

Problem reported upstream [1], maybe they will have something to say.
[1] https://community.oracle.com/message/13274780#13274780

(In reply to Honza Horak from comment #16)
> (In reply to George Beshers from comment #10)
> > Did you ask a question in a hidden comment?
>
> Sorry, my mistake. Please see comment #8, which simply asks whether we have
> any evidence it really solves the issue. We don't have a machine like
> this, so we won't be able to reproduce.

George definitely has a machine like this ;) so he'll be able to test.
If you can do a test rpm for him he can grab it directly from the brew link.
P.

I'm the other partner engineer for SGI, we can definitely test this if given an update. We have a system ready to test on. When you have a test rpm just point us at the brew link. Thanks.
------------------------------------------
[root@harp31-sys ~]# yum clean all
error: db5 error(-30973) from dbenv->open: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 - (-30973)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:
Error: rpmdb open failed
[root@harp31-sys ~]# topology
System type: UV3000
System name: harp31-sys
Serial number: UV3-00000031
Partition number: 0
32 Blades
2048 CPUs (online: 0-2047)
64 Nodes
7809.94 GB Memory Total
124.00 GB Max Memory on any Node
1 BASE I/O Riser
2 PCIe Slots
2 Network Controllers
2 Storage Controllers
3 USB Controllers
2 VGA GPUs
OK, I built the latest libdb (for 7.2 devel) in brew with the patch applied. The task URL is [1]. Please let me know if you need anything else.
[1] http://brewweb.devel.redhat.com/brew/taskinfo?taskID=9767941

And to remedy my mistake of providing an internal-only link to the build, I copied the test packages to [1]. To be clear, these are test packages and should not be taken as the final fix.
[1] https://jstanek.fedorapeople.org/libdb/

Hi Jan,
First, I have access to brew builds (as does Frank Ramsay), onsite engineers usually do. Second, we have tested the rpms and they work on a UV2 system with 2048 cpus and your rpms solve the problem :).
Cheers,
George

Will these rpms make the 7.2 beta release?

(In reply to George Beshers from comment #25)
> Will these rpms make the 7.2 beta release?

Unfortunately not, we're too late for 7.2 Beta.

Granting qa_ack. We have no such hw, so we'll do sanityonly and use verification from comment 24.

RHQE - SanityOnly
Verified by OtherQE (comment 24)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-2163.html
Description of problem:
On large systems, typically >=1024 cpus, the rpm command fails.

[root@harp31-sys ~]# rpm -qa
error: db5 error(-30973) from dbenv->open: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 - (-30973)
error: cannot open Packages database in /var/lib/rpm
error: db5 error(-30973) from dbenv->open: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages database in /var/lib/rpm

Test1: reboot to single blade and 'rpm -qa' works. Therefore something is failing to generate in the multiprocessor boot sequence (much more likely than shutdown).

Test2: cp __db00[123] files from backup to /var/lib/rpm, now 'rpm -qa' works. There is no data corruption per se. I did wonder if the __db00* files are being removed in the shutdown sequence, but this is not the case.

Nate tracked it down to:

It looks like the issue is in compat-db 4.7.25. On a large system dbenv->lk_partitions is set to cpu * 10. Later in __lock_region_init the program allocates a few objects per lk_partition and one of those petty allocations eventually failed.

int
__lock_env_create(dbenv)
	DB_ENV *dbenv;
{
	...
	/*
	 * Default to 10 partitions per cpu. This seems to be near
	 * the point of diminishing returns on Xeon type processors.
	 * Cpu count often returns the number of hyper threads and if
	 * there is only one CPU you probably do not want to run partitions.
	 */
	cpu = __os_cpu_count();
	dbenv->lk_partitions = cpu > 1 ? 10 * cpu : 1;

	return (0);
}

Version-Release number of selected component (if applicable):
Also fails in 7.1 and current 7.2 development.

How reproducible:
Always on a large enough system.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
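
As an editorial illustration of the scaling described above, the following sketch computes the default partition count from the quoted rule and the largest possible per-partition share of the region limit mentioned in the comments. It assumes the figure 30588928 is in bytes, and lock_partitions() and bytes_per_partition() are hypothetical helpers, not libdb functions. The real per-partition footprint is not given in this report, and the region also holds other data, so this shows only how quickly the budget shrinks, not the exact failure point.

```c
#include <assert.h>

/* Region limit quoted in the comments; assumed here to be in bytes. */
#define REGION_MAX_BYTES 30588928UL

/* Mirrors the quoted default: 10 lock partitions per cpu, 1 for a single cpu. */
static unsigned long lock_partitions(unsigned long cpus)
{
    return cpus > 1 ? 10UL * cpus : 1UL;
}

/*
 * Upper bound on the bytes available per partition if the entire
 * region were spent on per-partition objects (an optimistic bound).
 */
static unsigned long bytes_per_partition(unsigned long cpus)
{
    unsigned long parts = lock_partitions(cpus);

    assert(parts > 0);
    return REGION_MAX_BYTES / parts;
}
```

Under these assumptions, 1024 cpus give 10240 partitions and at most 2987 bytes each, while 2048 cpus give 20480 partitions and at most 1493 bytes each.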