Bug 176443
Summary: | regression: clvmd with gulm gets stuck in infinite loop | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Corey Marthaler <cmarthal> |
Component: | device-mapper | Assignee: | Alasdair Kergon <agk> |
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 4.3 | CC: | agk, ccaulfie, kanderso, rkenna |
Target Milestone: | --- | Keywords: | Regression |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Fixed In Version: | RHBA-2006-0137 | Doc Type: | Bug Fix |
Last Closed: | 2006-03-07 21:37:04 UTC | Type: | --- |
Bug Blocks: | 164914, 168430 |
Description
Corey Marthaler
2005-12-22 21:14:10 UTC
I can reproduce this on i386 with gcc4, but I'm not sure if it's exactly the same problem - it certainly looks identical externally. gcc 3.2 on i386 does not have the problem.

It seems to be compiler-related. The LVM hash.c functions aren't working properly, so when clvmd tries to traverse a list of nodes it gets stuck in an infinite loop. dm_hash_get_next() never returns a NULL pointer for the end of the list, just the same node over and over again.

Adding a dummy function:

static void anything() {}

to LVM2/lib/datastruct/hash.c fixes it!

Oops, that should be device-mapper/lib/datastruct/hash.c of course.

Adding to the blocker list and bumping the priority as this is blocking U3 regression testing.

Changing the summary as it has been seen by Patrick on i386.

A couple more possibilities for a "fix". All apply to devicemapper/lib/datastruct/hash.c:
* add -Os to the CFLAGS
* Remove "inline" from _find()
Take your pick.

I'll remove the inline. And tidy some of the implicit int/long, signed/unsigned conversions in that file too in case they're anything to do with it.

Although the bug has only shown up in clvm, the fix we propose goes into the device-mapper package.

This fix doesn't appear to help the problem.
[root@link-01 ~]# rpm -q lvm2
lvm2-2.02.01-1.2.RHEL4
[root@link-01 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.01-1.1.RHEL4
[root@link-01 ~]# rpm -q device-mapper
device-mapper-1.02.02-2.0.RHEL4
[root@link-01 ~]# lvcreate --version
LVM version: 2.02.01 (2005-11-23)
Library version: 1.02.02 (2006-01-04)
Driver version: 4.5.0

Stuck lvremove cmd:
stat("/etc/lvm/lvm.conf", {st_mode=S_IFREG|0644, st_size=10509, ...}) = 0
stat("/usr/lib64/liblvm2clusterlock.so", {st_mode=S_IFREG|0555, st_size=9368, ...}) = 0
open("/usr/lib64/liblvm2clusterlock.so", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\n\0"..., 640) = 640
fstat(3, {st_mode=S_IFREG|0555, st_size=9368, ...}) = 0
mmap(NULL, 1055728, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2a97b38000
mprotect(0x2a97b3a000, 1047536, PROT_NONE) = 0
mmap(0x2a97c39000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x2a97c39000
close(3) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
connect(3, {sa_family=AF_FILE, path=@clvmd}, 110) = 0
stat("/proc/lvm/VGs/gfs", 0x7fbfffb4c0) = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
write(3, "3\1\377\277\0\0\0\0\0\0\0\0\10\0\0\0\0\4\0V_gfs\0\351", 26) = 26
read(3,

same problem, it's just that the workaround for i386 doesn't seem to have any effect on x86_64. -O0 does "fix" it though, if you fancy your hash functions non-optimised :-(

This is still blocking QA for U3 and GFS beta. Any progress?

Updated packages ready.

Fix verified with the following rpms.
[root@link-01 tmp]# rpm -q device-mapper
device-mapper-1.02.02-3.0.RHEL4
[root@link-01 tmp]# rpm -q lvm2
lvm2-2.02.01-1.3.RHEL4
[root@link-01 tmp]# rpm -q lvm2-cluster
lvm2-cluster-2.02.01-1.2.RHEL4

Phew! Glad we caught that problem - it might have caused all sorts of hard-to-reproduce problems elsewhere in lvm2 on all architectures.

An advisory has been issued which should help the problem described in this bug report.
This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2006-0099.html

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2006-0137.html
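
The symptom reported in this bug, dm_hash_get_next() handing back the same node instead of NULL at the end of the list, is easier to picture with a toy chained hash table. The following is a minimal C sketch and not the device-mapper code: every name in it (toy_table, toy_get_first, toy_get_next, toy_count_nodes, toy_demo, the node keys) is hypothetical. It shows a first/next traversal that terminates on NULL, plus a bounded walk that returns -1 instead of spinning if the iterator ever stops advancing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_SLOTS 4u

/* Hypothetical node and table types; not the libdevmapper structures. */
struct toy_node {
    struct toy_node *next;   /* next node in the same slot's chain */
    const char *key;
    unsigned slot;           /* slot index this node was hashed into */
};

struct toy_table {
    struct toy_node *slots[TOY_SLOTS];
    unsigned num_nodes;
};

/* Simple multiplicative string hash, reduced to a slot index. */
static unsigned toy_slot(const char *key)
{
    unsigned h = 0;
    while (*key)
        h = h * 31u + (unsigned char)*key++;
    return h % TOY_SLOTS;
}

/* Insert at the head of the chain; returns 1 on success, 0 on OOM. */
static int toy_insert(struct toy_table *t, const char *key)
{
    struct toy_node *n = malloc(sizeof(*n));
    if (!n)
        return 0;
    n->key = key;
    n->slot = toy_slot(key);
    n->next = t->slots[n->slot];
    t->slots[n->slot] = n;
    t->num_nodes++;
    return 1;
}

/* First node found in any slot >= start, or NULL if none remain. */
static struct toy_node *toy_next_slot(struct toy_table *t, unsigned start)
{
    for (unsigned i = start; i < TOY_SLOTS; i++)
        if (t->slots[i])
            return t->slots[i];
    return NULL;
}

static struct toy_node *toy_get_first(struct toy_table *t)
{
    return toy_next_slot(t, 0);
}

/* Advance within the chain, then to later slots; NULL marks the end. */
static struct toy_node *toy_get_next(struct toy_table *t, struct toy_node *n)
{
    assert(n != NULL);
    return n->next ? n->next : toy_next_slot(t, n->slot + 1);
}

/* Bounded walk: returns the node count, or -1 if the traversal visits
 * more nodes than the table holds (the iterator is not terminating). */
static int toy_count_nodes(struct toy_table *t)
{
    unsigned visited = 0;
    for (struct toy_node *n = toy_get_first(t); n; n = toy_get_next(t, n))
        if (++visited > t->num_nodes)
            return -1;
    return (int)visited;
}

static void toy_free(struct toy_table *t)
{
    for (unsigned i = 0; i < TOY_SLOTS; i++) {
        struct toy_node *n = t->slots[i];
        while (n) {
            struct toy_node *next = n->next;
            free(n);
            n = next;
        }
        t->slots[i] = NULL;
    }
    t->num_nodes = 0;
}

/* Build a three-node table, walk it, and return the visited count. */
int toy_demo(void)
{
    struct toy_table t = {{0}, 0};
    const char *keys[] = {"node-a", "node-b", "node-c"}; /* made-up keys */
    for (size_t i = 0; i < sizeof(keys) / sizeof(keys[0]); i++)
        if (!toy_insert(&t, keys[i])) {
            toy_free(&t);
            return -1;
        }
    int visited = toy_count_nodes(&t);
    printf("visited %d of %u nodes\n", visited, t.num_nodes);
    toy_free(&t);
    return visited;
}
```

Calling toy_demo() from a main() should walk all three nodes and stop. A bounded walk like toy_count_nodes() is only a way to notice the symptom; it does nothing about the compiler-related breakage discussed in the comments.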