From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040207 Firefox/0.8

Description of problem:
There seem to be issues with running clumanager on the x86_64 architecture. I rebuilt clumanager-1.2.9-1.src.rpm with no problems, and the daemons seem to run, but I have seen two problems:

1.) Trying to add or modify a member in the redhat-config-cluster configuration screen results in the following traceback:

Traceback (most recent call last):
  File "/usr/share/redhat-config-cluster/configure/memberDialog.py", line 68, in on_okbutton_clicked
    self.member.validate(options)
  File "/usr/share/redhat-config-cluster/configure/clusterpkg/member_module.py", line 131, in validate
    if not ipconvert.ipIsInSameNetAsHost(dns_ipaddress, netmask):
  File "/usr/share/redhat-config-cluster/configure/clusterpkg/util_ipconvert_module.py", line 101, in ipIsInSameNetAsHost
    submitted_ip_num = self.dottedQuadToNum(submitted_ip)
  File "/usr/share/redhat-config-cluster/configure/clusterpkg/util_ipconvert_module.py", line 26, in dottedQuadToNum
    return struct.unpack('>L', socket.inet_aton(ip))[0]
struct.error: unpack str size does not match format

2.) Running "shutil -p /cluster/config.xml" results in a "Segmentation fault".

I have not looked into this much yet, but it looks like there could be 64-bit vs. 32-bit issues.

Version-Release number of selected component (if applicable):
clumanager-1.2.9-1
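A possible workaround sketch for problem 1.), not taken from the package itself. The struct.error in the traceback means the string passed to struct.unpack is not the 4 bytes that the '>L' format expects, so socket.inet_aton(ip) is presumably returning a different length on this build. The report does not establish why. The function below is a hypothetical replacement for dottedQuadToNum (the name dotted_quad_to_num and its error handling are assumptions) that parses the address by hand and so does not depend on the length of the packed string:

```python
def dotted_quad_to_num(ip):
    """Convert a dotted-quad IPv4 string such as '192.168.0.1' to an
    unsigned 32-bit number in network (big-endian) order.

    Hypothetical stand-in for dottedQuadToNum in util_ipconvert_module.py;
    it avoids struct.unpack and socket.inet_aton entirely.
    """
    parts = ip.split(".")
    if len(parts) != 4:
        raise ValueError("expected four octets: %r" % (ip,))

    num = 0
    for part in parts:
        # Accept plain decimal digits only (no signs, no whitespace).
        if not part.isdigit():
            raise ValueError("non-numeric octet in %r" % (ip,))
        octet = int(part)
        if octet > 255:
            raise ValueError("octet out of range in %r" % (ip,))
        # Shift the accumulated value left one byte and append this octet.
        num = (num << 8) | octet
    return num
```

To check the assumption on an affected machine, comparing len(socket.inet_aton("192.168.0.1")) with struct.calcsize('>L') in the same Python interpreter should show whether the two lengths really disagree.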
I looked at the segfault problem more today. The gdb debugger produces the following backtrace:

#0  0x0000002a95b1a094 in memcpy () from /lib64/tls/libc.so.6
No symbol table info available.
#1  0x0000002a95f79391 in diskRawReadLarge (offset=147968, data_in=0x2a9556c000 "", count=1024) at large.c:289
        size = {180388627136, 180388627136}
        i = 2
        check_ret = {0, 0}
        crc = {1512077761, 1512077761}
        data = {0x2a9556d000 "", 0x2a9556e000 ""}
        pageSize = 4096
        good_part = 0
        mmap_size = 4096
#2  0x0000002a95f7747f in shared_raw_read_atomic (pathname=0x7fbff1bbb4 "/cluster/config.xml", buf=0x51af30, count=672) at ops.c:116
        offset = 147968
        maxsize = 1048576
        hdrp = (SharedHeader *) 0x2a9556c000
        data = 0x2a9556c020 "HERE WAS XML CODE FOR CONFIG FILE"...
        total = 1024
        rv = 42
#3  0x00000000004031fc in main ()
No symbol table info available.

I noticed that the size going into the memcpy at the end of diskRawReadLarge was very large (see size in frame #1). Looking at things with gdb, I found that zeroing size[0] and size[1] at the beginning of the diskRawReadLarge function seemed to correct this. I created a patch (attached) to do this and rebuilt the program. I am now able to run the shutil command without segmentation faults.
Created attachment 99157
Patch to set the size to zero in diskRawReadLarge
Narrowing focus of this bugzilla to 'clumanager' package. Bug #120210 opened for 'redhat-config-cluster' package.
Patch applied to CVS; thanks for finding it so quickly. Typical use of uninitialized data biting us in the pinky toe. The patch will be incorporated in the next errata release *after* Update 2.
If I build any new CVS/test RPMs that include the patch, I will post them here.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-254.html