User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/532.3 (KHTML, like Gecko) Chrome/4.0.223.6 Safari/532.3

Hi all,

I would appreciate some help. I have set up cluster 3.0.3 with a 2.6.31 kernel. All went well until I tried a gfs2 mount. The mount hangs without an error.

gfs_control dump reports nothing:

gfs_control dump
1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/gfs_controld.log
1256941054 gfs_controld 3.0.3 started
1256941054 /cluster/gfs_controld/@plock_ownership is 1
1256941054 /cluster/gfs_controld/@plock_rate_limit is 0
1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/gfs_controld.log
1256941054 group_mode 3 compat 0

From an strace of the mount command, it appears that gfs_controld is not responding:

brk(0) = 0x7f86d9054000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f86d71aa000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f86d71a9000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=27308, ...}) = 0
mmap(NULL, 27308, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f86d71a2000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\346\1\0\0\0\0\0@"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1338408, ...}) = 0
mmap(NULL, 3446712, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f86d6c46000
mprotect(0x7f86d6d86000, 2097152, PROT_NONE) = 0
mmap(0x7f86d6f86000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x140000) = 0x7f86d6f86000
mmap(0x7f86d6f8b000, 18360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f86d6f8b000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f86d71a1000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f86d71a0000
arch_prctl(ARCH_SET_FS, 0x7f86d71a06f0) = 0
open("/dev/urandom", O_RDONLY) = 3
read(3, "k\6\244\266U\3731\237"..., 8) = 8
close(3) = 0
mprotect(0x7f86d6f86000, 16384, PROT_READ) = 0
mprotect(0x7f86d73b7000, 4096, PROT_READ) = 0
mprotect(0x7f86d71ab000, 4096, PROT_READ) = 0
munmap(0x7f86d71a2000, 27308) = 0
brk(0) = 0x7f86d9054000
brk(0x7f86d9076000) = 0x7f86d9076000
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=3300, ...}) = 0
lstat("/dev/mapper", {st_mode=S_IFDIR|0755, st_size=80, ...}) = 0
lstat("/dev/mapper/san", {st_mode=S_IFBLK|0640, st_rdev=makedev(253, 0), ...}) = 0
brk(0x7f86d9075000) = 0x7f86d9075000
lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/var/www", {st_mode=S_IFDIR|0755, st_size=26, ...}) = 0
lstat("/var/www/superstore.to", {st_mode=S_IFDIR|0755, st_size=17, ...}) = 0
lstat("/var/www/superstore.to/data", {st_mode=S_IFDIR|0755, st_size=6, ...}) = 0
stat("/var/www/superstore.to/data", {st_mode=S_IFDIR|0755, st_size=6, ...}) = 0
open("/dev/mapper/san", O_RDONLY) = 3
lseek(3, 65536, SEEK_SET) = 65536
read(3, "\1\26\31p\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0d\0\0\0\0\0\0\7\t\0\0\7l\0"..., 512) = 512
close(3) = 0
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
connect(3, {sa_family=AF_FILE, path=@"gfsc_sock"...}, 12) = 0
write(3, "\\o\\o\1\0\1\0\7\0\0\0\0\0\0\0`p\0\0\0\0\0\0\0\0\0\0\0\0\0\0s"..., 28768) = 28768
read(3,

Any idea how to proceed further? Thank you.

Reproducible: Always

Steps to Reproduce:
1. start the cluster
ulimit -c unlimited
modprobe dlm
modprobe gfs2
mount -t configfs none /sys/kernel/config
cman_tool join
fence_node -U
groupd
fenced
dlm_controld
gfs_controld
fence_tool join
2. try to mount a gfs2 partition

Actual Results:
mount hangs and gfs_controld is not responding
Created attachment 367099 [details] node logs and debug info
Created attachment 367100 [details] node logs and debug info
Comment on attachment 367099 [details] node logs and debug info node trompeten
Created attachment 367101 [details] node-techno logs and debug info
Hi, something isn't right in this bug report. What is your base distribution? Are you trying to run cluster 3.0.3 on top of RHEL5.4? Why are you running the daemons manually instead of using the init script? groupd doesn't need to be started at all in cluster 3 unless you are performing rolling upgrade operations.
I installed it manually from source so that I can trace the problem. I started groupd because that was the way I did it in cluster2; there is little documentation for cluster3. I have just started the cluster without groupd, and the behavior is the same.
I recommend that you run: udevadm monitor --environment --kernel on the node on which you do the mount, and then paste the results from that into this bz after running a mount which hangs. That should tell us a bit more about what is going on.
I don't receive anything. Does this mean there is no uevent from the mount command?
techno ~ # udevadm monitor --environment --kernel
monitor will print the received events for:
KERNEL - the kernel uevent
It is the kernel which is supposed to produce the uevents, not the mount command, so it looks like the issue is very early in the mount sequence, in this case.
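To make the "no uevents" observation easier to repeat, the monitor output can be saved to a file and the kernel events counted afterwards. The following is only a sketch: the helper name count_kernel_uevents and the log path are made up for illustration, and the mount command in the comments is a guess based on the paths visible in the strace above.

```shell
# Hypothetical helper: count kernel uevents in a saved `udevadm monitor` log.
# Event lines from the kernel source start with "KERNEL[<timestamp>]";
# the banner line "KERNEL - the kernel uevent" does not match.
count_kernel_uevents() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so "|| true" keeps the function's status at 0.
    grep -c '^KERNEL\[' "$1" || true
}

# Possible usage on the node doing the mount (paths are assumptions):
#   udevadm monitor --environment --kernel > /tmp/uevents.log &
#   monitor_pid=$!
#   mount -t gfs2 /dev/mapper/san /var/www/superstore.to/data   # may hang
#   kill "$monitor_pid"
#   count_kernel_uevents /tmp/uevents.log
```

A count of 0 would match what was reported here: the mount never got as far as the kernel.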
It is a 2.6.31 kernel, as requested on the cluster website. Is there a special feature that I should check in the kernel?
No, and I don't think that the kernel is the issue since it looks like the mount process doesn't get to the point of actually trying to talk to the kernel (otherwise you'd get uevents reported). The problem looks like it occurs earlier than that.
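Since the strace shows mount connecting to an abstract UNIX socket displayed as @"gfsc_sock" and then blocking in read(), one userspace check is whether anything is listening on that socket. The sketch below reads the table in /proc/net/unix; the function name abstract_socket_listening is hypothetical, and the socket name is taken as shown in the trace. A listening socket only shows that a process has the socket open for connections, not that it is answering requests.

```shell
# Hypothetical helper: succeed if an abstract UNIX socket with the given
# name is in the listening state.
#   $1 = socket name without the leading "@"
#   $2 = table to read (defaults to /proc/net/unix; a file can be passed
#        instead, which is handy for testing)
abstract_socket_listening() {
    # Columns: Num RefCount Protocol Flags Type St Inode Path
    # Flags is 00010000 for a listening socket, and abstract names are
    # printed with an "@" prefix in the Path column.
    awk -v name="@$1" '
        $4 == "00010000" && $8 == name { found = 1 }
        END { exit !found }
    ' "${2:-/proc/net/unix}"
}

# Possible usage:
#   if abstract_socket_listening gfsc_sock; then
#       echo "something is listening on @gfsc_sock"
#   else
#       echo "nothing is listening on @gfsc_sock"
#   fi
```

If the socket is listening and mount still blocks, that points at the daemon being up but not replying, rather than not running at all.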
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Have you solved this issue yet? I'd suggest upgrading to a more recent Fedora at least. From what I can see above, it seems that the problem might be caused by gfs_controld not running, which may be because the cman package isn't installed or working. Let us know if you are continuing to have problems; otherwise we'll close this.
I rolled back to cluster2, so the issue is still there. Cluster3 was compiled from source and gfs_controld was running, but I suspect it didn't respond.
I still don't understand what is going on here.... can you explain exactly which versions of software don't work and exactly what you are doing to reproduce the problem?
The mount.gfs2 command hung on cluster-3.0.3 with kernel 2.6.31. I recently tried cluster 3.0.16 with kernel 2.6.32 and everything is working fine. I think you should close the bug, because it was an issue I had last year and the new cluster sources are working fine.
Ok, let us know if you have any more issues.