Description of problem:

Since gulm doesn't handle non-FQDNs (though cman does), I had to change
my cluster.conf file to match the 'uname -ar' output to test gulm.
However, when using that same cluster.conf for cman, it appears to just
use the basename for its membership, which causes the fence service not
to join. I know we've had this discussion before :) but it would be nice
if both gulm and cman were fixed to work with and without a FQDN.

[root@morph-01 root]# uname -ar
Linux morph-01.lab.msp.redhat.com 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686 i686 i386 GNU/Linux

from cluster.conf:
<clusternode name="morph-01.lab.msp.redhat.com" votes="1">

[root@morph-01 root]# cat /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1     1   5   M   morph-01
   2     1   5   M   morph-04
   3     1   5   M   morph-03
   4     1   5   M   morph-02
   5     1   5   M   morph-05

[root@morph-01 root]# fence_tool join
fenced: local cman node name "morph-01" not found in cluster.conf

strace:
.
.
.
execve("/sbin/fenced", ["fenced"], [/* 24 vars */]) = 0
uname({sys="Linux", node="morph-01.lab.msp.redhat.com", ...}) = 0
brk(0) = 0x966e000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=38273, ...}) = 0
old_mmap(NULL, 38273, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7ff6000
close(3) = 0
open("/lib/tls/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\327\324"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=106212, ...}) = 0
old_mmap(0xd49000, 70128, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xd49000
old_mmap(0xd57000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xd000) = 0xd57000
old_mmap(0xd59000, 4592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xd59000
close(3) = 0
open("/lib/tls/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300+\300"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1455084, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ff5000
old_mmap(0xbee000, 1158124, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xbee000
old_mmap(0xd03000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x115000) = 0xd03000
old_mmap(0xd07000, 7148, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xd07000
close(3) = 0
mprotect(0xd03000, 8192, PROT_READ) = 0
mprotect(0xd57000, 4096, PROT_READ) = 0
mprotect(0xbea000, 4096, PROT_READ) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7ff5900, limit:1048575, seg_32bit:1, contents:0, read_exe0
munmap(0xb7ff6000, 38273) = 0
set_tid_address(0xb7ff5948) = 3937
rt_sigaction(SIGRTMIN, {0xd4d6d0, [], SA_RESTORER|SA_SIGINFO, 0xd54450}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=RLIM_INFINITY}) = 0
_sysctl({{CTL_KERN, KERN_VERSION}, 2, 0xbfe5ee68, 34, (nil), 0}) = 0
socket(PF_UNIX, SOCK_DGRAM, 0) = 3
brk(0) = 0x966e000
brk(0x968f000) = 0x968f000
brk(0) = 0x968f000
socket(0x1e /* PF_??? */, SOCK_DGRAM, 3) = 4
ioctl(4, 0x780b, 0) = 1
ioctl(4, 0xc1187890, 0xbfe5ef70) = 0
time(NULL) = 1106777318
sendto(3, "1106777318 our name from cman \"m"..., 41, 0, {sa_family=AF_UNIX, path=@fenced_socket}, 16) = -1 E)
close(4) = 0
socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(4, {sa_family=AF_INET6, sin6_port=htons(1023), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, 0
connect(4, {sa_family=AF_INET6, sin6_port=htons(50006), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0
write(4, "\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 20) = 20
read(4, "\1\0\0\0\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0", 20) = 20
close(4) = 0
socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(4, {sa_family=AF_INET6, sin6_port=htons(1023), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, 0
connect(4, {sa_family=AF_INET6, sin6_port=htons(50006), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0
write(4, "\3\0\0\0\0\0\0\0\2\0\0\0\0\0\0\0004\0\0\0/cluster/clu"..., 72) = 72
read(4, "\3\0\0\0\0\0\0\0\2\0\0\0\303\377\377\377\0\0\0\0", 20) = 20
close(4) = 0
write(2, "fenced: ", 8fenced: ) = 8
write(2, "local cman node name \"morph-01\" "..., 58local cman node name "morph-01" not found in cluster.conf
) = 58
exit_group(1) = ?

Version-Release number of selected component (if applicable):
CMAN <CVS> (built Jan 25 2005 15:37:28)

How reproducible:
Always
For what it's worth... gulm used to use whatever "uname -n" reported. I don't know if that's still the case, and I think there's a flag you can pass to lock_gulmd that specifies its name. That could be the shortname or the FQDN. I think it would be best to standardize on the 'uname -n' of the machine. That provides a pretty consistent way of determining identity.
> I think that it would be best to standardize on the 'uname -n' of the
> machine. That provides a pretty consistent way of determining
> identification.

I disagree. Full name and short name should work exactly the same. It may be more work to implement, but it will pay off. This comes up constantly in support.
cman_tool takes as the node name whatever uname(2) returns in utsname.nodename. This is the name that should also be used in cluster.conf. If these two (uname and cluster.conf) match, then there shouldn't be any problem anywhere.

I think the real problem here is that cman_tool is too forgiving and doesn't report an error if those two don't match. cman_tool currently looks up the uname(2) name in cluster.conf and, if it's not found, removes everything after the "." and tries to find the shortened name in cluster.conf (sketched below).

Does everyone agree that I should make cman_tool more strict and enforce matching uname(2) and cluster.conf? The alternative is to have every piece of code cope with both short and long names.
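For illustration, a minimal C sketch of that forgiving lookup, under my assumptions; name_in_cluster_conf() and the hardcoded name list are hypothetical stand-ins for the real cluster.conf parsing, not actual cman_tool code:

#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical stand-in for parsing cluster.conf: here just a
 * hardcoded list of <clusternode name="..."> entries. */
static const char *conf_names[] = { "morph-01", "morph-02", NULL };

static int name_in_cluster_conf(const char *name)
{
    int i;
    for (i = 0; conf_names[i]; i++)
        if (strcmp(conf_names[i], name) == 0)
            return 1;
    return 0;
}

/* Current, forgiving behavior: try the full uname(2) nodename; if it
 * isn't in cluster.conf, strip everything after the first '.' and try
 * the short form.  len must be > 0. */
int pick_nodename(char *nodename, size_t len)
{
    struct utsname un;
    char *dot;

    if (uname(&un) < 0)
        return -1;

    strncpy(nodename, un.nodename, len - 1);
    nodename[len - 1] = '\0';

    if (name_in_cluster_conf(nodename))
        return 0;

    dot = strchr(nodename, '.');        /* "foo.bar.com" -> "foo" */
    if (dot) {
        *dot = '\0';
        if (name_in_cluster_conf(nodename))
            return 0;
    }
    return -1;                          /* no match under either form */
}

With uname(2) returning "morph-01.lab.msp.redhat.com" and only "morph-01" in cluster.conf, this silently falls back to the short name, which is exactly the forgiving behavior in question.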
- cman (the kernel module) is the one single place from which everything
  else gets the local node name.

- when cman_tool starts cman (the kernel module), cman_tool tells cman
  what the local node name is.

- whatever cman_tool gives cman as the name is what it will be; there
  are no further checks or changes.

- how does cman_tool pick the local nodename that it gives to cman?
  1. first priority is the optional -n <nodename> cman_tool arg
  2. second (and most common) is whatever uname(2) reports

- once cman_tool gets this nodename, it checks to verify that this name
  can be found (exactly) in cluster.conf. If not, it reports an error
  (see the sketch at the end of this comment).

- if the check passes (the name is found in cluster.conf), cman_tool
  starts cman with that node name.

- when a node with <nodename> fails, cman tells fenced to fence node
  <nodename>.

- fenced looks up <nodename> in cluster.conf to find the fencing
  parameters for the victim.

- if <nodename> can't be found in cluster.conf, the victim can't be
  fenced; this is why it's required that the cman nodename matches
  whatever name is used in cluster.conf.

- this bug arose because we allowed the cman nodename to be "foo" while
  its cluster.conf entry was "foo.bar.com", and fenced didn't match them.

- this bug is fixed by cman_tool verifying that the nodename it uses is
  found exactly in cluster.conf, i.e. if the nodename is "foo", then
  "foo" must be in cluster.conf; if the nodename is "foo.bar.com", then
  "foo.bar.com" must be the cluster.conf entry. We no longer permit the
  nodename to be "foo" while the cluster.conf entry is "foo.bar.com".

What does this mean for users?

U1. "standard" method where the -n option is not used: the result of
    uname(2) needs to be used for the cluster.conf entry.

U2. if uname(2) reports the FQDN but the user wants to use short names
    for cman and in cluster.conf, then they need to use the -n option:
    cman_tool join -n <nodename>

U3. if uname(2) reports the short name but the user wants to use the
    FQDN for cman and in cluster.conf, then they also use the -n option:
    cman_tool join -n <FQDN>

Other possibilities:

- always use the short nodename everywhere (never use FQDN).

- add a cman_tool -s option to tell it to use short names (i.e. if
  uname(2) returns a FQDN it will shorten it). Nothing changes above;
  this is just another way to do U2 that's simpler.

- allow a mismatch between the cman nodename and cluster.conf entries,
  e.g. the cman nodename may be foo.bar.com while cluster.conf has foo,
  or the cman nodename may be foo while cluster.conf has foo.bar.com.
  (This requires some unsavory code to solve the bug described above
  where fenced didn't match "foo" with "foo.bar.com".)
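A minimal sketch of the strict selection-and-verify rule above, reusing the hypothetical name_in_cluster_conf() stub from the previous sketch (again, not the actual cman_tool code):

#include <stddef.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical cluster.conf lookup, as in the previous sketch. */
extern int name_in_cluster_conf(const char *name);

/* Strict version: 1. -n <nodename> wins if given; 2. otherwise
 * uname(2).  Then require an exact cluster.conf match -- no
 * shortening, no second chances. */
int pick_nodename_strict(const char *opt_n, char *out, size_t len)
{
    struct utsname un;

    if (opt_n) {
        snprintf(out, len, "%s", opt_n);        /* 1. -n argument */
    } else {
        if (uname(&un) < 0)
            return -1;
        snprintf(out, len, "%s", un.nodename);  /* 2. uname(2) */
    }

    if (!name_in_cluster_conf(out))
        return -1;  /* caller reports the error and refuses to join */

    return 0;
}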
I now realize (from experience!) that this change may be disruptive to a lot of existing users. IOW, a lot of systems are configured such that uname(2) returns the FQDN, and cman_tool has been shortening this name automatically in the past so that it matches the short name in cluster.conf. With the new enforcement, the nodename doesn't match cluster.conf and cman_tool fails with an error.

This change means that these people (like me) must now do one of:

1. set the hostname of their machine to the short version, or
2. update their cluster.conf to use the FQDN, or
3. begin using cman_tool join -n <shortname>

I'm thinking about other possibilities that aren't so disruptive to existing setups. One is to have cman_tool manipulate the nodename (if it got the nodename from uname(2), not if the nodename was given as a -n option) so that it matches whatever is found in cluster.conf (see the sketch below):

- If uname(2) returns the FQDN but cluster.conf uses short names, then
  cman_tool picks the short name as the nodename.
- If uname(2) returns the short name but cluster.conf uses the FQDN,
  then cman_tool chooses the FQDN as the nodename.

This way, we preserve the concept that the cman nodename matches what's found in cluster.conf. We just make cman_tool a little smarter about picking the nodename so that it matches what people enter in cluster.conf.
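A sketch of that adaptive matching (applied only to a uname(2)-derived name, never to one given with -n); conf_name() is a hypothetical iterator over the <clusternode name="..."> entries, not a real function:

#include <stddef.h>
#include <string.h>

/* Hypothetical: returns the i-th <clusternode name="..."> entry
 * from cluster.conf, or NULL past the last one. */
extern const char *conf_name(int i);

/* Adapt the uname(2) name to whichever form cluster.conf uses,
 * preserving the invariant that the cman nodename matches a
 * cluster.conf entry exactly. */
const char *adapt_nodename(const char *uname_name)
{
    size_t short_len = strcspn(uname_name, "."); /* "foo" in "foo.bar.com" */
    int i;

    for (i = 0; conf_name(i); i++) {
        const char *c = conf_name(i);

        if (strcmp(c, uname_name) == 0)
            return c;                       /* exact match, either form */

        /* uname gave the FQDN, cluster.conf has the short name */
        if (uname_name[short_len] == '.' &&
            strlen(c) == short_len &&
            strncmp(c, uname_name, short_len) == 0)
            return c;

        /* uname gave the short name, cluster.conf has the FQDN */
        if (uname_name[short_len] == '\0' &&
            strncmp(c, uname_name, short_len) == 0 &&
            c[short_len] == '.')
            return c;
    }
    return NULL;    /* no usable match; fail as before */
}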
This bug really shouldn't be about FQDNs at all. It's about getting cman to recognize what's in cluster.conf. Ideally, cman's mechanism should match gulm's so that this remains consistent across all components. gulm (Mike will correct me if I am wrong) defaults to using uname; however, it can be overridden by starting lock_gulmd with the --name flag. cman should probably have the same behavior as gulm. This makes switching from one cluster infrastructure to the other much easier, as you don't need to modify init scripts or nodenames in cluster.conf as well.

The most non-disruptive change from comment #5 is to make users change their cluster.conf:

1) Making them change their system's hostname seems rather inappropriate
   in my mind. They should be allowed to name their machines however
   they want. Changing the hostname allows gulm to continue operating
   should they want to switch from cman to gulm or vice versa.

2) Changing cluster.conf doesn't require us to enforce policy on users'
   clusters (i.e. what names they are allowed to use).

3) Changing the startup procedure to use shortnames is the user's
   prerogative. If they want to use shortnames and have long names in
   cluster.conf, then the -n <name> option allows them to do it.

I will add code to the init scripts that allows users to specify their nodename in /etc/sysconfig/cluster. If the value for that parameter is not present, the same default ought to be used that gulm uses, namely uname (which is not necessarily the FQDN).
cman defaults to getting the nodename from uname, like gulm, and this can be overridden with the -n flag, like gulm. This isn't sufficient to prevent critical mistakes, though. We need to do one of two things; I don't really care which:

1. what I mentioned in comment #4 -- report an error if the nodename
   (whether it came from uname or -n) doesn't match what's in
   cluster.conf. This is disruptive for people currently using cman
   from cvs.

2. what I mentioned at the end of comment #5 -- adapt the uname to
   whatever is in cluster.conf, i.e. if the user put short names in
   cluster.conf, shorten the uname if needed, or if the user put long
   names in cluster.conf, lengthen the uname if it's short.
Adding to release blocker list
This may be a separate concern, but it's naming-related so I'll add it here for now. How will people be able to specify that they want their cluster traffic to run over a private interface? Possibly by another attribute in the nodename tag that specifies which node interface the traffic should run over. Hopefully, the eventual solution to this problem will be flexible enough to let people choose the interface their cluster traffic runs over, and to let it be configured in the XML file with the rest of the configuration rather than through a bunch of extra flags.
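Purely as a sketch of what that could look like in cluster.conf; the "ifname" attribute below is made up to illustrate the idea and is not part of any existing schema:

<!-- "ifname" is a hypothetical attribute, shown only as one possible shape -->
<clusternode name="morph-01.lab.msp.redhat.com" votes="1" ifname="eth1"/>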
Assigning this to Dave, as he seems to know what's going on here.
Should be fixed.
Fix verified with FQDNs, short names, and private networks.