Description of problem:
I tried to set up heartbeat on Fedora 14. After creating/editing authkeys and ha.cf in /etc/ha.d (contents below) I tried to start it up. After some seconds the system rebooted by itself and, according to the logs, it was because of heartbeat itself. This is the relevant log section:

Dec 28 12:34:46 db-ha1 cib: [13629]: info: startCib: CIB Initialization completed successfully
Dec 28 12:34:46 db-ha1 cib: [13629]: CRIT: get_cluster_type: This installation of Pacemaker does not support the '(null)' cluster infrastructure. Terminating.
Dec 28 12:34:46 db-ha1 heartbeat: [13574]: WARN: Managed /usr/lib64/heartbeat/cib process 13629 exited with return code 100.
Dec 28 12:34:46 db-ha1 heartbeat: [13574]: EMERG: Rebooting system. Reason: /usr/lib64/heartbeat/cib

authkeys:
=======================
auth 1
1 sha1 secretpass
=======================

ha.cf:
=======================
crm on
bcast eth0
node db-ha1 db-ha2
=======================

Version-Release number of selected component (if applicable):
# rpm -q heartbeat pacemaker
heartbeat-3.0.0-0.7.0daab7da36a8.hg.fc14.x86_64
pacemaker-1.1.4-4.fc14.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Install Fedora 14, upgrade
2. Set up heartbeat with a minimal configuration, no resources yet.
3. Start heartbeat

Actual results:
System reboots.

Expected results:
No reboot, working CRM, crm_mon should show the node(s) in the cluster.

Additional info:
The reboot problem is quite critical because:
1. installation of heartbeat automatically adds it to the services run at boot
2. setting it up with a minimal configuration and trying it out reboots the system

Bang, instant reboot loop: you need to manually boot into single-user mode and remove heartbeat from the auto-started services.

The same minimal configuration on Debian Squeeze with the versions below works:

# dpkg -l heartbeat pacemaker
...
ii heartbeat 1:3.0.3-2 Subsystem for High-Availability Linux
ii pacemaker 1.0.9.1+hg15626-1 HA cluster resource manager
Curious. :( What's in your cib? Can you attach the full messages from startup to reboot and a dump of your cib.xml, or perhaps ha_report output? Also, try changing "crm yes" to "crm respawn" to keep it up long enough to debug what's happening. Thanks.
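The switch suggested above can be made with a small sed edit. This is only a sketch: the helper name set_crm_respawn is made up for illustration, and the /etc/ha.d/ha.cf path in the usage comment is taken from the original report.

```shell
#!/bin/sh
# Hypothetical helper (not part of heartbeat): rewrite the "crm" directive
# in the given ha.cf from "on" or "yes" to "respawn", keeping a .bak copy.
set_crm_respawn() {
    cf="$1"
    cp "$cf" "$cf.bak" || return 1
    sed -e 's/^crm[[:space:]][[:space:]]*on[[:space:]]*$/crm respawn/' \
        -e 's/^crm[[:space:]][[:space:]]*yes[[:space:]]*$/crm respawn/' \
        "$cf.bak" > "$cf"
}

# Usage (as root, with heartbeat stopped):
#   set_crm_respawn /etc/ha.d/ha.cf
```

All other directives are left untouched, and the original file remains available as ha.cf.bak for switching back once debugging is done.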
Created attachment 471090: cib.xml

This is the cib.xml (empty), as mentioned in the report.
Created attachment 471091: /var/log/messages
I attached cib.xml and /var/log/messages after I forced logrotate. What you can see is:
1. started heartbeat with "crm respawn"
2. stopped heartbeat
3. started heartbeat with "crm on", it rebooted the system.

As I said, the heartbeat configuration is empty: no resources were added yet, only ha.cf was set up. The effective ha.cf is below; I used the distributed template. The last 3 lines were added by me, while logfacility and auto_failback are set by default.

=============================
# grep -v "^#" ha.cf
logfacility local0
auto_failback on
crm on
bcast eth0
node db-ha1 db-ha2
=============================
The above was set up between two VMware guests on my machine, but since my host OS is also Fedora 14, I tried it natively, faking a two-node setup. Same effect: the system rebooted. I will attach the logs from my host OS, too.
Created attachment 471103: /var/log/messages from host OS
As you can see from the host's /var/log/messages, this is also a fresh setup, so there's no point in attaching cib.xml:

Dec 29 21:01:18 db-ha2 cib: [3043]: WARN: retrieveCib: Cluster configuration not found: /var/lib/heartbeat/crm/cib.xml
Dec 29 21:01:18 db-ha2 cib: [3043]: WARN: readCibXmlFile: Primary configuration corrupt or unusable, trying backup...
Dec 29 21:01:18 db-ha2 cib: [3043]: WARN: readCibXmlFile: Continuing with an empty configuration.
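The WARN lines above only reflect the missing cib.xml of a fresh setup; in the log excerpt from the original report, the entries that explain the reboot are logged at CRIT and EMERG level. A minimal sketch for pulling just those out of a long messages file, assuming the syslog line layout shown in this report; the helper name fatal_ha_lines is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only the CRIT and EMERG entries that heartbeat
# or its cib child process wrote to a syslog-style file.
fatal_ha_lines() {
    grep -E '(heartbeat|cib): \[[0-9]+\]: (CRIT|EMERG):' "$1"
}

# Usage:
#   fatal_ha_lines /var/log/messages
```

The pattern is deliberately narrow (only the heartbeat and cib tags seen in this report), so other daemons would need to be added to the first alternation.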
I tried to recompile pacemaker-1.1.4-4.fc14.src.rpm, paying close attention to the ./configure options, and saw --without-heartbeat and --without-ais among them. Looking at pacemaker.spec, I found that I need to build it with

rpmbuild --define '_with_heartbeat 1' -ba pacemaker.spec

to add heartbeat support. With the recompiled pacemaker packages, heartbeat now starts up successfully. Either the automated package build needs this extra --define option, or some conditionals need to be removed from pacemaker.spec.
Moving over to the pacemaker component.
Testing the following patch; looks like I messed up the use of the bcond macros.

@@ -173,14 +173,14 @@ resource health.
 %build
 ./autogen.sh
 %{configure} \
-    %{!?_with_heartbeat: --without-heartbeat} \
-    %{!?_with_ais: --without-ais} \
-    %{!?_with_esmtp: --without-esmtp} \
-    %{!?_with_snmp: --without-snmp} \
-    %{?_with_cman: --with-cman} \
-    %{?_with_profiling: --with-profiling} \
-    %{?_with_gcov: --with-gcov} \
-    %{?_with_tracedata --with-tracedata} \
+    %{!?with_heartbeat: --without-heartbeat} \
+    %{!?with_ais: --without-ais} \
+    %{!?with_esmtp: --without-esmtp} \
+    %{!?with_snmp: --without-snmp} \
+    %{?with_cman: --with-cman} \
+    %{?with_profiling: --with-profiling} \
+    %{?with_gcov: --with-gcov} \
+    %{?with_tracedata --with-tracedata} \
     --docdir=%{pcmk_docdir} \
     --localstatedir=%{_var} \
     --with-initdir=%{_initddir} \
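Why the --define from the earlier comment worked, and why the unpatched spec passed --without-heartbeat in a plain build, can be seen by expanding one of the old conditionals by hand. As I understand rpm's build conditionals, rpmbuild --with heartbeat defines _with_heartbeat, and the %bcond_with / %bcond_without macros turn that (or a default) into with_heartbeat, which is what the patched lines test. A minimal sketch, assuming an rpm binary is available and that _with_heartbeat is not already defined in the local macro files; the function names are illustrative only.

```shell
#!/bin/sh
# One of the conditionals from the unpatched spec.
old_cond='%{!?_with_heartbeat: --without-heartbeat}'

# Expansion in a plain build: _with_heartbeat is undefined, so the
# --without-heartbeat flag is emitted.
expand_plain() {
    rpm --eval "$old_cond"
}

# Expansion with _with_heartbeat defined on the command line, as in the
# rpmbuild workaround: the conditional expands to nothing.
expand_with_define() {
    rpm --define '_with_heartbeat 1' --eval "$old_cond"
}
```

Running expand_plain and expand_with_define side by side shows that the old conditional reacts only to the command-line macro, not to any default enabled in the spec.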
pacemaker-1.1.4-5.fc14 has been submitted as an update for Fedora 14. https://admin.fedoraproject.org/updates/pacemaker-1.1.4-5.fc14
pacemaker-1.1.4-5.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update pacemaker'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/pacemaker-1.1.4-5.fc14
pacemaker-1.1.4-5.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.