Bug 666041
Summary: | Heartbeat reboots system
---|---
Product: | [Fedora] Fedora
Reporter: | Zoltan Boszormenyi <zboszor>
Component: | pacemaker
Assignee: | Andrew Beekhof <andrew>
Status: | CLOSED ERRATA
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high
Priority: | low
Version: | 14
CC: | andrew, fdinitto, kevin, lhh
Hardware: | x86_64
OS: | Linux
Fixed In Version: | pacemaker-1.1.4-5.fc14
Doc Type: | Bug Fix
Last Closed: | 2011-01-13 23:34:24 UTC
Description
Zoltan Boszormenyi
2010-12-28 15:05:17 UTC
The reboot problem is quite critical because:

1. installation of heartbeat automatically adds it to the services run at boot
2. setting it up with a minimal configuration and trying it out reboots the system

Bang, instant reboot loop: you need to manually boot into single-user mode and remove heartbeat from the auto-started services.

The same minimal configuration on Debian Squeeze with the versions below works:

    # dpkg -l heartbeat pacemaker
    ...
    ii heartbeat 1:3.0.3-2 Subsystem for High-Availability Linux
    ii pacemaker 1.0.9.1+hg15626-1 HA cluster resource manager

Curious. :(

What's in your cib? Can you attach the full messages from startup to reboot and a dump of your cib.xml, or perhaps ha_report output? Also, try changing "crm yes" to "crm respawn" to keep it up long enough to debug what's happening. Thanks.

Created attachment 471090 [details]
cib.xml
This is the cib.xml (empty) as was said in the report.
Created attachment 471091 [details]
/var/log/messages
I attached cib.xml and /var/log/messages after I forced logrotate. What you can see is:

1. started heartbeat with "crm respawn"
2. stopped heartbeat
3. started heartbeat with "crm on", it rebooted the system.

As I said, the heartbeat configuration is empty: no resources were added yet, only ha.cf was set up. The effective ha.cf is below; I used the distributed template. The last 3 lines were added by me; logfacility and auto_failback are set as default.

=============================
# grep -v "^#" ha.cf
logfacility local0
auto_failback on
crm on
bcast eth0
node db-ha1 db-ha2
=============================

The above was set up between two VMware guests on my machine, but as my host OS is also Fedora 14, I tried it natively, faking a two-node setup. Same effect: the system rebooted. I will attach the logs from my host OS, too.

Created attachment 471103 [details]
/var/log/messages from host OS
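For anyone reproducing this, the effective ha.cf quoted earlier and its crm setting can be pulled out with a small script. This is only a sketch: the function names `effective_config` and `crm_mode` are made up for illustration, the sample file is a throwaway copy of the configuration quoted in this report, and treating the last `crm` line as the one in effect is an assumption about how heartbeat resolves repeated directives.

```shell
#!/bin/sh
# Print the effective directives of an ha.cf-style file:
# comment lines and blank lines are dropped.
effective_config() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Print the value of the last "crm" directive, or nothing if there is none.
# (Assumption: the last occurrence is the one that counts.)
crm_mode() {
    effective_config "$1" |
        awk '$1 == "crm" { mode = $2 } END { if (mode != "") print mode }'
}

# Demo on a temporary copy of the configuration from this report.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
# comment lines from the distributed template are skipped
logfacility local0
auto_failback on
crm on
bcast eth0
node db-ha1 db-ha2
EOF

echo "crm mode: $(crm_mode "$sample")"
```

To check a real node, call `crm_mode` on the installed file (typically /etc/ha.d/ha.cf). In this thread "crm on" was the setting that rebooted the system, while "crm respawn" was the one suggested for keeping it up long enough to debug.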
You can see from the host's /var/log/messages that this is also a fresh setup, so there's no point attaching cib.xml:

Dec 29 21:01:18 db-ha2 cib: [3043]: WARN: retrieveCib: Cluster configuration not found: /var/lib/heartbeat/crm/cib.xml
Dec 29 21:01:18 db-ha2 cib: [3043]: WARN: readCibXmlFile: Primary configuration corrupt or unusable, trying backup...
Dec 29 21:01:18 db-ha2 cib: [3043]: WARN: readCibXmlFile: Continuing with an empty configuration.

I tried to recompile pacemaker-1.1.4-4.fc14.src.rpm, paying close attention to the ./configure options. There I saw --without-heartbeat and --without-ais. Looking at pacemaker.spec, I found that I need to compile it with

    rpmbuild --define '_with_heartbeat 1' -ba pacemaker.spec

to add heartbeat support. With the recompiled pacemaker packages, heartbeat can now start up successfully. The automatic build of the packages needs this extra --define option, or some conditionals need to be removed from pacemaker.spec.

Moving over to the pacemaker component.

Testing the following patch; looks like I messed up the use of the bcond macros.

@@ -173,14 +173,14 @@ resource health.
 %build
 ./autogen.sh
 %{configure} \
-    %{!?_with_heartbeat: --without-heartbeat} \
-    %{!?_with_ais: --without-ais} \
-    %{!?_with_esmtp: --without-esmtp} \
-    %{!?_with_snmp: --without-snmp} \
-    %{?_with_cman: --with-cman} \
-    %{?_with_profiling: --with-profiling} \
-    %{?_with_gcov: --with-gcov} \
-    %{?_with_tracedata --with-tracedata} \
+    %{!?with_heartbeat: --without-heartbeat} \
+    %{!?with_ais: --without-ais} \
+    %{!?with_esmtp: --without-esmtp} \
+    %{!?with_snmp: --without-snmp} \
+    %{?with_cman: --with-cman} \
+    %{?with_profiling: --with-profiling} \
+    %{?with_gcov: --with-gcov} \
+    %{?with_tracedata --with-tracedata} \
     --docdir=%{pcmk_docdir} \
     --localstatedir=%{_var} \
     --with-initdir=%{_initddir} \

pacemaker-1.1.4-5.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/pacemaker-1.1.4-5.fc14

pacemaker-1.1.4-5.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with

    su -c 'yum --enablerepo=updates-testing update pacemaker'

You can provide feedback for this update here: https://admin.fedoraproject.org/updates/pacemaker-1.1.4-5.fc14

pacemaker-1.1.4-5.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
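
The macro mix-up fixed by the patch in this report can be looked for mechanically in other spec files. The sketch below assumes the spec declares its build options with %bcond_with / %bcond_without, which define with_NAME, whereas _with_NAME is only set when the option is passed explicitly (for example via `--define '_with_heartbeat 1'` as in the workaround above). Under that assumption, a conditional that tests `_with_NAME` directly ignores the bcond default. The function name `find_raw_with_conditionals` and the sample spec fragment are hypothetical, made up for illustration.

```shell
#!/bin/sh
# List lines that test the raw _with_NAME macros, i.e. the
# %{?_with_NAME...} and %{!?_with_NAME...} forms that the patch
# replaces with with_NAME. Matching lines are printed with line numbers.
find_raw_with_conditionals() {
    grep -n -E '%\{!?\?_with_[A-Za-z0-9_]+' "$1"
}

# Demo on a temporary spec fragment mixing both forms.
spec=$(mktemp)
trap 'rm -f "$spec"' EXIT
cat > "$spec" <<'EOF'
%{configure} \
    %{!?_with_heartbeat: --without-heartbeat} \
    %{?_with_cman: --with-cman} \
    %{!?with_snmp: --without-snmp} \
    --docdir=%{pcmk_docdir}
EOF

if find_raw_with_conditionals "$spec"; then
    echo "raw _with_ conditionals found; compare them with the spec's bcond declarations"
else
    echo "no raw _with_ conditionals found"
fi
```

A match is not automatically a bug: a spec that does not use the bcond macros may test `_with_NAME` on purpose, so each hit still needs to be read against how the option is declared.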