| Summary: | glusterd won't start on SSA | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Vikas Gorur <vikas> | ||||
| Component: | glusterfs | Assignee: | Csaba Henk <csaba> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | urgent | ||||||
| Version: | 1.0 | CC: | gluster-bugs, vinaraya | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2012-07-20 06:51:31 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | http://support.gluster.com/rt//Ticket/Display.html?id=3675 | ||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
Description
Vikas Gorur
2011-07-22 18:14:56 UTC
Customer (AT&T) is on SSA. GlusterFS was upgraded to 3.2.2. glusterd does not start. TRACE logs are attached.

glusterd did start once or twice, but that is very rare. 90% of the time it fails to start. /etc/hosts is fine. Both servers had only one eth0 interface. RDMA is not being used. I installed glusterfs-rdma anyway and it still didn't work. Debugging with gdb on rpcsvc_transport_create didn't yield anything. The listen() call succeeded, but glusterd still didn't start.

Looks like the problem could be in the geo-replication config. Maybe some tools required by geo-replication are missing on the system? glusterd init() calls configure_syncdaemon(), which makes a lot of system() calls without logging anything when one of them fails.

As you can see from the attached logs, the initialization has proceeded up to glusterd_uuid_init(). glusterd_restore() has a debug message (in the code) which says what value it returned, but that message is not seen in the logs. There is a good likelihood that it was never called. The only function before this was configure_syncdaemon(). Any of the system() commands it invokes could have failed. Other system logs might throw light on which command failed.

Pavan

Csaba, can you please look into this? If glusterd's init() fails because of gsync's init failure, the failure reason should be captured in the logs. Otherwise we should have masked this error and let glusterd continue.

Vijay

(In reply to comment #3)
Yeah, Pavan told me about that already. For which branch(es) do we need a fix? The patch made for master won't necessarily apply to the version used by customers who'd need it.

(In reply to comment #5)
> For which branch(es) do we need a fix? The patch made for master won't
> necessarily apply to the version used by customers who'd need it.

We need a fix for release-3.2.
Raising to P1 - customer impact

CHANGE: http://review.gluster.com/96 (Change-Id: I28de4cce140faf1b35ecdc5cbd408f21c9926341) merged in master by Vijay Bellur (vijay)

CHANGE: http://review.gluster.com/167 (Change-Id: I28de4cce140faf1b35ecdc5cbd408f21c9926341) merged in release-3.2 by Vijay Bellur (vijay)

(In reply to comment #9)
> CHANGE: http://review.gluster.com/167 (Change-Id:
> I28de4cce140faf1b35ecdc5cbd408f21c9926341) merged in release-3.2 by Vijay
> Bellur (vijay)

Any update on this, with the log enhancement patch included?

All are 'GlusterFS-Commercial' bugs, mostly related to customers a year back or so. Good to have a resolution on these issues. Moving the component considering the visibility in RHS component :-)

Closing the bug as it is for SSA. Please reopen if this behaviour is observed with RHS too.
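
For readers following the diagnosis in this thread (system() calls in configure_syncdaemon() failing without any log output), here is a minimal sketch of how a system() result can be checked and reported. It is not the merged patch and not glusterd code: `run_logged` is a hypothetical helper name, and it writes to stderr instead of using the glusterfs logging facility.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper (an assumption, not part of glusterd): run a shell
 * command via system() and report why it failed.
 * Returns 0 if the command exited with status 0, -1 otherwise. */
static int run_logged(const char *cmd)
{
    int status = system(cmd);

    if (status == -1) {
        /* system() itself failed, e.g. the child process could not be created */
        fprintf(stderr, "E: could not run '%s': %s\n", cmd, strerror(errno));
        return -1;
    }

    if (WIFEXITED(status)) {
        int code = WEXITSTATUS(status);
        if (code == 0)
            return 0;
        /* the shell conventionally exits with 127 when the command is not found */
        fprintf(stderr, "E: '%s' exited with status %d%s\n", cmd, code,
                code == 127 ? " (command not found?)" : "");
        return -1;
    }

    if (WIFSIGNALED(status)) {
        fprintf(stderr, "E: '%s' was killed by signal %d\n", cmd,
                WTERMSIG(status));
        return -1;
    }

    fprintf(stderr, "E: '%s' ended with unexpected wait status %d\n", cmd,
            status);
    return -1;
}
```

With a wrapper along these lines, a call such as `run_logged("some-missing-tool --version")` would leave a line in the log naming the command and its exit status, which is the kind of information that was missing from the TRACE logs in this report.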