Bug 1092429
| Summary: | Two instances each, of brick processes, glusterfs-nfs and quotad seen after glusterd restart | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Shruti Sampat <ssampat> | |
| Component: | quota | Assignee: | Kaushal <kaushal> | |
| Status: | CLOSED ERRATA | QA Contact: | Saurabh <saujain> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | rhgs-3.0 | CC: | dpati, esammons, kaushal, knarra, mzywusko, nsathyan, psriniva, rhs-bugs, ssamanta, storage-qa-internal | |
| Target Milestone: | --- | Keywords: | TestBlocker | |
| Target Release: | RHGS 3.0.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.6.0-4.0.el6rhs | Doc Type: | Bug Fix | |
| Doc Text: | Previously, glusterd started the quotad process with a blocking call made from the epoll thread during startup. This led to glusterd being deadlocked during startup. As a result, the daemon processes could not start and daemonize correctly, and two instances of each daemon process were observed. With this fix, quotad is started separately, leaving the epoll thread free to serve other requests. All the daemon processes now start and daemonize properly, with only a single instance of each process. | Story Points: | --- | |
| Clone Of: | ||||
| : | 1095585 | Environment: | ||
| Last Closed: | 2014-09-22 19:36:16 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1095585 | |||
| Bug Blocks: | ||||
|
Description
Shruti Sampat
2014-04-29 09:53:41 UTC
Hi,

I saw a similar issue in my setup as well. Here are the steps to reproduce:

1) Create a distributed volume.
2) Enable quota on the volume by running the command "gluster vol quota <volname> enable".
3) Stop glusterd and start it again.

The gluster CLI now hangs while trying to run the command "gluster v i".

Attaching the sos reports for the same: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1092429/

1) Not able to test the Quota process plugin in Nagios.
2) When quota is enabled on a volume, not able to test any other process plugins, i.e. glusterd, shd and nfs.

This happens because glusterd enters a deadlock. The deadlock occurs because quotad is started using 'runner_run', which blocks the calling thread. When glusterd is starting up, this happens in the epoll thread, so the epoll thread gets blocked. With the epoll thread blocked, glusterd is not able to serve any requests, including CLI commands and volfile fetch requests.

The daemonize process for glusterfs daemons happens in the following way:

* glusterd starts the parent process
* the parent process forks and creates a child daemon process
* the daemon process does the volfile fetching and initializing, and returns to the parent on success
* the parent process ends itself after the child returns

In this case, the child process cannot fetch the volfile and so never returns to the parent process. This leaves two instances of each process, neither of which works.

In the case of quotad, we get into a deadlock. Using runner_run blocks the epoll thread. But since the daemon quotad process cannot get the volfile, it blocks the parent process, which in turn blocks runner_run in glusterd.

Setting flags required to add BZs to RHS 3.0 Errata.

Kaushal, can you please review the edited doc text for technical accuracy and sign off?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html
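
The deadlock described in the analysis above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not glusterd code: Python threads stand in for the glusterd epoll thread and the quotad daemon, a queue stands in for volfile fetch requests, and a timeout stands in for what would be an indefinite hang. The names `simulate_startup` and `blocking_launch` are hypothetical and exist only for this illustration.

```python
import queue
import threading


def simulate_startup(blocking_launch, timeout=0.5):
    """Return True if the simulated daemon finished initializing.

    Hypothetical model: one "event thread" both launches the daemon and
    serves its volfile request, loosely mirroring the epoll thread.
    """
    requests = queue.Queue()          # requests waiting for the event thread
    daemon_ready = threading.Event()  # set once the daemon has its volfile

    def daemon():
        # The daemon asks for its volfile and cannot finish
        # initializing until a reply arrives.
        reply = queue.Queue()
        requests.put(reply)
        try:
            reply.get(timeout=timeout)
        except queue.Empty:
            return  # no volfile arrived; a real daemon would hang here
        daemon_ready.set()

    def event_thread():
        child = threading.Thread(target=daemon)
        child.start()
        if blocking_launch:
            # Blocking launch: wait for the daemon before serving
            # anything, so its volfile request is never answered in time.
            child.join()
        # Serve pending requests until none arrive for `timeout` seconds.
        try:
            while True:
                requests.get(timeout=timeout).put("volfile")
        except queue.Empty:
            pass
        child.join()

    worker = threading.Thread(target=event_thread)
    worker.start()
    worker.join()
    return daemon_ready.is_set()


if __name__ == "__main__":
    print("blocking launch, daemon ready:", simulate_startup(True))
    print("non-blocking launch, daemon ready:", simulate_startup(False))
```

With a blocking launch, the daemon's request is only served after the daemon has already given up, so it never becomes ready. With a non-blocking launch, the event thread is free to answer the request and the daemon initializes, which matches the behaviour described for the fix.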