Bug 1509340 - glusterd does not write pidfile correctly when forking
Summary: glusterd does not write pidfile correctly when forking
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Sanju
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-11-03 15:07 UTC by nh2
Modified: 2018-03-15 11:19 UTC (History)
3 users (show)

Fixed In Version: glusterfs-4.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-15 11:19:42 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)

Description nh2 2017-11-03 15:07:27 UTC
Description of problem:

glusterd writes its pidfile from the forked child process. It should write it from the parent process, before that one exits.

This can lead to random glusterd startup failures when using systemd.

Services that fork are usually expected to write their pidfile from the *parent* process, not the child process. This is so that whoever starts the parent process can determine whether the startup was successful or not (they inspect the pidfile after the the parent exits; if it's empty, starting failed, if it's nonempty, it succeeded).

This is also systemd's expectation. The docs in "Example 4. Traditional forking services" say:

  https://www.freedesktop.org/software/systemd/man/systemd.service.html

  "Note that the daemon should write that file before finishing with its initialization. Otherwise, systemd might try to read the file before it exists."

This is exactly what happens to glusterd, which is probably why sometimes glusterd mysteriously fails to start on distributions using systemd (and the systemd unit provided in the glusterfs source code, which uses `PIDFile`). When that happens, you simply see in `journalctl -o short-precise`:

    Nov 03 14:14:47.947284 node1 systemd[1]: Starting GlusterFS, a clustered file-system server...
    Nov 03 14:14:48.053037 node1 systemd[1]: glusterd.service: Control process exited, code=exited status=1
    Nov 03 14:14:48.055438 node1 systemd[1]: Failed to start GlusterFS, a clustered file-system server.

Here, glusterd seems to exit with code 1 without good reason.
In `/var/log/glusterfs/glusterd.log` we have the following output:

    [2017-11-03 14:14:48.023049] I [MSGID: 100030] [glusterfsd.c:2496:main] 0-/nix/store/527752abk5cmf2yrpri9bgyb46g6i1as-glusterfs-3.12.1/sbin/glusterd: Started running /nix/store/527752abk5cmf2yrpri9bgyb46g6i1as-glusterfs-3.12.1/sbin/glusterd version 3.12.1 (args: /nix/store/527752abk5cmf2yrpri9bgyb46g6i1as-glusterfs-3.12.1/sbin/glusterd -p /run/glusterd.pid --log-level=DEBUG)
    [2017-11-03 14:14:48.025480] D [MSGID: 0] [glusterfsd.c:2107:glusterfs_pidfile_update] 0-glusterfsd: pidfile /run/glusterd.pid updated with pid 1041
    [2017-11-03 14:14:48.025523] D [logging.c:1817:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
    [2017-11-03 14:14:48.027786] D [MSGID: 0] [glusterfsd.c:660:get_volfp] 0-glusterfsd: loading volume file /nix/store/527752abk5cmf2yrpri9bgyb46g6i1as-glusterfs-3.12.1/etc/glusterfs/glusterd.vol
    [2017-11-03 14:14:48.046019] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536
    [2017-11-03 14:14:48.046052] I [MSGID: 106479] [glusterd.c:1480:init] 0-management: Using /var/lib/glusterd as working directory
    [2017-11-03 14:14:48.046059] I [MSGID: 106479] [glusterd.c:1485:init] 0-management: Using /var/run/gluster as pid file working directory
    [2017-11-03 14:14:48.046157] D [MSGID: 0] [glusterd.c:475:glusterd_rpcsvc_options_build] 0-glusterd: listen-backlog value: 10
    [2017-11-03 14:14:48.047404] D [rpcsvc.c:2424:rpcsvc_init] 0-rpc-service: RPC service inited.
    [2017-11-03 14:14:48.047415] D [rpcsvc.c:1974:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
    [2017-11-03 14:14:48.047432] D [rpc-transport.c:279:rpc_transport_load] 0-rpc-transport: attempt to load file /nix/store/527752abk5cmf2yrpri9bgyb46g6i1as-glusterfs-3.12.1/lib/glusterfs/3.12.1/rpc-transport/socket.so
    [2017-11-03 14:14:48.051255] D [socket.c:4202:socket_init] 0-socket.management: Configued transport.tcp-user-timeout=0
    [2017-11-03 14:14:48.051273] D [socket.c:4220:socket_init] 0-socket.management: Reconfigued transport.keepalivecnt=9
    [2017-11-03 14:14:48.051281] I [socket.c:4305:socket_init] 0-socket.management: SSL support on the I/O path is ENABLED
    [2017-11-03 14:14:48.051285] I [socket.c:4308:socket_init] 0-socket.management: SSL support for glusterd is ENABLED
    [2017-11-03 14:14:48.051289] I [socket.c:4325:socket_init] 0-socket.management: using private polling thread
    [2017-11-03 14:14:48.052501] E [socket.c:4452:socket_init] 0-socket.management: could not load private key
    [2017-11-03 14:14:48.052513] D [socket.c:4569:init] 0-socket.management: socket_init() failed
    [2017-11-03 14:14:48.052518] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'socket' initialization failed
    [2017-11-03 14:14:48.052620] W [rpcsvc.c:1682:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
    [2017-11-03 14:14:48.052629] E [MSGID: 106243] [glusterd.c:1761:init] 0-management: creation of listener failed
    [2017-11-03 14:14:48.052638] E [MSGID: 101019] [xlator.c:503:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
    [2017-11-03 14:14:48.052665] E [MSGID: 101066] [graph.c:327:glusterfs_graph_init] 0-management: initializing translator failed
    [2017-11-03 14:14:48.052672] E [MSGID: 101176] [graph.c:698:glusterfs_graph_activate] 0-graph: init failed
    [2017-11-03 14:14:48.053702] D [logging.c:1791:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 5 extra log messages
    [2017-11-03 14:14:48.053720] D [logging.c:1794:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 5 extra log messages
    [2017-11-03 14:14:48.054826] W [glusterfsd.c:1347:cleanup_and_exit] (-->/nix/store/qka18inm6zxz0yyxhq17dq7k6bck9anv-glibc-2.25-49/lib/libpthread.so.0(+0x7234) [0x7fdb853d8234] -->/nix/store/527752abk5cmf2yrpri9bgyb46g6i1as-glusterfs-3.12.1/sbin/glusterd(glusterfs_sigwaiter+0xe5) [0x408f75] -->/nix/store/527752abk5cmf2yrpri9bgyb46g6i1as-glusterfs-3.12.1/sbin/glusterd(cleanup_and_exit+0x57) [0x408d97] ) 0-: received signum (15), shutting down
    [2017-11-03 14:14:48.054843] D [glusterfsd-mgmt.c:2690:glusterfs_mgmt_pmap_signout] 0-fsd-mgmt: portmapper signout arguments not given

Here everything begins to start as normal, but then we suddenly get `received signum (15), shutting down`.

Who has send signal 15 (SIGTERM) to glusterd? I suspect the following race condition happens:

* glusterd is started
* glusterd fork()s
* systemd sees the started process exiting
* systemd checks if the PIDFile contains a pid; it doesn't
* systemd sends SIGTERM to the entire remaining glusterd.service cgroup
* [never happens:] the fork()ed child would write the pidfile now

I have confirmed the above with

  strace -f -ttt glusterd -p /run/glusterd.pid --log-level=DEBUG

where I can see:

  1509720256.948021 open("/run/glusterd.pid", O_RDWR|O_CREAT|O_APPEND, 0666) = 6
  ...
  1509720256.949193 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe34ddcb9d0) = 1324
  ...
  strace: Process 1324 attached
  ...
  [pid  1324] 1509720256.950559 write(6, "1324\n", 5) = 5

This shows that the pidfile contents are written by the child.

Version-Release number of selected component (if applicable): 3.12

How reproducible: Always


Steps to Reproduce:
1. Run strace as I described above

Actual results:

The child process writes to the pidfile.

Expected results:

The parent process writes to the pidfile.

Additional info:

I found this issue when maintaining glusterfs on NixOS.

Comment 2 Worker Ant 2017-11-07 12:11:31 UTC
REVIEW: https://review.gluster.org/18687 (glusterfsd: making parent process to write pidfile while forking) posted (#1) for review on master by Sanju Rakonde

Comment 3 Worker Ant 2017-11-12 06:39:08 UTC
COMMIT: https://review.gluster.org/18687 committed in master by  

------------- glusterfsd: making parent process to write pidfile while forking

Issue: pid is written into glusterd.pid by child process instead of
parent process while forking.

Fix: After fork returns child pid to parent process, it falls under
default case of switch statement, call glusterfs_pidfile_update()
in default case instead of postfork label.

Change-Id: I41b616c140592bf117601bc451dfd8b934a5b640
BUG: 1509340
Signed-off-by: Sanju Rakonde <srakonde>

Comment 4 Shyamsundar 2018-03-15 11:19:42 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.0.0, please open a new bug report.

glusterfs-4.0.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000092.html
[2] https://www.gluster.org/pipermail/gluster-users/


Note You need to log in before you can comment on or make changes to this bug.