Bug 1796816 - GlusterFS pod continuously restarting
Summary: GlusterFS pod continuously restarting
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-01-31 10:09 UTC by Kannan
Modified: 2020-11-16 02:17 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-03-02 04:09:49 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Kannan 2020-01-31 10:09:29 UTC
Description of problem:
Hi,
  One of the GFS pods is being restarted continuously by its liveness probe.
We analyzed the issue, and it appears that the glusterd service is not running inside the GlusterFS pod. We also tried starting the glusterd system process manually within the GFS pod, but it failed.
Unfortunately, there are no logs on the GFS pod.
However, the kernel logs show the following:

============
Jan 29 13:29:25 chnipc3stg05 sshd[88]: Server listening on 0.0.0.0 port 2222.
Jan 29 13:29:25 chnipc3stg05 sshd[88]: Server listening on :: port 2222.
Jan 29 13:29:25 chnipc3stg05 systemd[1]: Started OpenSSH server daemon.
-- Subject: Unit sshd.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit sshd.service has finished starting up.
--
-- The start-up result is done.
Jan 29 13:29:30 chnipc3stg05 gluster-setup.sh[56]: WARNING: lvmetad is being updated, retrying (setup) for 10 more seconds.
Jan 29 13:29:31 chnipc3stg05 gluster-setup.sh[56]: WARNING: lvmetad is being updated, retrying (setup) for 9 more seconds.
Jan 29 13:29:33 chnipc3stg05 gluster-setup.sh[56]: WARNING: lvmetad is being updated, retrying (setup) for 7 more seconds.
Jan 29 13:29:34 chnipc3stg05 lvm[29]: WARNING: Device /dev/centos/root not initialized in udev database even after waiting 10000000 microseconds.
Jan 29 13:29:44 chnipc3stg05 lvm[29]: WARNING: Device /dev/sda1 not initialized in udev database even after waiting 10000000 microseconds.
Jan 29 13:29:44 chnipc3stg05 lvm[29]: WARNING: lvmetad is being updated by another command (pid 89).
Jan 29 13:29:44 chnipc3stg05 lvm[29]: WARNING: Not using lvmetad because cache update failed.
Jan 29 13:29:44 chnipc3stg05 systemd[1]: Started Device-mapper event daemon.
-- Subject: Unit dm-event.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit dm-event.service has finished starting up.
--
-- The start-up result is done.
Jan 29 13:29:44 chnipc3stg05 dmeventd[90]: dmeventd ready for processing.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_d52d723ad18d18105083a25a2f4305f1-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_8b0c99b3a3dcb63a1f0c5bf8b2a0fad8-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_23b22487ec557cc672f3323023db6db5-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_c55386147cb58d5dcebe185f485281c0-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_4b578f3b100aa970a88c98098a01bd28-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_e5566b3edcbe86ca5835aa6255e49402-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_bb4f00ab3c2b365aec924d449d7eeb1e-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_1cfbc25e070739fb7a47cf05c08a3871-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_13c56b873c2022753a10bcfafa65e662-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_d9859e18bb5479e845a854aec97b0d4a-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_93ea5aec41e3863d39d044d47003fd92-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_ace4f23ced5bf85a3d390cbbf5549200-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_fbd80d0820689924ac23cb0bc9127a78-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_800e9c39925541a6e39831673c25c398-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_c06392f07a9bd2c99b2c44c8550bb455-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_665807aa914521f050fcce0955ec88d2-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_e8b35bffea24f555a7254a117abe7cc1-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_6bd05a1bdf8dd7d495a3bc0a4ea312c7-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_8f70be56ec65b8e7710e3a646d365b43-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_5971f229bdaff0c0059463f388f9e99c-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_a46879aae36465ca07a30b3c1a454ae9-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_2856b274f9deb87a9dac7c6a54571e50-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_5402896db7a08fc123cd4e3b66eedbcc-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_f93cbd4264577c85733731114fc9f5b6-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_cab00681c0be52c94b6517c638c1b977-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_5d40677177b9b64a4299a1b797e05f4c-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_02b01c3bfc9ce29cd3186443205155e9-tpool.
Jan 29 13:29:44 chnipc3stg05 gluster-setup.sh[56]: WARNING: Device /dev/centos/root not initialized in udev database even after waiting 10000000 microseconds.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_3a1fb942bb0063afece269f505f41502-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_cacc40f930f35a7812f28dbf9c1fd817-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_637f192904ee36c66809cd5687944349-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_4c1369dfa7ff7f11725d6dad784a5404-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_dbdfecdd690ecdd1bd9ea13e17e53b67-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_1442dd7a2f019189ac5ae4d179425d0f-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[90]: Monitoring thin pool vg_7b477fb6bfdc692bf3e7b05e93e4d5f4-tp_13a1d2b6d9286ada6940ea4d4aa4ab11-tpool.
Jan 29 13:29:44 chnipc3stg05 lvm[29]: 136 logical volume(s) in volume group "vg_7b477fb6bfdc692bf3e7b05e93e4d5f4" monitored
Jan 29 13:29:44 chnipc3stg05 lvm[29]: 6 logical volume(s) in volume group "centos" monitored
Jan 29 13:29:44 chnipc3stg05 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
-- Subject: Unit lvm2-monitor.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit lvm2-monitor.service has finished starting up.
--
-- The start-up result is done.
=============
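
To make a long excerpt like the one above easier to triage, here is a minimal Python sketch that counts WARNING lines per process and records which processes logged anything at all. The function name `summarize_warnings`, the regular expression, and the shortened `SAMPLE` text are illustrative assumptions, not part of any GlusterFS tooling; the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches syslog-style journal lines such as:
#   Jan 29 13:29:44 chnipc3stg05 lvm[29]: WARNING: ...
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^\[\s:]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def summarize_warnings(text):
    """Return (WARNING count per process, set of process names seen)."""
    warnings = Counter()
    seen = set()
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip "-- Subject:" annotations and separator lines
        seen.add(m.group("proc"))
        if m.group("msg").startswith("WARNING:"):
            warnings[m.group("proc")] += 1
    return warnings, seen


# Shortened sample taken from the log in this report.
SAMPLE = """\
Jan 29 13:29:25 chnipc3stg05 systemd[1]: Started OpenSSH server daemon.
-- Subject: Unit sshd.service has finished start-up
Jan 29 13:29:30 chnipc3stg05 gluster-setup.sh[56]: WARNING: lvmetad is being updated, retrying (setup) for 10 more seconds.
Jan 29 13:29:44 chnipc3stg05 lvm[29]: WARNING: lvmetad is being updated by another command (pid 89).
Jan 29 13:29:44 chnipc3stg05 lvm[29]: WARNING: Not using lvmetad because cache update failed.
Jan 29 13:29:44 chnipc3stg05 dmeventd[90]: dmeventd ready for processing.
"""

warnings, seen = summarize_warnings(SAMPLE)
print(dict(warnings))          # WARNING counts keyed by process name
print("glusterd" in seen)      # whether glusterd logged anything in the sample
```

Run against the full excerpt, a summary like this would show that every entry comes from sshd, systemd, gluster-setup.sh, lvm, or dmeventd, with no entry from glusterd itself, which is why the glusterd logs are needed to say anything about the start failure.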

After rebooting the node, the GFS pod started successfully.
There could be a problem on our node, but I am working on the RCA for this issue. Could you please tell me what could have gone wrong to prevent the glusterd system process from starting?

Version-Release number of selected component (if applicable):
GlusterFS version: 6.0

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

The GFS pod keeps getting restarted by its liveness probe.
The liveness probe fails because the glusterd system process cannot start.

Expected results:

The GFS pod should not be restarted.

Additional info:

I have included the kernel logs in the description.

Comment 1 Mohit Agrawal 2020-02-25 04:32:33 UTC
@Kannan,

 Are you still facing the issue?
 If so, please share the glusterd logs.

Thanks,
Mohit Agrawal
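
For anyone hitting the same symptom, the glusterd logs requested above could be gathered with something like the following Python sketch. It is only an outline under stated assumptions: the pod name and namespace are placeholders supplied by the caller, the pod is assumed to run systemd (as the log in the description suggests), and `/var/log/glusterfs/glusterd.log` is assumed to be the glusterd log location in the container. The helper names `build_commands` and `collect` are hypothetical.

```python
import subprocess
import sys


def build_commands(pod, namespace):
    """Build kubectl invocations that gather glusterd diagnostics from a pod."""
    base = ["kubectl", "exec", pod, "-n", namespace, "--"]
    return [
        # Output of the previous (restarted) container instance
        ["kubectl", "logs", pod, "-n", namespace, "--previous"],
        # State of the glusterd unit inside the pod (assumes systemd in the pod)
        base + ["systemctl", "status", "glusterd", "--no-pager"],
        # Recent journal entries for glusterd
        base + ["journalctl", "-u", "glusterd", "--no-pager", "-n", "200"],
        # Assumed default glusterd log file location
        base + ["tail", "-n", "200", "/var/log/glusterfs/glusterd.log"],
    ]


def collect(pod, namespace):
    """Run each command and print its output so it can be attached to the bug."""
    for cmd in build_commands(pod, namespace):
        print("$ " + " ".join(cmd))
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout or result.stderr)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python collect_glusterd_logs.py <pod-name> <namespace>
    collect(sys.argv[1], sys.argv[2])
```

Because the liveness probe restarts the container, the output has to be captured before the next restart; the `--previous` call covers the instance that was just killed.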

Comment 2 Mohit Agrawal 2020-03-02 04:09:49 UTC
I am closing the bug. Please reopen it with logs if you face the same issue again.

Comment 3 Kannan 2020-03-03 10:02:24 UTC
@mohit, after rebooting the node, we are no longer facing this issue.


