Bug 1419633
| Summary: | [GSS] CTDB service on gluster server is stopped and cannot be started | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Otakar Masek <omasek> |
| Component: | ctdb | Assignee: | Anoop C S <anoopcs> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | surabhi <sbhaloth> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rhgs-3.1 | CC: | abhishku, amukherj, anoopcs, bmohanra, ccalhoun, omasek, psony, rcyriac, rhs-smb, rtalur, sabose, ybronhei |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | ctdb-4.6.3-4.el7rhgs | Doc Type: | Known Issue |
| Doc Text: |
CTDB fails to start on those setups where the real time schedulers have been disabled. One such example is where vdsm is installed.
Workaround:
Enable real time schedulers by running "echo 950000 > /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us" and then restart the ctdb service. Refer to the cgroup section of the Red Hat Enterprise Linux administration guide (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html-single/System_Administrators_Guide/index.html) for making this change permanent.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-09-14 08:14:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1351530 | ||
|
Description
Otakar Masek
2017-02-06 15:45:10 UTC
Here are the dots that need to be connected.

What is the problem?
ctdb is not permitted to set the scheduling preference for its threads. This should not happen, and does not happen with the same systemd unit files on non-vdsm setups.

What could be the problem?
Maybe vdsm changes something in "/sys/fs/cgroup/cpu/system.slice/cpu.rt_runtime_us".

Workaround to be tried:
1. echo 10000 > /sys/fs/cgroup/cpu/system.slice/cpu.rt_runtime_us
2. systemctl stop ctdb.service
3. systemctl start ctdb.service

Otakar replied that the following workaround was sufficient:
-------------------------------------------------------------------------------
Issue fixed after execution of:
1. echo 950000 > /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us
2. systemctl stop ctdb.service
3. systemctl start ctdb.service
I had to change the value and the path of the cpu.rt_runtime_us file because the customer is on RHEL 7.3.
-------------------------------------------------------------------------------

This problem appears on restarting a node after vdsm is installed. Yaniv, does vdsm change any global systemd settings?

It doesn't sound like a systemd issue, but the ctdb service fails to start without setting cpu.rt_runtime_us in the cgroup. I'm not aware of us touching this in vdsm scope, but maybe we do as part of the SLA stuff? Check the value before the change; maybe the default in CentOS is wrong and needs an update?

(In reply to Yaniv Bronhaim from comment #7)
> it doesn't sound like systemd issue, but the service ctdb run fails to start
> without setting cpu.rt_runtime_us in cgroup.. im not aware of touching this
> in vdsm scope, but maybe we do as part of sla stuff? check the value before
> the change, maybe the default in centos is wrong and need an update?

The default in RHEL 7 is 950000. After a vdsm installation is complete, we see that it has been changed to 0. I am not sure which package makes this change. Is there a mailing list where we can ask this question? It is for sure related to virt.
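The check-then-restart sequence discussed above can be sketched as a small script. This is only an illustration under stated assumptions: `rt_budget_status` is a hypothetical helper name, the `RT_FILE` and `APPLY` variables are invented for this sketch, and the default path is the one reported to work on RHEL 7.3 in this bug. It needs root to write the value, and it does not make the change permanent.

```shell
#!/bin/sh
# Sketch only: classify the real-time budget stored in a cpu.rt_runtime_us file.
# rt_budget_status is a hypothetical helper, not part of ctdb, vdsm or systemd.
rt_budget_status() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "missing"
        return 0
    fi
    value=$(cat "$file")
    if [ "$value" = "0" ]; then
        echo "disabled"
    else
        echo "enabled:$value"
    fi
}

# Path reported to work in this bug on RHEL 7.3; override RT_FILE for other layouts.
RT_FILE="${RT_FILE:-/sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us}"

status=$(rt_budget_status "$RT_FILE")
echo "$RT_FILE: $status"

# Apply the workaround only when explicitly requested (APPLY=1) and only
# when the budget is currently 0. Requires root.
if [ "$status" = "disabled" ] && [ "${APPLY:-0}" = "1" ]; then
    echo 950000 > "$RT_FILE"
    systemctl stop ctdb.service
    systemctl start ctdb.service
fi
```

Running it without `APPLY=1` only reports the current state, which is a safe way to record the value before changing anything, as requested in the thread.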
Is this something new? Is it reproducible every time after installing vdsm?
I tried to reproduce it on CentOS 7.2, and the file was not set at all after vdsm installation.
I tried on CentOS 7.3 to remove and re-install vdsm after setting it to 950000, and it is not changed. Same if I reinstalled libvirt.

---- snip
[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@localhost ~]# cat /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us
cat: /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us: No such file or directory

[root@localhost ~]# rpm -qa | grep vdsm
vdsm-jsonrpc-4.20.0-422.git13530cc.el7.centos.noarch
vdsm-api-4.20.0-422.git13530cc.el7.centos.noarch
vdsm-client-4.20.0-422.git13530cc.el7.centos.noarch
vdsm-python-4.20.0-422.git13530cc.el7.centos.noarch
vdsm-yajsonrpc-4.20.0-422.git13530cc.el7.centos.noarch
vdsm-4.20.0-422.git13530cc.el7.centos.x86_64
vdsm-tests-4.20.0-422.git13530cc.el7.centos.noarch
vdsm-xmlrpc-4.20.0-422.git13530cc.el7.centos.noarch
vdsm-hook-vmfex-dev-4.20.0-422.git13530cc.el7.centos.noarch
----

(In reply to Yaniv Bronhaim from comment #10)
> is this something new? is it producible always after installing vdsm?
> I tried to reproduce it over centos 7.2 and the file was not set at all
> after vdsm installation
>

It is NOT new. However, as you have observed, the file was named differently till RHEL 7.2. I don't remember the exact path, but it was certainly under /sys/fs/cgroup/.

> I tried over centos 7.3 to remove and re-install vdsm after setting it to
> 950000 and it is not changed.. same if I reinstalled libvirt

I think you did not perform a restart. I have not yet figured out how systemd+cgroups works, but after the vdsm+virt packages are installed and the machine is restarted, this option changes. Maybe there is some other config file that is changed.
>
>
> ---- snip
> [root@localhost ~]# cat /etc/redhat-release
> CentOS Linux release 7.2.1511 (Core)
> [root@localhost ~]# cat
> /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us
> cat: /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us: No such file
> or directory
>
> [root@localhost ~]# rpm -qa | grep vdsm
> vdsm-jsonrpc-4.20.0-422.git13530cc.el7.centos.noarch
> vdsm-api-4.20.0-422.git13530cc.el7.centos.noarch
> vdsm-client-4.20.0-422.git13530cc.el7.centos.noarch
> vdsm-python-4.20.0-422.git13530cc.el7.centos.noarch
> vdsm-yajsonrpc-4.20.0-422.git13530cc.el7.centos.noarch
> vdsm-4.20.0-422.git13530cc.el7.centos.x86_64
> vdsm-tests-4.20.0-422.git13530cc.el7.centos.noarch
> vdsm-xmlrpc-4.20.0-422.git13530cc.el7.centos.noarch
> vdsm-hook-vmfex-dev-4.20.0-422.git13530cc.el7.centos.noarch
> ----

This link has the best possible info on the cgroup for realtime cpu and systemd interaction:
https://www.freedesktop.org/wiki/Software/systemd/MyServiceCantGetRealtime/

I tried now with a fresh installation of the latest CentOS, ran yum upgrade, then deployed using ovirt-engine, rebooted the host, and the file still does not exist at all. I have seen this 950000 value in some setups, but I can't reproduce the description with vdsm and engine, 4.1 and master code. So I assume it's not changed by the vdsm rpm installation or the deploy flow.

Updated the doc text slightly for the release notes.

Hi Otakar,
Is there anything pending from the Engineering side? I have provided the solution in comment #16. Can you please confirm whether it worked for the customer or not?
|
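Since the thread reports two different locations for cpu.rt_runtime_us, and one host where the file did not exist at all, a lookup along these lines may help when collecting data from a node. It is a sketch under assumptions: `first_existing` is a hypothetical helper name, and only the two paths quoted in this bug are tried. Other releases may use a different layout.

```shell
#!/bin/sh
# Sketch only: print the first existing path from a list of candidates.
# first_existing is a hypothetical helper name used for illustration.
first_existing() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# The two locations quoted in this bug; neither is guaranteed to exist.
if found=$(first_existing \
    /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us \
    /sys/fs/cgroup/cpu/system.slice/cpu.rt_runtime_us); then
    echo "$found = $(cat "$found")"
else
    echo "cpu.rt_runtime_us not found under system.slice"
fi
```

Capturing this output before and after a reboot would show whether the value changes on restart, which is the open question in the comments above.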