Bug 1123557

Summary:

systemctl stop not relieable

Product:

[Fedora] Fedora

Reporter:

Harald Reindl <h.reindl>

Component:

systemd

Assignee:

systemd-maint

Status:

CLOSED EOL

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

urgent

Docs Contact:

Priority:

unspecified

Version:

CC:

johannbg, lnykryn, msekleta, s, systemd-maint, vpavlin, zbyszek

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2015-02-18 11:47:19 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
program to log signals	none
unit file	none

Description Harald Reindl 2014-07-26 12:37:24 UTC

it must not happen that i need "ExecStopPost=/usr/bin/sleep 1" for services with "Type=simple" to prevent that systemcl returns before the process has finished it's shutdown

in combination with "ExecStopPost" it looks like systemd really waits until the MAINPID finished - that is a critical major bug

that started after i removed the "ExecStopPost" from my mysqld-units because i did not see any sense in it - facing that after this change multiple times the hourly backup-job which stops both mysqld-instances and rsync the datadir damages the backup because the source-instance likely still write back it#s data

-------- Original-Nachricht --------
Betreff: Re: race conditions after SIGTERM
Datum: Fri, 25 Jul 2014 12:11:35 +0200
Von: Reindl Harald <h.reindl>
An: Mailing-List systemd <systemd-devel.org>

and it happended again

how can it be that if i have running mysqld-instances as
Type=simple directly starting mysqld that it is not safe
after "systemctl stop" returns to rsync the datadirs

i had over a long time in any mysqld-unit
ExecStopPost=/usr/bin/sleep 1
pretty sure because that happened in the past

but that is a really dirty solution
is 1 second enough, if not 2 or better 5
i have now a "sleep 6" in the backup scripts - not beautiful

after "systemctl stop" returns i expect that the MAINPID is
*really* done, has finished it's cleanups and written data
to disk and that systemd especially in case of a non-forking
daemon knows trustable when it is finished

Am 24.07.2014 13:24, schrieb Reindl Harald:
> maybe that's a systemd thing
> 
> i know Fedora 19 has not a recent systemd but the question
> remains if systemctl in case of "Type=simple" may act the same
> way while stop a service as for starting - send the SIGTERM
> and immediately return while the binary still writes data
> 
> that could explain race conditions like below
> 
> * stop mysqld instance 1
> * stop mysqld instance 2
> * the services still flush data but "systemctl" already returned
> * rsync both datadir
> * corruption
> 
> -------- Original-Nachricht --------
> Betreff: race conditions after SIGTERM
> Datum: Thu, 24 Jul 2014 12:42:51 +0200
> Von: Reindl Harald <h.reindl>
> An: Mailing-List mariadb <"maria-discuss"@lists.launchpad.net>
> 
> how can that script lead to a race condition where files
> are not fully written to disk? that never happens if the
> systemd-unit for the replication instance has
> "ExecStopPost=/usr/bin/sleep 1" and waits a while
> 
> my only explaination is that mysqld returns after the SIGTERM
> from systemd before all data are completly written in some racy
> situations and so rsynced incompletly to the datadir of the
> other instance
> _________________________________________________________
> 
> #!/bin/bash
> systemctl stop replication.service
> systemctl stop mysqld.service
> rsync $RSYNC_PARAMS /mysql_replication/ /mysql_data/
> systemctl start replication.service
> systemctl start mysqld.service
> _________________________________________________________
> 
> [Unit]
> Description=MariaDB Replication
> 
> [Service]
> Type=simple
> PIDFile=/run/mysqld/mysqld_replication.pid
> ExecStart=/usr/libexec/mysqld --defaults-file=/etc/my-replication.cnf --pid-file=/run/mysqld/mysqld_replication.pid
> --socket=/var/lib/mysql/mysql_replication.sock --open-files-limit=30000 --basedir=/usr --user=mysql
> Environment="LANG=en_GB.UTF-8"
> Restart=always
> RestartSec=1
> _________________________________________________________
> 
> 140724 12:22:59 [Note] /usr/libexec/mysqld: Shutdown complete
> 140724 12:23:01 [Note] Plugin 'InnoDB' is disabled.
> Cannot find checkpoint record at LSN (1,0x35767)
> 140724 12:23:01 [ERROR] mysqld: Aria recovery failed. Please run aria_chk -r on all Aria tables and delete all
> aria_log.######## files
> 140724 12:23:01 [ERROR] Plugin 'Aria' init function returned error.
> 140724 12:23:01 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
> 140724 12:23:01 [Note] Plugin 'FEDERATED' is disabled.
> 140724 12:23:01 [Note] Plugin 'FEEDBACK' is disabled.
> 140724 12:23:01 [ERROR] Aria engine is not enabled or did not start. The Aria engine must be enabled to continue as
> mysqld was configured with --with-aria-tmp-tables
> 140724 12:23:01 [ERROR] Aborting
> 140724 12:23:01 [Note] /usr/libexec/mysqld: Shutdown complete
> 140724 12:23:03 [Note] Plugin 'InnoDB' is disabled.
> Cannot find checkpoint record at LSN (1,0x35767)

Comment 1 Zbigniew Jędrzejewski-Szmek 2014-07-27 02:01:29 UTC

Created attachment 921269 [details]
program to log signals

Hi Harald,

I tried to reproduce your bug (see attached program), but everything seems to work fine. Maybe you should change to KillMode=process?

lip 26 21:53:24 ameba systemd[1664]: Starting sigtest.service...
lip 26 21:53:24 ameba systemd[1664]: Started sigtest.service.
lip 26 21:53:24 ameba sigtest[4146]: 4146 running
lip 26 21:53:24 ameba sigtest[4146]: 4147 running
lip 26 21:53:26 ameba systemd[1664]: Stopping sigtest.service...
lip 26 21:53:26 ameba sigtest[4146]: 4146 got signal 15 from 1664
lip 26 21:53:26 ameba sigtest[4146]: killing child
lip 26 21:53:26 ameba sigtest[4146]: 4147 got signal 15 from 4146
lip 26 21:53:31 ameba sigtest[4146]: 4147 exiting
lip 26 21:53:31 ameba sigtest[4146]: child is dead
lip 26 21:53:36 ameba sigtest[4146]: 4146 exiting
lip 26 21:53:36 ameba systemd[1664]: Stopped sigtest.service.

Note: tested systemd-208 and -215.

Comment 2 Zbigniew Jędrzejewski-Szmek 2014-07-27 02:06:30 UTC

Created attachment 921270 [details]
unit file

Comment 3 Harald Reindl 2014-07-27 08:27:45 UTC

with "ExecStopPost=/usr/bin/usleep 1" the problem seems not to exist, so it's not a matter of how long "ExecStopPost" is waiting, it just appears to change some internal logic 

i guess "/usr/bin/true" would have the same result, i change now all the units with Type=simple to "ExecStopPost=/usr/bin/true" and if it don't happen again within the next few days it's for sure confirmed 

i am very sure anyways beacuse as long the ExecStopPost was there not a single error in the backups-sripts for at least 7 months and after remove that line once a day on a random backup-machine (the mysqld.service / replication.service constrcut exists several times for different servers)

Comment 4 Harald Reindl 2014-07-27 09:24:48 UTC

> Note: tested systemd-208 and -215

Uhm F19: systemd-204-20.fc19.x86_64

could you please test that with the F19 version too at least we could be sure that it may be fixed in later versions, and if that's the case it would be interesting for systemd-upstream what happened really in the past fixed possibly unintentional and prevent such races in the future

F20 upgrade for production is on my todo-list (originally was scheduled for May but too much systemd problems at that point in time and now to busy) and because the data corruption happens "only" once or twice per 24 hours by having the stop/sync/start-tasks hourly on several machines it's hard to reproduce (i guess most of the time mysqld writes his stuff down quickly enough)

Comment 5 Zbigniew Jędrzejewski-Szmek 2014-07-27 14:34:46 UTC

I tested systemd-204-20.fc19.x86_64, and it behaves the same.

Ah, you have PIDFile= set. I don't think that it works.
See https://bugzilla.redhat.com/show_bug.cgi?id=723942.

Comment 6 Harald Reindl 2014-07-27 16:28:57 UTC

with "ExecStartPost=/usr/bin/true" it behaves like expected

"PIDFile=" is from the Fedora service which IMHO is completly wrong by still using mysqld_safe since systemd can do it's job alone, i removed that now and guess "ExecStartPost=/usr/libexec/mysqld-wait-ready $MAINPID" works independent of that which is an essential part to not fail followup services

however, i will keep "ExecStopPost=/usr/bin/true" because what ever it changes in some semantics is good and safe and i don't get why "PIDFile=" in my case could do any harm in my case because there is no sceond process likein the Fedora mysqld.service

____________________________________________________________


[root@testserver:~]$ cat /usr/lib/systemd/system/mysqld.service
[Unit]
Description=MariaDB Database
Before=postfix.service dovecot.service dbmail-imapd.service dbmail-lmtpd.service dbmail-pop3d.service dbmail-postfix-policyd.service

[Service]
Type=simple
User=mysql
Group=mysql
NoNewPrivileges=yes
ExecStart=/usr/libexec/mysqld --defaults-file=/etc/my.cnf --pid-file=/run/mysqld/mysqld.pid --open-files-limit=750000
ExecStartPost=/usr/libexec/mysqld-wait-ready $MAINPID
ExecStopPost=/usr/bin/true
Environment="LANG=en_GB.UTF-8"
Restart=always
RestartSec=1
TimeoutSec=300
LimitNOFILE=infinity
LimitMEMLOCK=infinity
OOMScoreAdjust=-1000
SysVStartPriority=64
PrivateTmp=true
CapabilityBoundingSet=~CAP_DAC_OVERRIDE CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_SETGID CAP_SETUID CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_MODULE CAP_SYS_PTRACE
ReadOnlyDirectories=/etc
ReadOnlyDirectories=/usr
ReadOnlyDirectories=/proc
ReadOnlyDirectories=/sys
ReadOnlyDirectories=/var/lib
ReadWriteDirectories=/var/lib/mysql
InaccessibleDirectories=/boot
InaccessibleDirectories=/home
InaccessibleDirectories=/root
InaccessibleDirectories=/media
InaccessibleDirectories=/usr/local/scripts
InaccessibleDirectories=/var/lib/dbus
InaccessibleDirectories=/var/lib/rpm
InaccessibleDirectories=/var/lib/systemd
InaccessibleDirectories=/var/lib/yum
InaccessibleDirectories=/var/spool
InaccessibleDirectories=/run/dbus
InaccessibleDirectories=/run/lock
InaccessibleDirectories=/run/mount
InaccessibleDirectories=/run/systemd/generator
InaccessibleDirectories=/run/systemd/system
InaccessibleDirectories=/run/systemd/users
InaccessibleDirectories=/run/udev
InaccessibleDirectories=/run/user

[Install]
WantedBy=multi-user.target

Comment 7 Zbigniew Jędrzejewski-Szmek 2014-07-27 19:25:19 UTC

(In reply to Harald Reindl from comment #6)
> with "ExecStartPost=/usr/bin/true" it behaves like expected
Could be a timing issue.

> "PIDFile=" is from the Fedora service which IMHO is completly wrong by still
> using mysqld_safe since systemd can do it's job alone, i removed that now
> and guess "ExecStartPost=/usr/libexec/mysqld-wait-ready $MAINPID" works
> independent of that which is an essential part to not fail followup services
mariadb.service has lost the PIDFile setting.

mysqld.service from community-mysql-server still has it.

> however, i will keep "ExecStopPost=/usr/bin/true" because what ever it
> changes in some semantics is good and safe and i don't get why "PIDFile=" in
> my case could do any harm in my case because there is no sceond process
> likein the Fedora mysqld.service
I'll have a look at the Type=simple + PIDFile support.


19:56 < sztanpet> zbyszek btw even on type=forking systemd reads the pidfile with inotify because of the countless broken daemons afaik
19:56 < sztanpet> faintly remember mbiebl commiting the fix for it, but could be wrong

Comment 8 Harald Reindl 2014-07-27 19:43:40 UTC

> mariadb.service has lost the PIDFile setting

not at RHEL7 at least!

> I'll have a look at the Type=simple + PIDFile support

thanks, looks like i don't need it in the future, but the pitfall 
should be cleared up in some way - i tried to recall the history 
here

* switched to MariaDB 
* the first issues because the aria-logs
* no time and failing backup scripts
* added the ExecStopPost as dirty hack
* faced the hack last week, thouhgt WTF, removed it
* issues again

so my primary intention for the reports was if that affects me this way without complete understanding what happens in the backround that may hit other services and users too - especially because the problem maybe hidden for a very long time until it hits at a real bad moment

Comment 9 Harald Reindl 2014-08-02 18:11:23 UTC

i had the problem of broken aria-logs with the mysql replication-to-hot-backup scripts again today on two machines 

keep in mind my only change was to remove "ExecStopPost=/usr/bin/sleep 1" from my mysqld.service and replication.service and the reason to add them where surely that problem appearing randomly after switch to Fedora 19/MariaDB 2014/01 and never happend with the "sleep 1" hack

until today i thought the "ExecStopPost=/usr/bin/true" has he same result

we are talking about systemd / MariaDB on *Fedora 19* not 20/21/Rawhide

Comment 10 Harald Reindl 2014-08-10 09:40:47 UTC

PING: systemd-204-20.fc19.x86_64

you said "Note: tested systemd-208 and -215" which don't say 
much until F19 has 204 - there is a reason why bugreporters
select the Fedora version

Comment 11 Harald Reindl 2014-08-10 19:18:03 UTC

i get the impression that this is a randomly hitting regression of the last systemd update since i had the same tradegy today on two servers and looking at the yum log and date of this bugreport and my first mail to mariadb how that is possible at all was a day later

so AGAIN please don't check the behavior of 208/215

Jul 23 09:09:28 Updated: systemd-libs-204-20.fc19.x86_64
Jul 23 09:09:30 Updated: systemd-204-20.fc19.x86_64
Jul 23 09:09:30 Updated: systemd-sysv-204-20.fc19.x86_64

Comment 12 Fedora End Of Life 2015-01-09 21:27:58 UTC

This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 13 Fedora End Of Life 2015-02-18 11:47:19 UTC

Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 14 Zbigniew Jędrzejewski-Szmek 2015-02-18 13:48:46 UTC

Sorry we didn't get this fixed before EOL. Please reopen if it happens in F20+.