it must not happen that i need "ExecStopPost=/usr/bin/sleep 1" for services with "Type=simple" to prevent that systemcl returns before the process has finished it's shutdown
in combination with "ExecStopPost" it looks like systemd really waits until the MAINPID finished - that is a critical major bug
that started after i removed the "ExecStopPost" from my mysqld-units because i did not see any sense in it - facing that after this change multiple times the hourly backup-job which stops both mysqld-instances and rsync the datadir damages the backup because the source-instance likely still write back it#s data
-------- Original-Nachricht --------
Betreff: Re: race conditions after SIGTERM
Datum: Fri, 25 Jul 2014 12:11:35 +0200
Von: Reindl Harald <email@example.com>
An: Mailing-List systemd <firstname.lastname@example.org>
and it happended again
how can it be that if i have running mysqld-instances as
Type=simple directly starting mysqld that it is not safe
after "systemctl stop" returns to rsync the datadirs
i had over a long time in any mysqld-unit
pretty sure because that happened in the past
but that is a really dirty solution
is 1 second enough, if not 2 or better 5
i have now a "sleep 6" in the backup scripts - not beautiful
after "systemctl stop" returns i expect that the MAINPID is
*really* done, has finished it's cleanups and written data
to disk and that systemd especially in case of a non-forking
daemon knows trustable when it is finished
Am 24.07.2014 13:24, schrieb Reindl Harald:
> maybe that's a systemd thing
> i know Fedora 19 has not a recent systemd but the question
> remains if systemctl in case of "Type=simple" may act the same
> way while stop a service as for starting - send the SIGTERM
> and immediately return while the binary still writes data
> that could explain race conditions like below
> * stop mysqld instance 1
> * stop mysqld instance 2
> * the services still flush data but "systemctl" already returned
> * rsync both datadir
> * corruption
> -------- Original-Nachricht --------
> Betreff: race conditions after SIGTERM
> Datum: Thu, 24 Jul 2014 12:42:51 +0200
> Von: Reindl Harald <email@example.com>
> An: Mailing-List mariadb <"maria-discuss"@lists.launchpad.net>
> how can that script lead to a race condition where files
> are not fully written to disk? that never happens if the
> systemd-unit for the replication instance has
> "ExecStopPost=/usr/bin/sleep 1" and waits a while
> my only explaination is that mysqld returns after the SIGTERM
> from systemd before all data are completly written in some racy
> situations and so rsynced incompletly to the datadir of the
> other instance
> systemctl stop replication.service
> systemctl stop mysqld.service
> rsync $RSYNC_PARAMS /mysql_replication/ /mysql_data/
> systemctl start replication.service
> systemctl start mysqld.service
> Description=MariaDB Replication
> ExecStart=/usr/libexec/mysqld --defaults-file=/etc/my-replication.cnf --pid-file=/run/mysqld/mysqld_replication.pid
> --socket=/var/lib/mysql/mysql_replication.sock --open-files-limit=30000 --basedir=/usr --user=mysql
> 140724 12:22:59 [Note] /usr/libexec/mysqld: Shutdown complete
> 140724 12:23:01 [Note] Plugin 'InnoDB' is disabled.
> Cannot find checkpoint record at LSN (1,0x35767)
> 140724 12:23:01 [ERROR] mysqld: Aria recovery failed. Please run aria_chk -r on all Aria tables and delete all
> aria_log.######## files
> 140724 12:23:01 [ERROR] Plugin 'Aria' init function returned error.
> 140724 12:23:01 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
> 140724 12:23:01 [Note] Plugin 'FEDERATED' is disabled.
> 140724 12:23:01 [Note] Plugin 'FEEDBACK' is disabled.
> 140724 12:23:01 [ERROR] Aria engine is not enabled or did not start. The Aria engine must be enabled to continue as
> mysqld was configured with --with-aria-tmp-tables
> 140724 12:23:01 [ERROR] Aborting
> 140724 12:23:01 [Note] /usr/libexec/mysqld: Shutdown complete
> 140724 12:23:03 [Note] Plugin 'InnoDB' is disabled.
> Cannot find checkpoint record at LSN (1,0x35767)
Created attachment 921269 [details]
program to log signals
I tried to reproduce your bug (see attached program), but everything seems to work fine. Maybe you should change to KillMode=process?
lip 26 21:53:24 ameba systemd: Starting sigtest.service...
lip 26 21:53:24 ameba systemd: Started sigtest.service.
lip 26 21:53:24 ameba sigtest: 4146 running
lip 26 21:53:24 ameba sigtest: 4147 running
lip 26 21:53:26 ameba systemd: Stopping sigtest.service...
lip 26 21:53:26 ameba sigtest: 4146 got signal 15 from 1664
lip 26 21:53:26 ameba sigtest: killing child
lip 26 21:53:26 ameba sigtest: 4147 got signal 15 from 4146
lip 26 21:53:31 ameba sigtest: 4147 exiting
lip 26 21:53:31 ameba sigtest: child is dead
lip 26 21:53:36 ameba sigtest: 4146 exiting
lip 26 21:53:36 ameba systemd: Stopped sigtest.service.
Note: tested systemd-208 and -215.
Created attachment 921270 [details]
with "ExecStopPost=/usr/bin/usleep 1" the problem seems not to exist, so it's not a matter of how long "ExecStopPost" is waiting, it just appears to change some internal logic
i guess "/usr/bin/true" would have the same result, i change now all the units with Type=simple to "ExecStopPost=/usr/bin/true" and if it don't happen again within the next few days it's for sure confirmed
i am very sure anyways beacuse as long the ExecStopPost was there not a single error in the backups-sripts for at least 7 months and after remove that line once a day on a random backup-machine (the mysqld.service / replication.service constrcut exists several times for different servers)
> Note: tested systemd-208 and -215
Uhm F19: systemd-204-20.fc19.x86_64
could you please test that with the F19 version too at least we could be sure that it may be fixed in later versions, and if that's the case it would be interesting for systemd-upstream what happened really in the past fixed possibly unintentional and prevent such races in the future
F20 upgrade for production is on my todo-list (originally was scheduled for May but too much systemd problems at that point in time and now to busy) and because the data corruption happens "only" once or twice per 24 hours by having the stop/sync/start-tasks hourly on several machines it's hard to reproduce (i guess most of the time mysqld writes his stuff down quickly enough)
I tested systemd-204-20.fc19.x86_64, and it behaves the same.
Ah, you have PIDFile= set. I don't think that it works.
with "ExecStartPost=/usr/bin/true" it behaves like expected
"PIDFile=" is from the Fedora service which IMHO is completly wrong by still using mysqld_safe since systemd can do it's job alone, i removed that now and guess "ExecStartPost=/usr/libexec/mysqld-wait-ready $MAINPID" works independent of that which is an essential part to not fail followup services
however, i will keep "ExecStopPost=/usr/bin/true" because what ever it changes in some semantics is good and safe and i don't get why "PIDFile=" in my case could do any harm in my case because there is no sceond process likein the Fedora mysqld.service
[root@testserver:~]$ cat /usr/lib/systemd/system/mysqld.service
Before=postfix.service dovecot.service dbmail-imapd.service dbmail-lmtpd.service dbmail-pop3d.service dbmail-postfix-policyd.service
ExecStart=/usr/libexec/mysqld --defaults-file=/etc/my.cnf --pid-file=/run/mysqld/mysqld.pid --open-files-limit=750000
CapabilityBoundingSet=~CAP_DAC_OVERRIDE CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_SETGID CAP_SETUID CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_MODULE CAP_SYS_PTRACE
(In reply to Harald Reindl from comment #6)
> with "ExecStartPost=/usr/bin/true" it behaves like expected
Could be a timing issue.
> "PIDFile=" is from the Fedora service which IMHO is completly wrong by still
> using mysqld_safe since systemd can do it's job alone, i removed that now
> and guess "ExecStartPost=/usr/libexec/mysqld-wait-ready $MAINPID" works
> independent of that which is an essential part to not fail followup services
mariadb.service has lost the PIDFile setting.
mysqld.service from community-mysql-server still has it.
> however, i will keep "ExecStopPost=/usr/bin/true" because what ever it
> changes in some semantics is good and safe and i don't get why "PIDFile=" in
> my case could do any harm in my case because there is no sceond process
> likein the Fedora mysqld.service
I'll have a look at the Type=simple + PIDFile support.
19:56 < sztanpet> zbyszek btw even on type=forking systemd reads the pidfile with inotify because of the countless broken daemons afaik
19:56 < sztanpet> faintly remember mbiebl commiting the fix for it, but could be wrong
> mariadb.service has lost the PIDFile setting
not at RHEL7 at least!
> I'll have a look at the Type=simple + PIDFile support
thanks, looks like i don't need it in the future, but the pitfall
should be cleared up in some way - i tried to recall the history
* switched to MariaDB
* the first issues because the aria-logs
* no time and failing backup scripts
* added the ExecStopPost as dirty hack
* faced the hack last week, thouhgt WTF, removed it
* issues again
so my primary intention for the reports was if that affects me this way without complete understanding what happens in the backround that may hit other services and users too - especially because the problem maybe hidden for a very long time until it hits at a real bad moment
i had the problem of broken aria-logs with the mysql replication-to-hot-backup scripts again today on two machines
keep in mind my only change was to remove "ExecStopPost=/usr/bin/sleep 1" from my mysqld.service and replication.service and the reason to add them where surely that problem appearing randomly after switch to Fedora 19/MariaDB 2014/01 and never happend with the "sleep 1" hack
until today i thought the "ExecStopPost=/usr/bin/true" has he same result
we are talking about systemd / MariaDB on *Fedora 19* not 20/21/Rawhide
you said "Note: tested systemd-208 and -215" which don't say
much until F19 has 204 - there is a reason why bugreporters
select the Fedora version
i get the impression that this is a randomly hitting regression of the last systemd update since i had the same tradegy today on two servers and looking at the yum log and date of this bugreport and my first mail to mariadb how that is possible at all was a day later
so AGAIN please don't check the behavior of 208/215
Jul 23 09:09:28 Updated: systemd-libs-204-20.fc19.x86_64
Jul 23 09:09:30 Updated: systemd-204-20.fc19.x86_64
Jul 23 09:09:30 Updated: systemd-sysv-204-20.fc19.x86_64
This message is a notice that Fedora 19 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 19. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 19 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
Thank you for reporting this bug and we are sorry it could not be fixed.
Sorry we didn't get this fixed before EOL. Please reopen if it happens in F20+.