Bug 836442

Summary: shutdown -r now in virtual machine often leaves problems
Product: Fedora
Reporter: Trever Adams <trever>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 17
CC: amit.shah, berrange, cfergeau, devrim, dwmw2, esandeen, hhorak, itamar, knoel, pbonzini, rjones, scottt.tw, tgl, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-04-03 11:05:34 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Trever Adams 2012-06-29 02:27:42 EDT
Description of problem:
Not always, but frequently when doing a reboot in a qemu-kvm guest (AMD with virtualization) I get the following: pg_ctl: invalid data in PID file "/var/lib/pgsql/data/postmaster.pid"

Removing the pid file and restarting fixes the problem, but it is a pain to have to remember this. It started showing up for me about three weeks ago.


Version-Release number of selected component (if applicable):
postgresql-9.1.4-1.fc17.x86_64
postgresql-libs-9.1.4-1.fc17.x86_64
postgresql-server-9.1.4-1.fc17.x86_64
postgresql-upgrade-9.1.4-1.fc17.x86_64


How reproducible:
I am not sure. I have four VMs with postgresql. It seems that at least one does it nearly every reboot. They are low memory VMs. (768M or less)

Additional info:
Three of these are DSPAM setups. One, with the highest load of the three, seems to have it happen more frequently. That VM has 768M.

The other is a Zabbix setup with high load and 768M. I think it is the next most frequent.
Comment 1 Tom Lane 2012-06-29 02:45:12 EDT
This seems reminiscent of some reports that we saw back when postgres did not fsync the postmaster.pid file while creating it.  9.1.4 certainly does so, though, which says to me that this is probably a bug in the VM software (ie, failing to pass through an fsync request).

Out of curiosity, what is in the postmaster.pid file exactly when you see this error?
Comment 2 Trever Adams 2012-06-29 04:08:15 EDT
As the pid is on a tmpfs, I cannot explain this behavior at all.

It will take me a while to reproduce this bug. I will post an answer as soon as I am able.
Comment 3 Tom Lane 2012-06-29 10:13:33 EDT
You've got /var/lib/pgsql/data in a tmpfs?  That's uncommon ... but if so, this is really weird, because the tmpfs ought to come up empty after a reboot.
Comment 4 Trever Adams 2012-06-29 13:06:47 EDT
Sorry. I know most pids are in /var/run and my brain just burped. Yes, this pid is NOT tmpfs. Sorry for the confusion. I was unable to get a pid problem on the reboot I just did. I will keep trying until I can show you the contents.
Comment 5 Honza Horak 2012-07-03 11:36:59 EDT
Trever, if you encounter the issue again, it would be good to also attach the postgresql-related messages from the systemd log.

For booting with more verbose systemd log you can use "log_buf_len=1M systemd.log_level=debug systemd.log_target=kmsg" as kernel parameters.
Comment 6 Trever Adams 2012-07-08 15:49:20 EDT
The PID file is zero length.

Jul  8 13:46:07 machine pg_ctl[8339]: pg_ctl: invalid data in PID file "/var/lib/pgsql/data/postmaster.pid"
Jul  8 13:46:07 machine systemd[1]: postgresql.service: control process exited, code=exited status=1
Jul  8 13:46:07 machine systemd[1]: Unit postgresql.service entered failed state.

I have appended those options to my boot string, but I will have to wait for them to take effect.
Comment 7 Tom Lane 2012-07-08 19:05:20 EDT
(In reply to comment #6)
> The PID file is zero length.

That's pretty much what I expected.  This indicates that the filesystem metadata showing that the PID file exists reaches disk, but the actual file contents do not, despite the fsync() call that PG has made.  Now it could be that you're using unsafe filesystem settings, but otherwise this is a kernel or VM bug.

I'm going to make a guess and reassign this to qemu, but that's mostly a guess ...
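The failure mode Tom describes (the directory entry reaches disk but the file contents do not) is what can happen when a file is written without fsyncing both the file and its containing directory. A minimal Python sketch of the safe ordering, purely illustrative — PostgreSQL's actual implementation is C and differs in detail:

```python
import os
import tempfile

def write_pidfile_durably(path, pid):
    """Write a pid file so both its contents and its directory entry
    are durable: fsync the file, rename it into place, then fsync the
    directory. (Illustrative sketch only, not PostgreSQL's code.)"""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        os.write(fd, ("%d\n" % pid).encode())
        os.fsync(fd)           # force the file data to stable storage
    finally:
        os.close(fd)
    os.rename(tmp, path)       # atomically replace any old pid file
    dfd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dfd)          # force the directory entry to disk
    finally:
        os.close(dfd)
```

If the fsync on the file is dropped somewhere in the stack (guest kernel, qemu, host filesystem), a crash or hard reset between the metadata commit and the data commit leaves exactly the zero-length file reported here.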
Comment 8 Trever Adams 2012-07-08 23:02:14 EDT
/dev/mapper/vg_machine-lv_root on / type ext4 (rw,relatime,seclabel,data=ordered) is how it is mounted in the vm

/dev/mapper/Family-lv_family on /var/Family type ext4 (rw,relatime,seclabel,data=ordered) is how the filesystem where the drive images are stored is mounted

I use virtio for network and drives (drive images in this case).

I can provide more information when it is requested. I have tried to include everything I have that may be relevant.
Comment 9 Richard W.M. Jones 2012-07-09 03:18:06 EDT
Is your VM stored in a qcow2 file and does qemu segfault
during the reboot?  (bug 836913)

You would have to enable coredumps and make sure they are
being captured centrally by something like abrt, or if you
don't want to use abrt, then try a technique like the one
described here:
http://people.redhat.com/~rjones/xen-stress-tests/
(By the way the last thing there about restarting services
to set the ulimit probably doesn't work with systemd)
Comment 10 Avi Kivity 2012-07-09 13:18:44 EDT
There is an ext4 bug which causes data loss on shutdown.  Perhaps this is the same bug.  Will try to search for it.
Comment 11 Avi Kivity 2012-07-09 13:26:30 EDT
Candidate bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=721114
https://bugzilla.redhat.com/show_bug.cgi?id=749527
https://bugzilla.redhat.com/show_bug.cgi?id=818172

(all dups of each other - the last one contains patches)
Comment 12 Avi Kivity 2012-07-09 13:37:51 EDT
Those patches are in 2.3.36, so F17 should have them.
Comment 13 Eric Sandeen 2012-07-09 13:52:12 EDT
"It started showing up for me about three weeks ago." - and it was ok before then?  Do you have logs of what packages got updated (either on the host or the guests, I guess?)

So the .pid file is showing up as 0-length on a reboot? But a .pid file shouldn't even persist through a reboot, should it?

This is older postgresql, but:

[root@sandeen ~]# /etc/init.d/postgresql status
postmaster is stopped
[root@sandeen ~]# ls /var/lib/pgsql/data/postmaster.pid
ls: /var/lib/pgsql/data/postmaster.pid: No such file or directory
[root@sandeen ~]# /etc/init.d/postgresql start
Starting postgresql service:                               [  OK  ]
[root@sandeen ~]# ls /var/lib/pgsql/data/postmaster.pid
/var/lib/pgsql/data/postmaster.pid
[root@sandeen ~]# /etc/init.d/postgresql stop
Stopping postgresql service:                               [  OK  ]
[root@sandeen ~]# ls /var/lib/pgsql/data/postmaster.pid
ls: /var/lib/pgsql/data/postmaster.pid: No such file or directory

so if things shut down cleanly there should be no pid file at all, right?
Comment 14 Trever Adams 2012-07-12 16:32:33 EDT
I am using raw images as I plan on migrating these to real hardware at some point. And, no, qemu doesn't segfault on me.

The .pid file is 0 length but present. It seems to me there is either a filesystem problem, or a problem in postgresql's stop or start handling.

It might be the ext4 bug, but I believe I only see it with postgresql pid file. I cannot look at the bugs. Apparently there are security related things which lock me out.

Should I attach the entire yum.log (a few months worth because of some logrotate problem)? I am not sure on the exact time the bug started.
Comment 15 Eric Sandeen 2012-07-12 17:17:42 EDT
What are the timestamps on the pid file, maybe that's interesting, to see how they relate to the time of shutdown?

I think it's up to /usr/bin/postmaster to create & remove the pidfile.  Perhaps stracing it would help?  Does a manual /etc/init.d/postmaster [stop|start] do the right thing outside of system shutdown?
Comment 16 Trever Adams 2012-07-18 18:30:44 EDT
I will have to wait until the next failure to get timestamps and other things. I don't like to reboot these VMs or the host machine more than I must.

Manually shutting down (service postgresql restart) has failed to produce the problem.
Comment 17 Avi Kivity 2012-07-19 05:44:39 EDT
Is it possible for you to clone the VM and run the tests on a clone?
Comment 18 Trever Adams 2012-07-19 18:26:18 EDT
It might be possible if I can run it on a different machine. The host in question is about maxed out. I have access to others with similar hardware, but not identical.
Comment 19 Trever Adams 2012-07-20 02:21:17 EDT
[root@machine ~]# stat /var/lib/pgsql/data/postmaster.pid 
  File: `/var/lib/pgsql/data/postmaster.pid'
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: fd00h/64768d	Inode: 1704046     Links: 1
Access: (0600/-rw-------)  Uid: (   26/postgres)   Gid: (   26/postgres)
Context: system_u:object_r:postgresql_db_t:s0
Access: 2012-07-20 00:15:22.736309720 -0600
Modify: 2012-07-19 19:39:01.596744269 -0600
Change: 2012-07-19 19:39:01.596744269 -0600
 Birth: -
[root@machine ~]# uptime
 00:17:46 up  4:40,  1 user,  load average: 0.64, 1.63, 1.38
[root@machine ~]# date
Fri Jul 20 00:17:51 MDT 2012

So, it appears that the file is being modified on shutdown or reboot. I haven't done the math. The machine was rebooted for glibc/kernel changes about that time (the uptime is from the vm not the host). The host appears to be about the same.
Comment 20 Eric Sandeen 2012-07-20 15:17:15 EDT
so at 00:17:46 you'd been up for 4h 40m, soo... last boot was at about 19:37:46 (you could verify in /var/log/messages I suppose), and the file was modified at 19:39:01...?  If I figured right it looks more like it's been modified during bootup, not shutdown?

I think you're going to need to find a way to trace this to get more info.
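Eric's boot-time arithmetic above can be double-checked with a short Python sketch (timestamps copied from comments 19 and 20; purely illustrative):

```python
from datetime import datetime, timedelta

# Timestamps copied from the stat/uptime output above (guest clock, MDT).
now = datetime(2012, 7, 20, 0, 17, 46)    # when `uptime` was run
uptime = timedelta(hours=4, minutes=40)   # "up 4:40"
boot = now - uptime                       # approximate boot time
mtime = datetime(2012, 7, 19, 19, 39, 1)  # pid file Modify timestamp

print(boot)          # 2012-07-19 19:37:46
print(mtime - boot)  # 0:01:15 -> modified ~75 seconds AFTER boot
```

The modify time landing about 75 seconds after the estimated boot supports the reading that the file is being written during bootup rather than shutdown.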
Comment 21 Trever Adams 2012-09-12 09:11:23 EDT
I am not sure how to do a trace where this happens on startup. It is happening less and less, but it still happens occasionally.
Comment 22 Trever Adams 2013-01-09 08:01:33 EST
Eric, I haven't been able to trigger this any other place but boot. I haven't been able to trigger it with an strace going.

It is getting harder and harder to trigger. Interestingly enough, most of the time I do not have to remove the pid file, just restart now.
Comment 23 Trever Adams 2013-04-03 10:56:33 EDT
I have not been able to trigger this in the last few months. I think it may be fixed. I know several disk corruption problems with kqemu, etc. have been fixed, and I cannot trigger this any more.
Comment 24 Eric Sandeen 2013-04-03 11:05:34 EDT
Ok, I'm going to go ahead and close this. If it comes up again, please do re-open.

Thanks,
-Eric