Bug 896161
Summary: | Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieved | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | George Machitidze <giomac> | ||||||||||
Component: | postgresql | Assignee: | Pavel Raiskup <praiskup> | ||||||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||
Severity: | unspecified | Docs Contact: | |||||||||||
Priority: | unspecified | ||||||||||||
Version: | 18 | CC: | devrim, giomac, hhorak, redhat2 | ||||||||||
Target Milestone: | --- | Keywords: | Reopened | ||||||||||
Target Release: | --- | ||||||||||||
Hardware: | Unspecified | ||||||||||||
OS: | Unspecified | ||||||||||||
Whiteboard: | |||||||||||||
Fixed In Version: | postgresql-9.2.5-1.fc18 | Doc Type: | Bug Fix | ||||||||||
Doc Text: | Story Points: | --- | |||||||||||
Clone Of: | Environment: | ||||||||||||
Last Closed: | 2013-10-29 03:40:58 UTC | Type: | Bug | ||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||
Documentation: | --- | CRM: | |||||||||||
Verified Versions: | Category: | --- | |||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
Embargoed: | |||||||||||||
Attachments: |
|
Description
George Machitidze
2013-01-16 18:41:28 UTC
Hm, worksforme. Is there anything non-default about your postgres-related unit files, configuration, etc? No, everything is default, no files changed, no systemd unit files added manually for overrides: [root@giomacdesk ~]# rpm -Va "postgre*" [root@giomacdesk ~]# I've run pg_upgrade manually, without postgresql-setup script, same error/false output. Created attachment 679882 [details]
pg_upgrade strace
strace output attached
Created attachment 679883 [details]
pg_upgrade strace
strace output attached
p.s. [root@giomacdesk ~]# getenforce Permissive The strace shows that pg_upgrade is finding a postmaster.pid lock file in the old data directory: open("/var/lib/pgsql/data-old/postmaster.pid", O_RDONLY) = 5 which seems to indicate that you didn't really shut that server down cleanly. Applying pg_controldata to the old data directory would provide some more details about exactly what state it's in, although I'm afraid you might have to reinstall 9.1.7 to get a compatible pg_controldata version. If pg_controldata claims the cluster is shut down cleanly, then we'd have to inquire into how come there's still a postmaster.pid file in it ... but if not, I think this is just pilot error. Memo to self: this case suggests it'd be worth the disk space to have the postgresql-upgrade package include the older version of pg_controldata. >open("/var/lib/pgsql/data-old/postmaster.pid", O_RDONLY) = 5
>which seems to indicate that you didn't really shut that server down cleanly.
I see, but It's pg_upgrade is responsible for all that, user interaction at that moment is not possible.
I will check status with pg_controldata and update the bug. I'm very sure DB was stopped cleanly.
Created attachment 682660 [details]
Patch for RPM spec
Surprisingly, DB was shut down according to both pg_controldata versions: [root@giomacdesk SPECS]# /usr/lib64/pgsql/postgresql-9.1/bin/pg_controldata /var/lib/pgsql/data/ pg_control version number: 903 Catalog version number: 201105231 Database system identifier: 5779942941333618555 Database cluster state: shut down pg_control last modified: Fri 04 Jan 2013 08:09:38 PM GET Latest checkpoint location: 0/E0D56900 Prior checkpoint location: 0/E0D568A8 Latest checkpoint's REDO location: 0/E0D56900 Latest checkpoint's TimeLineID: 1 Latest checkpoint's NextXID: 0/907918 Latest checkpoint's NextOID: 142607 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 1858 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 0 Time of latest checkpoint: Fri 04 Jan 2013 08:09:38 PM GET Minimum recovery ending location: 0/0 Backup start location: 0/0 Current wal_level setting: minimal Current max_connections setting: 150 Current max_prepared_xacts setting: 0 Current max_locks_per_xact setting: 64 Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value [root@giomacdesk SPECS]# pg_controldata /var/lib/pgsql/data/ WARNING: Calculated CRC checksum does not match value stored in file. Either the file is corrupt, or it has a different layout than this program is expecting. The results below are untrustworthy. pg_control version number: 903 Catalog version number: 201105231 Database system identifier: 5779942941333618555 Database cluster state: shut down pg_control last modified: Fri 04 Jan 2013 08:09:38 PM GET Latest checkpoint location: 0/E0D56900 Prior checkpoint location: 0/E0D568A8 Latest checkpoint's REDO location: 0/E0D56900 Latest checkpoint's TimeLineID: 1 Latest checkpoint's full_page_writes: off Latest checkpoint's NextXID: 907918/142607 Latest checkpoint's NextOID: 1 Latest checkpoint's NextMultiXactId: 0 Latest checkpoint's NextMultiOffset: 1858 Latest checkpoint's oldestXID: 1 Latest checkpoint's oldestXID's DB: 1357315778 Latest checkpoint's oldestActiveXID: 0 Time of latest checkpoint: Thu 01 Jan 1970 04:00:00 AM TBIT Minimum recovery ending location: 0/0 Backup start location: 0/96 Backup end location: 0/40 End-of-backup record required: yes Current wal_level setting: minimal Current max_connections setting: 0 Current max_prepared_xacts setting: 1093850759 Current max_locks_per_xact setting: 8192 Maximum data alignment: 131072 Database block size: 64 Blocks per segment of large relation: 32 WAL block size: 1996 Bytes per WAL segment: 65793 Maximum length of identifiers: 498519280 Maximum columns in an index: 0 Maximum size of a TOAST chunk: 0 Date/time type storage: floating-point numbers Float4 argument passing: by reference Float8 argument passing: by reference [root@giomacdesk SPECS]# "WARNING: Calculated CRC checksum does not match value stored in file. Either the file is corrupt, or it has a different layout than this program is expecting. The results below are untrustworthy." (In reply to comment #10) > "WARNING: Calculated CRC checksum does not match value stored in file. > Either the file is corrupt, or it has a different layout than this program > is expecting. The results below are untrustworthy." Well, that's expected when you're using the wrong pg_controldata version. But anyway, if it *was* shut down, as appears to be the case from the pg_control file, we need to figure out what's that postmaster.pid file doing there. What is in the postmaster.pid file, and how does its file mod date compare to that of the global/pg_control file? Do you remember the exact sequence of events when you were shutting down the old version? Only thing I did before this was complete FC17>FC18 upgrade via yum and reboot with fresh kernel. >"What is in the postmaster.pid file, and how does its file mod date compare to that of >the global/pg_control file?"
It's internal thing, I really can't tell how upgrade process is working there. pid file will contain postmaster PID. Probably pg_upgrade is checking it instantly, without delay.
We need someone from PostgreSQL.
I will post in mailing list...
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component. This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component. Based on discussion here: http://www.postgresql.org/message-id/flat/E1TwKHs-0005yp-39@wrigleys.postgresql.org#E1TwKHs-0005yp-39@wrigleys.postgresql.org The fix applied upstream does not do anything very special, IIRC. The point is to just check _for sure_ whether postmaster running, or not (by trying to run & stop it). But if you are sure that the postmaster is turned off & the pid file still exists, it most probably ended in non-expected way (the pid file is leftover which may be removed). I think that back-porting of this patch (which is not so trivial) is not important enough. George, if you can not recall the correct steps to reproduce this issue, I guess this bug should be closed INSUFFICIENT_DATA. One thing is worth to do (at least rawhide) to package old version of pg_controldata (it seems to be less than 100kB binary on x86_64, which is not so big pain). Thanks for your update on this bug, Pavel (In reply to Pavel Raiskup from comment #16) > One thing is worth to do (at least rawhide) to package old version of > pg_controldata (it seems to be less than 100kB binary on x86_64, which is not > so big pain). Ok, it is already packaged Rawhide. Here we go [root@main lib]# rpm -q postgresql-upgrade postgresql-upgrade-9.2.4-1.fc18.x86_64 [root@main lib]# rpm -ql postgresql-upgrade|grep controldata /usr/lib64/pgsql/postgresql-9.1/bin/pg_controldata [root@main lib]# postgresql-setup upgrade Redirecting to /bin/systemctl stop postgresql.service Upgrading database: failed See /var/lib/pgsql/pgupgrade.log for details. [root@main lib]# cat /var/lib/pgsql/pgupgrade.log Performing Consistency Checks ----------------------------- Checking current, bin, and data directories ok Checking cluster versions ok connection to database failed: fe_sendauth: no password supplied could not connect to old postmaster started with the command: "/usr/lib64/pgsql/postgresql-9.1/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/data-old" -o "-p 5432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directory='/var/lib/pgsql'" start Failure, exiting After, added trust to pg_hba: [root@main lib]# cat /var/lib/pgsql/pgupgrade.log Performing Consistency Checks ----------------------------- Checking current, bin, and data directories ok There seems to be a postmaster servicing the old cluster. Please shutdown that postmaster and try again. Failure, exiting > [snip] > > After, added trust to pg_hba: > [root@main lib]# cat /var/lib/pgsql/pgupgrade.log > Performing Consistency Checks > ----------------------------- > Checking current, bin, and data directories ok > > There seems to be a postmaster servicing the old cluster. > Please shutdown that postmaster and try again. > Failure, exiting Thanks for your reply. The old postmaster should be stopped after the authentication fail. Workaround is obvious: stop the old server by hand, fix pg_hba.conf and run 'postgresql-setup upgrade' again. Created attachment 785618 [details]
Adjust the possibly running cluster a little bit earlier.
Believed fix in attachment.
I'll probably issue an update for F18+ (if nobody complains) once upstream confirms that this patch is OK: http://www.postgresql.org/message-id/1376303849-17344-1-git-send-email-praiskup@redhat.com postgresql-9.2.4-2.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/postgresql-9.2.4-2.fc18 postgresql-9.2.4-7.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/postgresql-9.2.4-7.fc19 Package postgresql-9.2.4-2.fc18: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing postgresql-9.2.4-2.fc18' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-14828/postgresql-9.2.4-2.fc18 then log in and leave karma (feedback). postgresql-9.2.4-7.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report. postgresql-9.2.5-1.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/postgresql-9.2.5-1.fc18 Package postgresql-9.2.5-1.fc18: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing postgresql-9.2.5-1.fc18' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-19316/postgresql-9.2.5-1.fc18 then log in and leave karma (feedback). postgresql-9.2.5-1.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report. |