Version-Release number of selected component:
postgresql-server-9.3.2-1.fc20

Additional info:
reporter:        libreport-2.1.9
backtrace_rating: 4
cmdline:         'postgres: checkpointer process ' '' '' '' '' '' '' '' '' '' '' '' '' '' '' ''
crash_function:  errfinish
executable:      /usr/bin/postgres
kernel:          3.11.9-300.fc20.x86_64
runlevel:        N 5
type:            CCpp
uid:             26

Truncated backtrace:
Thread no. 1 (10 frames)
 #2 errfinish
 #3 UpdateControlFile
 #4 CreateCheckPoint
 #5 ShutdownXLOG
 #6 CheckpointerMain
 #7 AuxiliaryProcessMain
 #8 StartChildProcess
 #9 reaper
 #11 __select_nocancel at ../sysdeps/unix/syscall-template.S:81
 #12 ServerLoop
Created attachment 834358 [details] File: backtrace
Created attachment 834359 [details] File: cgroup
Created attachment 834360 [details] File: core_backtrace
Created attachment 834361 [details] File: dso_list
Created attachment 834362 [details] File: environ
Created attachment 834363 [details] File: limits
Created attachment 834364 [details] File: maps
Created attachment 834365 [details] File: open_fds
Created attachment 834366 [details] File: proc_pid_status
Created attachment 834367 [details] File: var_log_messages
The stack trace is pretty uninformative (unless you can get one with debug symbols). Can we see the postmaster log?
Created attachment 834394 [details] backtrace
Backtrace created with proper debug symbols, using gdb's "thread apply all bt full" command.
Unfortunately, I am not able to provide any other logs.
OK, so the error is being thrown from here:

#3 0x00000000004b5705 in UpdateControlFile () at xlog.c:3759

which in 9.3.2 is this code:

    fd = BasicOpenFile(XLOG_CONTROL_FILE,
                       O_RDWR | PG_BINARY,
                       S_IRUSR | S_IWUSR);
    if (fd < 0)
        ereport(PANIC,
                (errcode_for_file_access(),
                 errmsg("could not open control file \"%s\": %m",
                        XLOG_CONTROL_FILE)));

It would sure be interesting to know what errno was reported, but without the postmaster log we're probably not going to find that out (unless you can dig into the core dump? What we'd want to look at is the contents of elog.c's errordata[0] struct.)

In any case, the control file certainly ought to be there and be readable. So this is looking like user error or filesystem misfeasance, and not anything particularly exciting in postgres itself.
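For readers following along: the check quoted above comes down to an open() call whose failure reason lives in errno, which is what the %m in the message expands to. Here is a minimal standalone sketch in plain POSIX C. It is not PostgreSQL code; try_open_rdwr is a hypothetical helper, and any path passed to it is the caller's choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): try to open `path`
 * read-write, roughly as the quoted check does for the control file.
 * Returns the file descriptor on success. On failure returns -1 and
 * stores the errno left by open() in *saved_errno.
 */
int
try_open_rdwr(const char *path, int *saved_errno)
{
    int fd = open(path, O_RDWR);

    if (fd < 0)
    {
        /* Capture errno right away, before another call can overwrite it. */
        *saved_errno = errno;
        return -1;
    }

    *saved_errno = 0;
    return fd;
}
```

Called with a path whose parent directory does not exist, this should return -1 with ENOENT stored in saved_errno; strerror() on that value gives the text that %m would have printed. On success the caller is responsible for close()-ing the returned descriptor.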
I am not very skilled with the gdb command line. The following output hopefully contains the errordata[0] value:

(gdb) p errordata[0]
$1 = {elevel = 22, output_to_server = 1 '\001', output_to_client = 0 '\000',
  show_funcname = 0 '\000', hide_stmt = 0 '\000',
  filename = 0x757fc5 "xlog.c", lineno = 3762,
  funcname = 0x75ed70 <__func__.18668> "UpdateControlFile",
  domain = 0x82a435 "postgres-9.3", context_domain = 0x0,
  sqlerrcode = 16908805,
  message = 0x1f0af50 "could not open control file \"global/pg_control\": No such file or directory",
  detail = 0x0, detail_log = 0x0, hint = 0x0, context = 0x0,
  schema_name = 0x0, table_name = 0x0, column_name = 0x0,
  datatype_name = 0x0, constraint_name = 0x0,
  cursorpos = 0, internalpos = 0, internalquery = 0x0,
  saved_errno = 2}

I can provide you with the core dump privately. We can also discuss exploring the core dump via IRC ( freenode/#postgresql/jmlich ).
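The sqlerrcode field in the dump is a packed SQLSTATE rather than a readable code. As a rough sketch, assuming the packing is the one PostgreSQL's elog.h uses (five characters, six bits each, stored as an offset from '0', first character in the lowest bits), it can be unpacked as follows. unpack_sqlstate is a hypothetical name chosen for this illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper: unpack a packed SQLSTATE integer into its
 * five-character form. Assumes six bits per character, each stored as
 * (character - '0'), with the first character in the lowest bits.
 * `out` must have room for 6 bytes (5 characters plus the NUL).
 */
void
unpack_sqlstate(int code, char out[6])
{
    for (size_t i = 0; i < 5; i++)
    {
        out[i] = (char) ((code & 0x3F) + '0');  /* low six bits -> character */
        code >>= 6;                             /* move to the next character */
    }
    out[5] = '\0';
}
```

Fed the value from the dump, 16908805, this yields 58P01, which the PostgreSQL error code appendix lists as undefined_file. That agrees with saved_errno = 2, which is ENOENT on Linux, and with the "No such file or directory" text in the message field.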
(In reply to Jozef Mlich from comment #15)
> message = 0x1f0af50 "could not open control file \"global/pg_control\": No
> such file or directory",

OK, that's what we needed to know right there. So the question becomes, what happened to pg_control? Postgres certainly didn't remove that file. I'm still thinking this is user error or a filesystem problem.
Since I cannot reproduce this bug, I am closing it. Feel free to reopen it if you want to explore the core dump further.
*** Bug 1167105 has been marked as a duplicate of this bug. ***