Bug 1435832
Summary: | PostgreSQL DB Restore: unexpected data beyond EOF | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | javishi |
Component: | nfs | Assignee: | Humble Chirammal <hchiramm> |
Status: | CLOSED EOL | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | 3.8 | CC: | bugs, fuzz, kostrzewa, rgoncalves, tcarlin |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1512691 | Environment: | |
Last Closed: | 2017-11-07 10:40:31 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
javishi
2017-03-24 23:15:16 UTC
My thoughts about it being a container networking issue were incorrect. I now believe this is truly a GlusterFS + PostgreSQL issue. I confirmed that restores occasionally fail on the PostgreSQL container itself, which rules out the container networking interface (CNI). Restores also occasionally succeed from separate restore containers, which further rules out CNI. The "unexpected data beyond EOF" error occurs intermittently, with roughly a 30% success rate regardless of how the restore is attempted. Also, the failing table is actually 244 MB. All other tables, which do restore successfully, are under 10 MB.

Just recently I ran into the same error using GlusterFS 3.10.5 and 3.12.1 (from the SIG repositories). I created a cluster of 3 VMs running CentOS 7.2 (uname below) and spun up a PostgreSQL 9.6.2 Docker (v17.06) container. The GlusterFS volume was bind-mounted into the container at the default location where PostgreSQL stores its data (/var/lib/postgresql/data). While filling the database with data, at some point I got this "unexpected data beyond EOF" error.

A similar issue was discussed on the PostgreSQL mailing list, but for PostgreSQL on NFS. In fact, such an issue was already reported and fixed in RHEL 5 (https://bugzilla.redhat.com/show_bug.cgi?id=672981). I also tried the latest PostgreSQL Docker image (i.e. 9.6.5), unfortunately with the same results.

uname -a:
Linux node-10-9-4-109 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

I'm having the same problem here. I installed PostgreSQL 9.6.5 on kernel 3.10.0-693.2.2.el7.x86_64 and ran pgbench with a scale factor of 1000, i.e. 10,000,000 accounts. The first run used the OS filesystem, and everything went well. After that I stopped PostgreSQL, created a GlusterFS replicated volume (3 replicas), and copied the PostgreSQL data directory into the GlusterFS volume. The volume is mounted as type fuse.glusterfs.
10.112.76.37:gv0 on /mnt/batatas type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

After that I tried to run pgbench. With a concurrency level of one, things work fine. However, with a concurrency level > 1, this error occurs:

client 1 aborted in state 9: ERROR: unexpected data beyond EOF in block 316 of relation base/16384/16516
HINT: This has been seen to occur with buggy kernels; consider updating your system.

I'm using GlusterFS 3.12.2. Any idea?

This bug is getting closed because the 3.8 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.
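
For anyone trying to narrow this down outside of PostgreSQL: the error quoted above is raised while PostgreSQL extends a relation file, and the comments in its buffer manager source associate it with an lseek(SEEK_END) result that does not account for a recent write. The following is only a minimal sketch of a probe for that symptom, not something taken from this report. The function name `probe_stale_eof` and its parameters are hypothetical, and it uses a single process with two file descriptors, so a clean result does not rule out the concurrent case described in the comments.

```python
import os

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def probe_stale_eof(path, blocks=64, block_size=BLOCK_SIZE):
    """Append blocks through one descriptor and check the size seen by another.

    Hypothetical helper: returns a list of
    (block_number, expected_size, reported_size) tuples, one for every append
    after which lseek(SEEK_END) on the second descriptor reported a size that
    did not match the bytes written so far.
    """
    mismatches = []
    # The writer truncates the file so the expected size starts from zero.
    writer = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    reader = os.open(path, os.O_RDONLY)
    try:
        for block in range(blocks):
            os.write(writer, b"\x01" * block_size)
            expected = (block + 1) * block_size
            # Ask for the end-of-file offset through the other descriptor,
            # similar to how a process sizes a file before extending it.
            reported = os.lseek(reader, 0, os.SEEK_END)
            if reported != expected:
                mismatches.append((block, expected, reported))
    finally:
        os.close(reader)
        os.close(writer)
    return mismatches
```

Calling `probe_stale_eof("/mnt/batatas/probe.dat")` against a scratch file on the FUSE mount and then against a local filesystem gives a rough comparison. An empty list means every reported size matched, and any returned tuple shows a size that lagged behind or ran ahead of the writes.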