Bug 1754816 - Wal sender process occasionally fails with "Bad address" error
Summary: Wal sender process occasionally fails with "Bad address" error
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: postgresql
Version: 7.5
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Patrik Novotný
QA Contact: RHEL CS Apps Subsystem QE
URL:
Whiteboard:
Depends On:
Blocks: 1795017 1795170 1795171
 
Reported: 2019-09-24 07:05 UTC by Yuki Okada
Modified: 2023-09-07 20:39 UTC
CC List: 9 users

Fixed In Version: postgresql-9.2.24-3.el7
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1795017 1795170 1795171 (view as bug list)
Environment:
Last Closed: 2020-07-01 11:53:12 UTC
Target Upstream Version:
Embargoed:



Description Yuki Okada 2019-09-24 07:05:59 UTC
Description of problem:
- 2 node streaming replication environment
- Replication sometimes stops unexpectedly, and the wal sender and wal receiver processes stop at that time.
- The following messages appear in the journal log. The wal sender process fails with a "Bad address" (EFAULT) error.

  ## journal log on primary ##
  postgres[19813]: [3-1] 2019-06-14 10:28:10.012 JST replication_user  [unknown] 19813[2]LOG:  replication connection authorized: user=replication_user
  postgres[19813]: [4-1] 2019-06-14 10:28:10.014 JST replication_user  [unknown] 19813[3]LOG:  could not send data to client: Bad address
  postgres[19813]: [5-1] 2019-06-14 10:28:10.014 JST replication_user  [unknown] 19813[4]LOG:  disconnection: session time: 0:00:00.009 user=replication_user database= host=<hostname> port=<port number>
  
  ## journal log on standby ##
  postgres[20686]: [2-1] 2019-06-14 10:28:47.122 JST    20686[1]LOG:  streaming replication successfully connected to primary
  postgres[20686]: [3-1] 2019-06-14 10:28:47.125 JST    20686[2]FATAL:  could not receive data from WAL stream: SSL SYSCALL error: EOF detected
  postgres[20686]: [3-2]

- strace of the wal sender process shows that a sendto syscall fails with EFAULT, and that the preceding sendto syscall fails with EAGAIN (Resource temporarily unavailable). A generic retry sketch follows the trace.

  29986 12:04:13.246208 sendto(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>"..., 127400, 0, NULL, 0) = 33491 <0.000029>
  29986 12:04:13.246309 sendto(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>"..., 93909, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
  29986 12:04:13.246398 read(3<pipe:[11734672]>, 0x7fff040e9960, 16) = -1 EAGAIN (Resource temporarily unavailable) <0.000018>
  29986 12:04:13.246483 poll([{fd=11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, events=POLLIN|POLLOUT}, {fd=3<pipe:[11734672]>, events=POLLIN}, {fd=7<pipe:[11734403]>, events=POLLIN}], 3, 6001) = 1 ([{fd=11, revents=POLLIN|POLLOUT}]) <0.031955>
  29986 12:04:13.289521 read(7<pipe:[11734403]>, 0x7fff040e99a7, 1) = -1 EAGAIN (Resource temporarily unavailable) <0.010923>
  29986 12:04:13.318516 recvfrom(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>", 5, 0, NULL, NULL) = 5 <0.011928>
  29986 12:04:13.341517 recvfrom(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>", 80, 0, NULL, NULL) = 80 <0.011923>
  29986 12:04:13.364516 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK) <0.017896>
  29986 12:04:13.383520 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_SETFL, O_RDWR) = 0 <0.000030>
  29986 12:04:13.383735 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_GETFL) = 0x2 (flags O_RDWR) <0.000027>
  29986 12:04:13.383839 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000025>
  29986 12:04:13.383938 recvfrom(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>", 5, 0, NULL, NULL) = 5 <0.000028>
  29986 12:04:13.384063 recvfrom(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>", 80, 0, NULL, NULL) = 80 <0.000038>
  29986 12:04:13.384211 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK) <0.000025>
  29986 12:04:13.384306 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_SETFL, O_RDWR) = 0 <0.000018>
  29986 12:04:13.384382 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_GETFL) = 0x2 (flags O_RDWR) <0.000017>
  29986 12:04:13.384449 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000022>
  29986 12:04:13.384546 recvfrom(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>", 5, 0, NULL, NULL) = 5 <0.000029>
  29986 12:04:13.384655 recvfrom(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, "<snip>", 80, 0, NULL, NULL) = 80 <0.000042>
  29986 12:04:13.384784 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK) <0.000051>
  29986 12:04:13.384927 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_SETFL, O_RDWR) = 0 <0.000027>
  29986 12:04:13.385030 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_GETFL) = 0x2 (flags O_RDWR) <0.000026>
  29986 12:04:13.385141 fcntl(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000026>
  29986 12:04:13.385239 recvfrom(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, 0x27315a3, 5, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000026>
  29986 12:04:13.385344 sendto(11<TCP:[10.72.37.131:5432->10.72.37.97:63712]>, 0x82d3, 93909, 0, NULL, 0) = -1 EFAULT (Bad address) <0.000038>
  29986 12:04:13.385627 sendto(10<UNIX:[11734408->1425]>, "<134>Jul  1 12:04:13 postgres[29986]: [4-1] 2019-07-01 12:04:13.385 JST replication_user  [unknown] 29986[3]LOG:  could not send data to client: Bad address", 156, MSG_NOSIGNAL, NULL, 0) = 156 <0.000048>
  29986 12:04:13.502815 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=29955, si_uid=26} ---
  29986 12:04:13.502877 rt_sigreturn({mask=[]}) = 156 <0.000020>
  29986 12:04:13.503053 sendto(10<UNIX:[11734408->1425]>, "<134>Jul  1 12:04:13 postgres[29986]: [5-1] 2019-07-01 12:04:13.503 JST replication_user  [unknown] 29986[4]LOG:  disconnection: session time: 0:01:02.522 user=replication_user database= host=10.72.37.97 port=63712", 214, MSG_NOSIGNAL, NULL, 0) = 214 <0.000025>
  29986 12:04:13.504626 exit_group(0)     = ?
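
  For reference, the usual pattern when a non-blocking send() returns EAGAIN is to wait for the socket to become writable and then retry with the same buffer and offset. The generic sketch below only illustrates that pattern; it is not PostgreSQL source, and send_all_nonblocking() is a hypothetical name used for illustration.

  #include <errno.h>
  #include <poll.h>
  #include <stddef.h>
  #include <sys/socket.h>
  #include <sys/types.h>

  /* Send an entire buffer on a non-blocking socket, retrying on EAGAIN/EINTR. */
  static int
  send_all_nonblocking(int fd, const char *buf, size_t len)
  {
          size_t  off = 0;

          while (off < len)
          {
                  ssize_t n = send(fd, buf + off, len - off, 0);

                  if (n > 0)
                  {
                          off += (size_t) n;      /* continue with the same buffer */
                          continue;
                  }
                  if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
                  {
                          struct pollfd pfd = { fd, POLLOUT, 0 };

                          (void) poll(&pfd, 1, -1);       /* wait until writable, then retry */
                          continue;
                  }
                  return -1;      /* genuine failure */
          }
          return 0;
  }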

- The backtrace of the wal sender process when the issue reproduces is as follows.

  Breakpoint 1, my_sock_write (h=0xda0a40, buf=<optimized out>, size=15717) at be-secure.c:523
  523 if (res <= 0)
  #0  my_sock_write (h=0xda0a40, buf=<optimized out>, size=15717) at be-secure.c:523
  #1  0x00007f403071113b in BIO_write (b=0xda0a40, in=0x1b443, inl=15717) at bio_lib.c:243 
  #2  0x00007f4030a82692 in ssl3_write_pending (s=s@entry=0xda5620, type=type@entry=23, buf=buf@entry=0x7f40313c8040 "d", len=<optimized out>) at s3_pkt.c:1111
  #3  0x00007f4030a82e63 in ssl3_write_bytes (s=0xda5620, type=23, buf_=0x7f40313c8040, len=131102) at s3_pkt.c:683
  #4  0x00000000005ab4e2 in secure_write (port=0xda3410, ptr=ptr@entry=0x7f40313c8040, len=len@entry=131102) at be-secure.c:390
  #5  0x00000000005b48fb in internal_flush () at pqcomm.c:1358
  #6  0x00000000005b5a6a in pq_flush_if_writable () at pqcomm.c:1436
  #7  0x000000000062bfac in WalSndLoop () at walsender.c:779
  #8  WalSenderMain () at walsender.c:195
  #9  0x000000000065f146 in PostgresMain (argc=1, argv=argv@entry=0xd57fe0, dbname=0xd57f00 "", username=0xd57ed0 "replication_user") at postgres.c:3772
  #10 0x0000000000619996 in BackendRun (port=0xda3410) at postmaster.c:3830
  #11 BackendStartup (port=0xda3410) at postmaster.c:3501
  #12 ServerLoop () at postmaster.c:1508
  #13 0x000000000061a767 in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0xd57210) at postmaster.c:1237
  #14 0x000000000045bd2c in main (argc=5, argv=0xd57210) at main.c:237

- In ssl3_write_pending(), wb->buf or wb->offset might be invalid, making &(wb->buf[wb->offset]) a bad address.

   1092 int ssl3_write_pending(SSL *s, int type, const unsigned char *buf,
   1093                        unsigned int len)
   1094 {
   [...]
   1109         if (s->wbio != NULL) {
   1110             s->rwstate = SSL_WRITING;
   1111             i = BIO_write(s->wbio,
   1112                           (char *)&(wb->buf[wb->offset]),       <<<------- This might be invalid.
   1113                           (unsigned int)wb->left);

- A similar issue has been reported. In that report, calling BIO_write() in ssl3_write_pending() fails with EFAULT because EAGAIN occurred just before the EFAULT.

  https://mta.openssl.org/pipermail/openssl-dev/2015-February/000801.html

- In my_sock_write(), the retry flag is set only when EINTR occurs; EAGAIN is not handled.

  static int
  my_sock_write(BIO *h, const char *buf, int size)
  {
          int                     res = 0;

          res = send(BIO_get_fd(h, NULL), buf, size, 0);
          BIO_clear_retry_flags(h);
          if (res <= 0)
          {
                  if (errno == EINTR)
                  {
                          BIO_set_retry_write(h);     <<<------- retry flag is set only for EINTR
                  }
          }

          return res;   <<<------- an EAGAIN failure is returned without the retry flag
  }

- In the PostgreSQL 9.6 code, EAGAIN is also handled and the retry flag is set.

  https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/libpq/be-secure-openssl.c#L736

I suspect this issue could be resolved by adding code that handles the EAGAIN error.
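
A minimal sketch of what such a change might look like, applying the 9.6-style errno check (linked above) to the 9.2 my_sock_write() quoted above; this is an untested illustration, not the actual patch:

  static int
  my_sock_write(BIO *h, const char *buf, int size)
  {
          int                     res = 0;

          res = send(BIO_get_fd(h, NULL), buf, size, 0);
          BIO_clear_retry_flags(h);
          if (res <= 0)
          {
                  /* also treat EAGAIN/EWOULDBLOCK as retryable, not just EINTR */
                  if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
                  {
                          BIO_set_retry_write(h);
                  }
          }

          return res;
  }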

Version-Release number of selected component (if applicable):
postgresql-server-9.2.24-1.el7_5.x86_64

How reproducible:
Occasionally in customer's environment

Steps to Reproduce:
Not able to reproduce from scratch

Actual results:
Wal sender and wal receiver stop suddenly.

Expected results:
Wal sender and wal receiver keep working.

Additional info:
- Once this issue happens, manual re-configuration of streaming replication is needed to restore the system (just restarting the primary and standby postgres did not work). This has a huge impact on a production system.
- Tried increasing the TCP buffer sizes as below in order to avoid the EAGAIN error, but it did not help.

  net.core.rmem_max = 16777216
  net.ipv4.tcp_rmem = 4096 87380 16777216
  net.core.wmem_max = 16777216
  net.ipv4.tcp_wmem = 4096 87380 16777216

