Bug 1680469 - tar: does not check for NULL error return from xgetcwd
Summary: tar: does not check for NULL error return from xgetcwd
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: tar
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Petr Kubat
QA Contact: RHEL CS Apps Subsystem QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-02-25 06:36 UTC by Nag Pavan Chilakam
Modified: 2020-06-15 05:54 UTC (History)
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1837871
Environment:
Last Closed: 2020-05-20 08:08:25 UTC
Target Upstream Version:
Embargoed:


Attachments
coredump (492.00 KB, application/x-core), 2019-02-28 13:52 UTC, Nag Pavan Chilakam


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1534635 | 1 | None | None | None | 2021-06-10 14:12:12 UTC

Internal Links: 1534635

Description Nag Pavan Chilakam 2019-02-25 06:36:33 UTC
Description of problem:
====================
I have been running system (non-functional) testing for gluster, which involves multiple IO patterns from different clients.
One of the IO patterns was a Linux kernel untar.
However, the untar started failing after some iterations, as shown below:

[Sat Feb 23 05:11:20 2019] tar[5113]: segfault at 0 ip 00007f8c7985a901 sp 00007ffee8982248 error 4 in libc-2.17.so[7f8c796ec000+1c2000]
[Sat Feb 23 05:11:44 2019] tar[5242]: segfault at 0 ip 00007f2f894bd901 sp 00007ffc237f0078 error 4 in libc-2.17.so[7f2f8934f000+1c2000]
[Sat Feb 23 05:12:12 2019] tar[5589]: segfault at 0 ip 00007f88e6297901 sp 00007ffeba853af8 error 4 in libc-2.17.so[7f88e6129000+1c2000]
[Sat Feb 23 05:12:35 2019] tar[5704]: segfault at 0 ip 00007f3f04bd1901 sp 00007ffc7771f2f8 error 4 in libc-2.17.so[7f3f04a63000+1c2000]
[Sat Feb 23 05:12:55 2019] tar[6023]: segfault at 0 ip 00007ff15e314901 sp 00007ffd9129b1e8 error 4 in libc-2.17.so[7ff15e1a6000+1c2000]
[Sat Feb 23 05:13:20 2019] tar[6368]: segfault at 0 ip 00007f2de0e1a901 sp 00007ffe6371d848 error 4 in libc-2.17.so[7f2de0cac000+1c2000]
[Sat Feb 23 05:13:37 2019] tar[6683]: segfault at 0 ip 00007f5034ce0901 sp 00007ffd22577b18 error 4 in libc-2.17.so[7f5034b72000+1c2000]
[Sat Feb 23 05:14:03 2019] tar[7103]: segfault at 0 ip 00007fa991672901 sp 00007fff4eb384a8 error 4 in libc-2.17.so[7fa991504000+1c2000]
[Sat Feb 23 05:14:22 2019] tar[7491]: segfault at 0 ip 00007f7cadb00901 sp 00007ffe7d970468 error 4 in libc-2.17.so[7f7cad992000+1c2000]
[Sat Feb 23 05:14:45 2019] tar[7946]: segfault at 0 ip 00007f0ffcf29901 sp 00007ffd8b8f4878 error 4 in libc-2.17.so[7f0ffcdbb000+1c2000]
[Sat Feb 23 05:15:01 2019] tar[8158]: segfault at 0 ip 00007fcc4e40e901 sp 00007ffd400c1c88 error 4 in libc-2.17.so[7fcc4e2a0000+1c2000]
[Sat Feb 23 05:15:17 2019] tar[8321]: segfault at 0 ip 00007f2975c8a901 sp 00007ffed73a7c18 error 4 in libc-2.17.so[7f2975b1c000+1c2000]
[Sat Feb 23 05:15:42 2019] tar[8803]: segfault at 0 ip 00007fce66dc5901 sp 00007ffd42f0c328 error 4 in libc-2.17.so[7fce66c57000+1c2000]
[Sat Feb 23 05:16:00 2019] tar[8893]: segfault at 0 ip 00007f6613683901 sp 00007ffe4a8c69f8 error 4 in libc-2.17.so[7f6613515000+1c2000]
[Sat Feb 23 05:16:26 2019] tar[9118]: segfault at 0 ip 00007f7258a6d901 sp 00007ffdcd2ace48 error 4 in libc-2.17.so[7f72588ff000+1c2000]
[Sat Feb 23 05:16:55 2019] tar[9213]: segfault at 0 ip 00007f86e7b8b901 sp 00007ffe441bf958 error 4 in libc-2.17.so[7f86e7a1d000+1c2000]
[Sat Feb 23 05:17:14 2019] tar[9292]: segfault at 0 ip 00007f9c77011901 sp 00007ffdf84fab78 error 4 in libc-2.17.so[7f9c76ea3000+1c2000]
[Sat Feb 23 05:17:38 2019] tar[9500]: segfault at 0 ip 00007f7add332901 sp 00007ffc3e20d938 error 4 in libc-2.17.so[7f7add1c4000+1c2000]
[Sat Feb 23 05:17:56 2019] tar[9700]: segfault at 0 ip 00007f5ab3d3f901 sp 00007ffea2e95d08 error 4 in libc-2.17.so[7f5ab3bd1000+1c2000]
[Sat Feb 23 05:18:23 2019] tar[10280]: segfault at 0 ip 00007f97ee1ad901 sp 00007ffe7703d0d8 error 4 in libc-2.17.so[7f97ee03f000+1c2000]
[Sat Feb 23 05:18:41 2019] tar[10627]: segfault at 0 ip 00007fd47798e901 sp 00007ffca7469a88 error 4 in libc-2.17.so[7fd477820000+1c2000]
[Sat Feb 23 05:19:07 2019] tar[11040]: segfault at 0 ip 00007f2a7e9d7901 sp 00007ffd9b1e2108 error 4 in libc-2.17.so[7f2a7e869000+1c2000]
[Sat Feb 23 05:19:38 2019] tar[11294]: segfault at 0 ip 00007f9bc58e5901 sp 00007fff518c1968 error 4 in libc-2.17.so[7f9bc5777000+1c2000]
[Sat Feb 23 05:19:57 2019] tar[11428]: segfault at 0 ip 00007f4eb6b3a901 sp 00007ffe875f35f8 error 4 in libc-2.17.so[7f4eb69cc000+1c2000]
[Sat Feb 23 05:20:13 2019] tar[11758]: segfault at 0 ip 00007f002fe49901 sp 00007fff477e48a8 error 4 in libc-2.17.so[7f002fcdb000+1c2000]
[Sat Feb 23 05:20:31 2019] tar[12027]: segfault at 0 ip 00007f2b630f7901 sp 00007ffe3d1f85c8 error 4 in libc-2.17.so[7f2b62f89000+1c2000]
[Sat Feb 23 05:20:50 2019] tar[12117]: segfault at 0 ip 00007f39fb3d5901 sp 00007ffec9ec06f8 error 4 in libc-2.17.so[7f39fb267000+1c2000]
[Sat Feb 23 05:21:09 2019] tar[12325]: segfault at 0 ip 00007f267efef901 sp 00007fff5b0b3b28 error 4 in libc-2.17.so[7f267ee81000+1c2000]
[Sat Feb 23 05:21:29 2019] tar[12565]: segfault at 0 ip 00007f5feaf08901 sp 00007ffde8471b68 error 4 in libc-2.17.so[7f5fead9a000+1c2000]
[Sat Feb 23 05:21:49 2019] tar[12741]: segfault at 0 ip 00007f9f27ebb901 sp 00007ffeeaa95fd8 error 4 in libc-2.17.so[7f9f27d4d000+1c2000]
[Sat Feb 23 05:22:05 2019] tar[12882]: segfault at 0 ip 00007f08ecd4a901 sp 00007ffc0fcd8e78 error 4 in libc-2.17.so[7f08ecbdc000+1c2000]
[Sat Feb 23 05:22:27 2019] tar[13112]: segfault at 0 ip 00007fe924977901 sp 00007ffd5c5eb7e8 error 4 in libc-2.17.so[7fe924809000+1c2000]
[Sat Feb 23 05:22:49 2019] tar[13200]: segfault at 0 ip 00007f1f35a51901 sp 00007ffcdec60f18 error 4 in libc-2.17.so[7f1f358e3000+1c2000]
[Mon Feb 25 08:04:26 2019] sched: RT throttling activated
[root@dhcp35-64 IOs]# 
[root@dhcp35-64 IOs]# 
[root@dhcp35-64 IOs]# ls /


dir.17/linux-4.20.8/virt/kvm/coalesced_mmio.c
dir.17/linux-4.20.8/virt/lib/
dir.17/linux-4.20.8/virt/lib/Makefile
dir.17/linux-4.20.8/virt/lib/Kconfig
dir.17/linux-4.20.8/virt/lib/irqbypass.c
dir.17/linux-4.20.8/virt/Makefile
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault
Segmentation fault                                                    




Version-Release number of selected component (if applicable):
=================
[root@dhcp35-64 glusterfs]# rpm -qa|egrep "libc|kernel|gluster"
libcap-2.22-9.el7.x86_64
glusterfs-3.12.2-43.el7.x86_64
glibc-common-2.17-260.el7_6.3.x86_64
kernel-tools-libs-3.10.0-957.5.1.el7.x86_64
kernel-3.10.0-957.5.1.el7.x86_64
glusterfs-libs-3.12.2-43.el7.x86_64
glusterfs-client-xlators-3.12.2-43.el7.x86_64
kernel-3.10.0-957.el7.x86_64
libcroco-0.6.12-4.el7.x86_64
kernel-3.10.0-862.14.4.el7.x86_64
libcap-ng-0.7.5-4.el7.x86_64
libcurl-7.29.0-51.el7.x86_64
glibc-2.17-260.el7_6.3.x86_64
kernel-tools-3.10.0-957.5.1.el7.x86_64
glusterfs-fuse-3.12.2-43.el7.x86_64
libcom_err-1.42.9-13.el7.x86_64
[root@dhcp35-64 glusterfs]# cat /etc/red*
Red Hat Enterprise Linux Server release 7.6 (Maipo)


How reproducible:
================
hit it once

Steps to Reproduce:
=====================
1. Created a single 3x3 volume on a 4-node setup (brickmux disabled, all other settings at their defaults).
2. Mounted the volume on 8 clients and triggered the IOs below.
IOs:
1) Linux untar from all mounts. Note that on 4 clients this was run as a non-root user, after enabling access through ACLs.
2) Collected resource consumption data and appended it to individual files on the mount point.
3) Continuous lookups on all clients.
4) Creation of the same deep directory path from all 8 clients in parallel.
-- Kept the IOs going for about 2 days, and then, as a random health check: --
5) After 2 days, enabled quota and uss.
6) Rebooted one node.
7) Triggered ls and ls -l on one of the directory paths and saw the above problem.




Actual results:
==============
saw that on client 10.70.35.64, linux untar failed with segfault

Comment 7 Florian Weimer 2019-02-25 16:11:00 UTC
Without a coredump, there doesn't seem to be enough data to investigate this.

The crash at this address:

[Sat Feb 23 05:22:49 2019] tar[13200]: segfault at 0 ip 00007f1f35a51901 sp 00007ffcdec60f18 error 4 in libc-2.17.so[7f1f358e3000+1c2000]

is at the movdqu instruction here:

Dump of assembler code for function __strlen_sse2_pminub:
   0x000000000016e8f0 <+0>:	xor    %rax,%rax
   0x000000000016e8f3 <+3>:	mov    %edi,%ecx
   0x000000000016e8f5 <+5>:	and    $0x3f,%ecx
   0x000000000016e8f8 <+8>:	pxor   %xmm0,%xmm0
   0x000000000016e8fc <+12>:	cmp    $0x30,%ecx
   0x000000000016e8ff <+15>:	ja     0x16e91e <__strlen_sse2_pminub+46>
   0x000000000016e901 <+17>:	movdqu (%rdi),%xmm1

This means that something called strlen with a NULL argument.  That is still not enough to debug this further.  We do not even know the responsible component yet.

Maybe Carlos can have a look, but I really don't see how we can move this forward.
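
The arithmetic behind that identification can be sketched as follows. The kernel log line gives the faulting instruction pointer ("ip") and the base and size of the libc mapping; subtracting the base from the instruction pointer yields the offset inside libc-2.17.so, which can then be compared with the disassembly. The helper names below are illustrative only and not part of any tool, and the sketch assumes the mapping base corresponds to file offset 0, which the match with the disassembly above suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: offset of the faulting instruction inside the
   mapped object, given the "ip" value and the mapping base taken from
   the kernel's segfault line. */
uint64_t fault_offset(uint64_t ip, uint64_t map_base)
{
  return ip - map_base;
}

/* Hypothetical helper: true if ip lies inside
   [map_base, map_base + map_size). */
bool ip_in_mapping(uint64_t ip, uint64_t map_base, uint64_t map_size)
{
  return ip >= map_base && ip - map_base < map_size;
}
```

For the log line quoted above, fault_offset(0x00007f1f35a51901, 0x7f1f358e3000) is 0x16e901, which is the movdqu at __strlen_sse2_pminub+17 in the listing.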

Comment 8 Carlos O'Donell 2019-02-25 21:09:29 UTC
(In reply to Florian Weimer from comment #7)
> This means that something called strlen with a NULL argument.  That is still
> not enough to debug this further.  We do not even know the responsible
> component yet.
> 
> Maybe Carlos can have a look, but I really don't see how we can move this
> forward.

We absolutely need a coredump.

It's vital to determining if this is a tar issue with invalid input, or a glibc issue with strlen.

Comment 11 Nag Pavan Chilakam 2019-02-28 13:52:33 UTC
Created attachment 1539495 [details]
coredump

Attached is the coredump for the segfault

Comment 12 Florian Weimer 2019-02-28 16:43:10 UTC
Thanks a lot for providing the coredump.  I think we are finally getting somewhere:

704	  if (dir[0])
705	    {
706	      while (dir[0] == '.' && ISSLASH (dir[1]))
707		for (dir += 2;  ISSLASH (*dir);  dir++)
708		  continue;
709	      if (! dir[dir[0] == '.'])
710		return wd_count - 1;
711	    }
712	
713	  wd[wd_count].name = dir;
714	  /* if the given name is an absolute path, then use that path
715	     to represent this working directory; otherwise, construct
716	     a path based on the previous -C option's absolute path */
717	  if (IS_ABSOLUTE_FILE_NAME (wd[wd_count].name))
718	    wd[wd_count].abspath = xstrdup (wd[wd_count].name);
719	  else
720	    {
721	      namebuf_t nbuf = namebuf_create (wd[wd_count - 1].abspath);
722	      namebuf_add_dir (nbuf, wd[wd_count].name);
723	      wd[wd_count].abspath = namebuf_finish (nbuf);
724	    }

abspath is NULL on line 721.

(gdb) print wd[wd_count]
$5 = {name = 0x1153130 "dir.101", abspath = 0x0, fd = 2}
(gdb) print wd[wd_count - 1]
$6 = {name = 0x43da38 ".", abspath = 0x0, fd = -100}
(gdb) print wd_count
$9 = 1

abspath is assigned here in src/misc.c:

int
chdir_arg (char const *dir)
{
  if (wd_count == wd_alloc)
    {
      if (wd_alloc == 0)
	{
	  wd_alloc = 2;
	  wd = xmalloc (sizeof *wd * wd_alloc);
	}
      else
	wd = x2nrealloc (wd, &wd_alloc, sizeof *wd);

      if (! wd_count)
	{
	  wd[wd_count].name = ".";
	  wd[wd_count].abspath = xgetcwd ();
	  wd[wd_count].fd = AT_FDCWD;
	  wd_count++;
	}
    }
…

However, gnu/xgetcwd.c actually does this:

/* Return the current directory, newly allocated.
   Upon an out-of-memory error, call xalloc_die.
   Upon any other type of error, return NULL.  */

char *
xgetcwd (void)
{
  char *cwd = getcwd (NULL, 0);
  if (! cwd && errno == ENOMEM)
    xalloc_die ();
  return cwd;
}

So even though this is an x* style interface, the caller has to check for a NULL result, and chdir_arg does not do this.

The observed NULL value may have been the result of the glibc change for additional getcwd error reporting in bug 1534635.

In any case, this is a bug in tar and needs to be fixed there.
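
As an illustration of the missing check, here is a minimal, self-contained sketch of a caller that handles a NULL working directory before building a path from it. It is not the upstream tar fix and does not use tar's namebuf API: join_abspath and abspath_for are hypothetical names introduced for this example, and getcwd (NULL, 0) relies on the glibc behaviour of allocating the result, as in the xgetcwd code quoted above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: join BASE and NAME with a single '/'.
   Returns a newly allocated string, or NULL if either argument is
   NULL or allocation fails, instead of passing NULL to strlen. */
char *join_abspath(const char *base, const char *name)
{
  if (base == NULL || name == NULL)
    return NULL;

  size_t blen = strlen(base);
  int need_slash = (blen > 0 && base[blen - 1] != '/');
  size_t total = blen + need_slash + strlen(name) + 1;

  char *out = malloc(total);
  if (out == NULL)
    return NULL;
  snprintf(out, total, "%s%s%s", base, need_slash ? "/" : "", name);
  return out;
}

/* Hypothetical caller: resolve NAME against the current directory.
   getcwd can fail for reasons other than ENOMEM (for example ENOENT
   or EACCES), so the NULL result is reported rather than used. */
char *abspath_for(const char *name)
{
  char *cwd = getcwd(NULL, 0);  /* glibc allocates the buffer; NULL on error */
  if (cwd == NULL)
    {
      fprintf(stderr, "cannot determine working directory: %s\n",
              strerror(errno));
      return NULL;
    }

  char *result = join_abspath(cwd, name);
  free(cwd);
  return result;
}
```

With this shape, a failed getcwd surfaces as an error message and a NULL return that the caller can act on, rather than as a NULL abspath that later reaches strlen.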

Comment 13 Pavel Raiskup 2019-03-01 14:15:53 UTC
Forwarded upstream:
https://www.mail-archive.com/bug-tar@gnu.org/msg05768.html

Comment 15 Honza Horak 2020-05-18 13:40:02 UTC
This bug does not seem important enough to be fixed in RHEL-7 any more. I'd propose to check whether it is fixed in RHEL-8 and either move it there or close it WONTFIX.

Comment 16 Ondrej Dubaj 2020-05-20 07:57:59 UTC
This issue is not fixed in rhel-8. I will create a tracker for rhel-8 and close this issue for rhel-7.

Comment 17 Ondrej Dubaj 2020-05-20 08:08:25 UTC
Cloned bug for rhel-8 #1837871 

Closing this issue as WONTFIX

Comment 20 Nag Pavan Chilakam 2020-05-27 11:18:46 UTC
Download the kernel tarball from kernel.org to a glusterfs mount and run the loop below:

for i in {1..100}; do
  mkdir dir.$i
  cp ../linux-5.3.2.tar.xz dir.$i/linux-5.3.2.tar.xz
  echo "############ this is loop $i" >> untar.$i.log
  echo "############ this is loop $i" >> tarball.$i.log
  date >> untar.$i.log
  tar -xvf dir.$i/linux-5.3.2.tar.xz -C dir.$i/ 2>> untar.$i.log
  date >> untar.$i.log
  date >> tarball.$i.log
  tar -cvf dir.$i/lin.my.tar dir.$i/linux-5.3.2 2>> tarball.$i.log
  date >> tarball.$i.log
done

Comment 21 Ondrej Dubaj 2020-06-01 07:58:05 UTC
Thank you. Can you please provide more details about your GlusterFS setup? How many nodes are necessary to reproduce the issue, and how much disk memory do they need?

Comment 22 Nag Pavan Chilakam 2020-06-15 05:54:06 UTC
(In reply to Ondrej Dubaj from comment #21)
> Thank you. Can you please provide more details about your GlusterFS setup?
> How many nodes are necessary to reproduce the issue, and how much disk
> memory do they need?

GlusterFS server nodes: 3 should be enough, each with at least 16GB. 3x3 disks (LVs to be used as glusterfs bricks) of 100GB will be needed in each node, apart from the OS disk.
Client nodes: 4 clients, each with at least 4GB. About 40GB of disk per client is sufficient.

