Bug 1680469
Summary: | tar: does not check for NULL error return from xgetcwd | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Nag Pavan Chilakam <nchilaka>
Component: | tar | Assignee: | Petr Kubat <pkubat>
Status: | CLOSED WONTFIX | QA Contact: | RHEL CS Apps Subsystem QE <rhel-cs-apps-subsystem-qe>
Severity: | high | Priority: | unspecified
Version: | 7.6 | Target Milestone: | rc
Hardware: | Unspecified | OS: | Unspecified
Clones: | 1837871 | Type: | Bug
Last Closed: | 2020-05-20 08:08:25 UTC | |
CC: | ashankar, codonell, databases-maint, dj, fweimer, hhorak, mnewsome, nchilaka, odubaj, panovotn, pfrankli, praiskup, storage-qa-internal | |
Description
Nag Pavan Chilakam
2019-02-25 06:36:33 UTC
Without a coredump, there doesn't seem to be enough data to investigate this.

The crash at this address:

    [Sat Feb 23 05:22:49 2019] tar[13200]: segfault at 0 ip 00007f1f35a51901 sp 00007ffcdec60f18 error 4 in libc-2.17.so[7f1f358e3000+1c2000]

is at the movdqu instruction here:

    Dump of assembler code for function __strlen_sse2_pminub:
       0x000000000016e8f0 <+0>:     xor    %rax,%rax
       0x000000000016e8f3 <+3>:     mov    %edi,%ecx
       0x000000000016e8f5 <+5>:     and    $0x3f,%ecx
       0x000000000016e8f8 <+8>:     pxor   %xmm0,%xmm0
       0x000000000016e8fc <+12>:    cmp    $0x30,%ecx
       0x000000000016e8ff <+15>:    ja     0x16e91e <__strlen_sse2_pminub+46>
       0x000000000016e901 <+17>:    movdqu (%rdi),%xmm1

This means that something called strlen with a NULL argument. That is still not enough to debug this further. We do not even know the responsible component yet.

Maybe Carlos can have a look, but I really don't see how we can move this forward.

(In reply to Florian Weimer from comment #7)
> This means that something called strlen with a NULL argument. That is still
> not enough to debug this further. We do not even know the responsible
> component yet.
>
> Maybe Carlos can have a look, but I really don't see how we can move this
> forward.

We absolutely need a coredump. It's vital to determining if this is a tar issue with invalid input, or a glibc issue with strlen.

Created attachment 1539495: coredump

Attached is the coredump for the segfault
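The step from the kernel's segfault line to the movdqu instruction quoted earlier is plain address arithmetic: subtract the base of the libc mapping from the faulting instruction pointer and look up the result in the disassembly. The sketch below only illustrates that calculation with the values from the log line. The function names are made up for this example, and it assumes the reported mapping starts at offset 0 of the shared object, so that the difference can be compared directly with the addresses gdb prints.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Offset of an instruction pointer inside a mapped object:
   faulting ip minus the base address of the mapping. */
uint64_t fault_offset(uint64_t ip, uint64_t map_base)
{
    assert(ip >= map_base);
    return ip - map_base;
}

/* Non-zero if ip lies inside [map_base, map_base + map_len). */
int ip_in_mapping(uint64_t ip, uint64_t map_base, uint64_t map_len)
{
    return ip >= map_base && ip - map_base < map_len;
}

/* Prints the offset for the values taken from the log line:
   ip 00007f1f35a51901 in libc-2.17.so[7f1f358e3000+1c2000]. */
void print_fault_offset(void)
{
    uint64_t ip = 0x00007f1f35a51901ULL;
    uint64_t base = 0x00007f1f358e3000ULL;

    printf("offset = 0x%llx\n",
           (unsigned long long) fault_offset(ip, base));
}
```

Calling print_fault_offset() from a main function prints offset = 0x16e901, which is the address of the movdqu (%rdi),%xmm1 line at <+17>. Together with "segfault at 0" this is consistent with %rdi, the first argument of strlen, being NULL.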
Thanks a lot for providing the coredump. I think we are finally getting somewhere:

    704   if (dir[0])
    705     {
    706       while (dir[0] == '.' && ISSLASH (dir[1]))
    707         for (dir += 2; ISSLASH (*dir); dir++)
    708           continue;
    709       if (! dir[dir[0] == '.'])
    710         return wd_count - 1;
    711     }
    712
    713   wd[wd_count].name = dir;
    714   /* if the given name is an absolute path, then use that path
    715      to represent this working directory; otherwise, construct
    716      a path based on the previous -C option's absolute path */
    717   if (IS_ABSOLUTE_FILE_NAME (wd[wd_count].name))
    718     wd[wd_count].abspath = xstrdup (wd[wd_count].name);
    719   else
    720     {
    721       namebuf_t nbuf = namebuf_create (wd[wd_count - 1].abspath);
    722       namebuf_add_dir (nbuf, wd[wd_count].name);
    723       wd[wd_count].abspath = namebuf_finish (nbuf);
    724     }

abspath is NULL on line 721.

    (gdb) print wd[wd_count]
    $5 = {name = 0x1153130 "dir.101", abspath = 0x0, fd = 2}
    (gdb) print wd[wd_count - 1]
    $6 = {name = 0x43da38 ".", abspath = 0x0, fd = -100}
    (gdb) print wd_count
    $9 = 1

abspath is assigned here in src/misc.c:

    int
    chdir_arg (char const *dir)
    {
      if (wd_count == wd_alloc)
        {
          if (wd_alloc == 0)
            {
              wd_alloc = 2;
              wd = xmalloc (sizeof *wd * wd_alloc);
            }
          else
            wd = x2nrealloc (wd, &wd_alloc, sizeof *wd);

          if (! wd_count)
            {
              wd[wd_count].name = ".";
              wd[wd_count].abspath = xgetcwd ();
              wd[wd_count].fd = AT_FDCWD;
              wd_count++;
            }
        }
    …

However, gnu/xgetcwd.c actually does this:

    /* Return the current directory, newly allocated.
       Upon an out-of-memory error, call xalloc_die.
       Upon any other type of error, return NULL.  */

    char *
    xgetcwd (void)
    {
      char *cwd = getcwd (NULL, 0);
      if (! cwd && errno == ENOMEM)
        xalloc_die ();
      return cwd;
    }

So even though this is an x* style interface, the caller has to check for a NULL result, and chdir_arg does not do this.

The observed NULL value may have been the result of the glibc change for additional getcwd error reporting in bug 1534635. In any case, this is a bug in tar and needs to be fixed there.
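To illustrate the kind of check the analysis says is missing, here is a minimal, self-contained sketch. It is not the tar or gnulib code and not the upstream fix: join_abspath and abspath_from_cwd are hypothetical helpers invented for this example, standing in for the namebuf_create / namebuf_add_dir sequence and for the xgetcwd caller. It also assumes that getcwd (NULL, 0) allocates the result, which is what glibc does and what the quoted xgetcwd relies on.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: join BASE and NAME with a single slash.
   Returns a malloc'd string, or NULL with errno set to EINVAL when
   either argument is NULL, instead of passing NULL on to strlen. */
char *join_abspath(const char *base, const char *name)
{
    if (base == NULL || name == NULL) {
        errno = EINVAL;
        return NULL;
    }

    size_t blen = strlen(base);
    size_t nlen = strlen(name);
    size_t need_slash = (blen > 0 && base[blen - 1] != '/') ? 1 : 0;

    char *out = malloc(blen + need_slash + nlen + 1);
    if (out == NULL)
        return NULL;                /* errno set by malloc */

    memcpy(out, base, blen);
    if (need_slash)
        out[blen] = '/';
    memcpy(out + blen + need_slash, name, nlen + 1);  /* copies the NUL */
    assert(out[blen + need_slash + nlen] == '\0');
    return out;
}

/* Hypothetical caller: build an absolute path for NAME relative to
   the current directory. Unlike the chdir_arg excerpt, the getcwd
   result is checked before it is used. Assumes getcwd (NULL, 0)
   allocates its result (a glibc extension). */
char *abspath_from_cwd(const char *name)
{
    char *cwd = getcwd(NULL, 0);
    if (cwd == NULL)
        return NULL;                /* errno set by getcwd */

    char *result = join_abspath(cwd, name);
    free(cwd);
    return result;
}
```

With this shape, a failing getcwd surfaces as a NULL return with errno set, which the caller can report as an error. How tar itself should react (abort, warn, or fall back to a relative path) is a design decision this sketch does not try to settle.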
Forwarded upstream: https://www.mail-archive.com/bug-tar@gnu.org/msg05768.html

This bug does not seem important enough to be fixed in RHEL-7 any more. I'd propose to check whether it is fixed in RHEL-8 and either move it there or close it as WONTFIX.

This issue is not fixed in rhel-8. I will create a tracker for rhel-8 and close this issue for rhel-7.

Cloned bug for rhel-8 #1837871

Closing this issue as WONTFIX

Download the kernel tarball from kernel.org to a glusterfs mount and run the loop below:

    for i in {1..100}; do
      mkdir dir.$i
      cp ../linux-5.3.2.tar.xz dir.$i/linux-5.3.2.tar.xz
      echo "############ this is loop $i" >> untar.$i.log
      echo "############ this is loop $i" >> tarball.$i.log
      date >> untar.$i.log
      tar -xvf dir.$i/linux-5.3.2.tar.xz -C dir.$i/ 2>> untar.$i.log
      date >> untar.$i.log
      date >> tarball.$i.log
      tar -cvf dir.$i/lin.my.tar dir.$i/linux-5.3.2 2>> tarball.$i.log
      date >> tarball.$i.log
    done

Thank you. Can you please provide more details about your glusterFS? How many nodes are necessary to reproduce the issue, how much disk space do they need...

(In reply to Ondrej Dubaj from comment #21)
> Thank you. Can you please provide more details about your glusterFS? How
> many nodes are necessary to reproduce the issue, how much disk space do they
> need...

Glusterfs server nodes: 3 should be enough, each with at least 16GB. Will need 3x3 disks (LVs for use as glusterfs bricks) of 100GB in each node, apart from the OS disk.

Client nodes: 4 clients, each with at least 4GB. About 40GB of disk per client is sufficient.