Bug 1470157

Summary: symbolic links are broken
Product: [Fedora] Fedora
Component: supermin
Version: rawhide
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Richard W.M. Jones <rjones>
Assignee: Richard W.M. Jones <rjones>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: dhill, eguan, gansalmon, ichavero, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mathieu.tarral, mchehab, pabeni, ptoscano, rjones
Fixed In Version: supermin-5.1.18-1.fc26 supermin-5.1.18-1.fc25
Type: Bug
Last Closed: 2017-07-28 17:19:07 UTC
Bug Blocks: 910269, 1494859
Attachments:
log from libguestfs-test-tool (flags: none)

Description Richard W.M. Jones 2017-07-12 13:10:08 UTC
Description of problem:

Hard as it is to believe, the symbolic links appear to be broken
in kernel-4.13.0-0.rc0.git4.1.fc27.x86_64.  They are NOT broken in
the immediately preceding kernel
(kernel-4.13.0-0.rc0.git3.1.fc27.x86_64).

In supermin we run a small init process; the source is here:
https://github.com/libguestfs/supermin/blob/master/init/init.c#L482

With the broken kernel, when displaying the root directory, we
see strangely broken links:

supermin: debug: listing directory /
  205 d boot             040555 4096 0:0
  597 d proc             040555 4096 0:0
  198 - init             100755 6643 1000:1000
  174 d usr              040755 4096 1000:1000
    2 d ..               040755 4096 0:0
  594 d media            040755 4096 0:0
  632 d tmp              041777 4096 0:0
  206 d dev              040755 4096 0:0
  531 d home             040755 4096 0:0
  630 d srv              040755 4096 0:0
  534 l lib64            120777 9 0:0 -> \xe9\x11
 9044 d var              040755 4096 0:0
  604 l sbin             120777 8 0:0 -> \x5\x18
  599 d run              040755 4096 0:0
    2 d .                040755 4096 0:0
  200 l bin              120777 7 0:0 -> M\b
  598 d root             040550 4096 0:0
  631 d sys              040555 4096 0:0
   12 d etc              040755 4096 1000:1000
   11 d lost+found       040700 16384 0:0
  532 l lib              120777 7 0:0 -> \xe7\x11
  596 d opt              040755 4096 0:0
  595 d mnt              040755 4096 0:0
supermin: debug: listing directory /bin
/bin: No such file or directory
supermin: debug: listing directory /lib
/lib: No such file or directory
supermin: debug: listing directory /lib64
/lib64: No such file or directory
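
For readers decoding the listing above, here is a minimal Python sketch.
It assumes the columns are inode number, type letter, name, octal st_mode,
size in bytes and uid:gid. That layout is inferred from the output shown
here, not taken from init.c, and parse_entry is a hypothetical helper.

```python
import stat


def parse_entry(line: str) -> dict:
    """Split one listing line into named fields (assumed column layout)."""
    fields = line.split()
    inode, kind, name, mode, size, owner = fields[:6]
    uid, gid = owner.split(":")
    return {
        "inode": int(inode),
        "kind": kind,
        "name": name,
        "mode": int(mode, 8),  # the mode column is printed in octal
        "size": int(size),
        "uid": int(uid),
        "gid": int(gid),
        # Anything after "->" is the link target exactly as it was printed.
        "target": fields[7] if len(fields) > 7 and fields[6] == "->" else None,
    }


entry = parse_entry(r"  534 l lib64            120777 9 0:0 -> \xe9\x11")
print(stat.S_ISLNK(entry["mode"]))   # True: 0o120000 is the symlink file type
print(stat.filemode(entry["mode"]))  # lrwxrwxrwx
print(entry["size"])                 # 9
```

The inode metadata looks sane: the type bits say "symlink" and the sizes
(9, 8, 7, 7) are plausible lengths for short link targets. Only the target
bytes returned for the link are garbage.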


Version-Release number of selected component (if applicable):

Broken in:
kernel-4.13.0-0.rc0.git4.1.fc27.x86_64
Working in:
kernel-4.13.0-0.rc0.git3.1.fc27.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Run libguestfs-test-tool with the latest kernel installed.

Additional info:

Full output attached.

Comment 1 Richard W.M. Jones 2017-07-12 13:11:30 UTC
Created attachment 1296975
log from libguestfs-test-tool

Comment 2 Richard W.M. Jones 2017-07-12 13:26:23 UTC
Also broken in kernel-4.13.0-0.rc0.git5.1.fc27.x86_64

Comment 3 Richard W.M. Jones 2017-07-12 14:37:51 UTC
This is reproducible with the current upstream kernel, but not with
the v4.12 tag.  I am currently bisecting ...

Comment 4 Richard W.M. Jones 2017-07-12 16:12:33 UTC
407cd7fb83c0ebabb490190e673d8c71ee7df97e is the first bad commit
commit 407cd7fb83c0ebabb490190e673d8c71ee7df97e
Author: Tahsin Erdogan <tahsin>
Date:   Tue Jul 4 00:11:21 2017 -0400

    ext4: change fast symlink test to not rely on i_blocks
    
    ext4_inode_info->i_data is the storage area for 4 types of data:
    
      a) Extents data
      b) Inline data
      c) Block map
      d) Fast symlink data (symlink length < 60)
    
    Extents data case is positively identified by EXT4_INODE_EXTENTS flag.
    Inline data case is also obvious because of EXT4_INODE_INLINE_DATA
    flag.
    
    Distinguishing c) and d) however requires additional logic. This
    currently relies on i_blocks count. After subtracting external xattr
    block from i_blocks, if it is greater than 0 then we know that some
    data blocks exist, so there must be a block map.
    
    This logic got broken after ea_inode feature was added. That feature
    charges the data blocks of external xattr inodes to the referencing
    inode and so adds them to the i_blocks. To fix this, we could subtract
    ea_inode blocks by iterating through all xattr entries and then check
    whether remaining i_blocks count is zero. Besides being complicated,
    this won't change the fact that the current way of distinguishing
    between c) and d) is fragile.
    
    The alternative solution is to test whether i_size is less than 60 to
    determine fast symlink case. ext4_symlink() uses the same test to decide
    whether to store the symlink in i_data. There is one caveat to address
    before this can work though.
    
    If an inode's i_nlink is zero during eviction, its i_size is set to
    zero and its data is truncated. If system crashes before inode is removed
    from the orphan list, next boot orphan cleanup may find the inode with
    zero i_size. So, a symlink that had its data stored in a block may now
    appear to be a fast symlink. The solution used in this patch is to treat
    i_size = 0 as a non-fast symlink case. A zero sized symlink is not legal
    so the only time this can happen is the mentioned scenario. This is also
    logically correct because a i_size = 0 symlink has no data stored in
    i_data.
    
    Suggested-by: Andreas Dilger <adilger>
    Signed-off-by: Tahsin Erdogan <tahsin>
    Signed-off-by: Theodore Ts'o <tytso>
    Reviewed-by: Andreas Dilger <adilger>

:040000 040000 31b0eecd7314c483ed78346637ebb6a85c49741b 47a1dbadb0203d7f7d233224d47c5889ab22fc4f M	fs
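
The two tests described in the commit message can be modelled in a few
lines of Python. This is a toy sketch, not kernel code: SymlinkInode and
both function names are hypothetical, the inode is reduced to the fields
the commit message mentions, and i_blocks is assumed to count 512-byte
sectors.

```python
from dataclasses import dataclass

FAST_SYMLINK_MAX = 60  # size of i_data in bytes, per the commit message


@dataclass
class SymlinkInode:
    """Toy stand-in for the fields the two tests look at."""
    i_size: int            # length of the link target in bytes
    i_blocks: int          # sectors charged to the inode
    xattr_blocks: int = 0  # sectors used by an external xattr block, if any


def is_fast_symlink_old(inode: SymlinkInode) -> bool:
    # Old test: nothing left after subtracting the external xattr block
    # means there are no data blocks, so the target must be in i_data.
    return inode.i_blocks - inode.xattr_blocks == 0


def is_fast_symlink_new(inode: SymlinkInode) -> bool:
    # New test: decide on i_size alone; i_size == 0 is treated as non-fast.
    return 0 < inode.i_size < FAST_SYMLINK_MAX


cases = {
    "short target held in i_data": SymlinkInode(i_size=7, i_blocks=0),
    "long target held in a data block": SymlinkInode(i_size=100, i_blocks=8),
    "short target, blocks charged by an ea_inode": SymlinkInode(i_size=7, i_blocks=8),
    "zero-size orphan after truncation": SymlinkInode(i_size=0, i_blocks=0),
}
for label, inode in cases.items():
    print(f"{label}: old={is_fast_symlink_old(inode)} new={is_fast_symlink_new(inode)}")
```

The first two cases agree under both tests. The third is the ea_inode
breakage the commit sets out to fix, and the fourth is the orphan caveat
that motivates the explicit i_size = 0 check.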

Comment 5 Laura Abbott 2017-07-12 16:28:07 UTC
Thanks for the bisect! I haven't seen this reported on the ext4 mailing list so your best bet is to e-mail that list directly.

Comment 6 Richard W.M. Jones 2017-07-12 16:41:19 UTC
Actually I think I'm going to blame supermin for creating fast symlinks
incorrectly.  For the full explanation see:

https://www.redhat.com/archives/libguestfs/2017-July/msg00078.html

It's an open question whether the kernel wants to support "fast
symlinks stored slow" or whether that is considered to be an erroneous
filesystem.
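
One way to picture a "fast symlink stored slow" is the Python sketch below.
It assumes the classic block-map layout, where the 60-byte i_data area
holds 15 little-endian 32-bit block numbers, and it assumes that a reader
treating i_data as a link target stops at the first NUL byte. The block
number 0x11E9 and the target usr/lib64 are hypothetical values chosen for
illustration; whether this is exactly what happened in the image from the
description is an inference, and the linked post is the authoritative
explanation.

```python
import struct

N_SLOTS = 15  # 15 x 4 bytes = the 60-byte i_data area


def i_data_for_block_mapped(first_block: int) -> bytes:
    """i_data of an inode whose only data block is first_block."""
    slots = [first_block] + [0] * (N_SLOTS - 1)
    return struct.pack("<15I", *slots)


def i_data_for_fast_symlink(target: bytes) -> bytes:
    """i_data of a symlink whose target is stored inline."""
    return target.ljust(N_SLOTS * 4, b"\0")


def read_as_fast_symlink(i_data: bytes) -> bytes:
    """What comes back if i_data is read as a NUL-terminated target."""
    return i_data.split(b"\0", 1)[0]


# Target stored inline: reading i_data returns the target itself.
print(read_as_fast_symlink(i_data_for_fast_symlink(b"usr/lib64")))

# Short target stored in a data block: i_data holds a block number, so
# reading it as a string yields the low bytes of that number instead.
print(read_as_fast_symlink(i_data_for_block_mapped(0x11E9)))
```

Under these assumptions the second read returns the two bytes e9 11, which
has the same shape as the garbage targets in the directory listing.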

Comment 7 Richard W.M. Jones 2017-07-12 20:51:47 UTC
*** Bug 1470375 has been marked as a duplicate of this bug. ***

Comment 8 David Hill 2017-07-12 21:56:56 UTC
How can I tell supermin to use 4.12 kernel?

Comment 9 David Hill 2017-07-12 22:01:16 UTC
(In reply to David Hill from comment #8)
> How can I tell supermin to use 4.12 kernel?

 export SUPERMIN_KERNEL=/path/to/linux.git/arch/x86/boot/bzImage
 export SUPERMIN_MODULES=/tmp/kmods/lib/modules/3.xx.yy

Comment 10 David Hill 2017-07-12 22:02:53 UTC
I've tried using the previous 4.11.0-1 kernel and it worked as before.

Comment 11 Richard W.M. Jones 2017-07-13 07:41:52 UTC
(In reply to David Hill from comment #9)
> (In reply to David Hill from comment #8)
> > How can I tell supermin to use 4.12 kernel?
> 
>  export SUPERMIN_KERNEL=/path/to/linux.git/arch/x86/boot/bzImage
>  export SUPERMIN_MODULES=/tmp/kmods/lib/modules/3.xx.yy

export SUPERMIN_KERNEL=/boot/vmlinuz-4.11.6-300.fc26.x86_64
export SUPERMIN_MODULES=/lib/modules/4.11.6-300.fc26.x86_64
rm -rf /var/tmp/.guestfs-*

Comment 12 Richard W.M. Jones 2017-07-13 07:54:43 UTC
v2 posted:
https://www.redhat.com/archives/libguestfs/2017-July/msg00084.html

Comment 13 Richard W.M. Jones 2017-07-13 07:59:19 UTC
Upstream discussion of the kernel/ext4 issues:
https://marc.info/?l=linux-ext4&m=149987925520576&w=2
Click ‘Subject: Fast symlinks stored slow’ to see the full thread.

Comment 14 Fedora Update System 2017-07-13 11:09:58 UTC
supermin-5.1.18-1.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-f6b14e0e63

Comment 15 Fedora Update System 2017-07-13 11:10:17 UTC
supermin-5.1.18-1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-e587cfd70e

Comment 16 Richard W.M. Jones 2017-07-13 11:11:36 UTC
(In reply to David Hill from comment #10)
> I've tried using the previous 4.11.0-1 kernel and it worked as before.

There are new supermin builds for Fedora 25, 26 and Rawhide.  Please
see whether one of those works with the latest kernel:

https://koji.fedoraproject.org/koji/packageinfo?packageID=15420

Comment 17 David Hill 2017-07-13 13:58:15 UTC
This looks promising.  I've removed my exports and retried and it booted.  I'll try another build that I didn't modify and come back later on today.

Comment 18 Fedora Update System 2017-07-13 21:25:53 UTC
supermin-5.1.18-1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-e587cfd70e

Comment 19 Fedora Update System 2017-07-13 23:53:35 UTC
supermin-5.1.18-1.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-f6b14e0e63

Comment 20 Fedora Update System 2017-07-28 17:19:07 UTC
supermin-5.1.18-1.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 21 Fedora Update System 2017-07-28 20:49:20 UTC
supermin-5.1.18-1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 26 Richard W.M. Jones 2017-11-03 11:26:13 UTC
If you are looking for an updated supermin5 package for RHEL 7, see:

https://www.redhat.com/archives/libguestfs/2017-November/msg00006.html
"Libguestfs for RHEL 7.5 preview repository"

Comment 27 Richard W.M. Jones 2018-01-15 15:50:04 UTC
*** Bug 1534616 has been marked as a duplicate of this bug. ***