Bug 1475112 - DNF system-upgrade failed to properly shutdown system, fsck found errors on /home after upgrade
DNF system-upgrade failed to properly shutdown system, fsck found errors on /...
Status: NEW
Product: Fedora
Classification: Fedora
Component: mlocate (Show other bugs)
25
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: Michal Sekletar
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-07-26 01:41 EDT by Paul DeStefano
Modified: 2017-07-31 13:47 EDT (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Paul DeStefano 2017-07-26 01:41:24 EDT
Description of problem:
First boot after upgrade (via dnf system-upgrade reboot) errors found on /home.

Version-Release number of selected component (if applicable):
Up-to-date dnf for F25

How reproducible:
Unknown.

Steps to Reproduce:
1.  /home on separate LVM
2. dnf system-upgrade download
3. dnf system-upgrade reboot

Actual results:
/home doesn't mount, requires manual fsck

Expected results:
No data errors.

Additional info:
I've used dnf system-upgrade many times and never had this happen before.  But, /home wasn't even mounted during auto package install (right?), and there wasn't any errors when I booted immediately prior to dnf system-upgrade (which was a reboot after regular dnf upgrade) so these filesystem errors must have occurred after regular dnf upgrade and before the first updated boot.  I suspect dnf system-upgrade reboot shutdown the system too quickly.  I remember thinking it was unusually fast.  Is that possible?
Comment 1 Will Woods 2017-07-27 16:17:30 EDT
(In reply to Paul DeStefano from comment #0)
> I suspect dnf system-upgrade reboot shutdown the
> system too quickly.  I remember thinking it was unusually fast.  Is that
> possible?

It's unlikely. All `dnf system-upgrade reboot` does is make the magic symlink, write a state file, and run `systemctl reboot`. Furthermore, upgrades generally aren't going to touch /home, since no packaged files live there and DNF doesn't put caches there.

System upgrades can be *very* taxing on disks - it generally involves downloading several GB of data and then doing 30-90 minutes of sustained, heavy I/O. I've seen more than one system where this caused hardware to fail. Is there anything in the logs about disk errors during the upgrade?
Comment 2 Paul DeStefano 2017-07-28 01:19:22 EDT
Thanks for your help!

(In reply to Will Woods from comment #1)
> (In reply to Paul DeStefano from comment #0)
> > I suspect dnf system-upgrade reboot shutdown the
> > system too quickly.  I remember thinking it was unusually fast.  Is that
> > possible?
> 
> It's unlikely. All `dnf system-upgrade reboot` does is make the magic
> symlink, write a state file, and run `systemctl reboot`.

> Furthermore,
> upgrades generally aren't going to touch /home, since no packaged files live
> there and DNF doesn't put caches there.

Well, that was my point.  The upgrade process, itself, doesn't seem a likely root cause.  It had to have have happened before that.  But dnf system-update reboot was precisely what I did before that.  And not long before that, I did regular dnf upgrade and normal reboot, which didn't fail to mount /home, so it had to have happened after that.

> System upgrades can be *very* taxing on disks - it generally involves
> downloading several GB of data and then doing 30-90 minutes of sustained,
> heavy I/O. I've seen more than one system where this caused hardware to
> fail.

Yeah, that's for sure.  It ran rpm installs for over 3 hours on my system.  Ugh.  I need faster disk.  But, I do have new disks (7 months old) and new AM4 MB.

> Is there anything in the logs about disk errors during the upgrade?

Hmm, I assume so, let's see... Yes.  As I suspected, home did not unmount properly.  But the errors occur during the upgrade, not before.  That's very confusing.  I just assumed /home didn't even get mounted during the autoinstall process, but it clearly does.  But what would be writing there?  These errors correlate with mlocate, although, I cannot find this unit in systemd.  Could that be the straw that broke the camel's back?

Jul 25 17:22:07 <hostname> sudo[20667]: <username> : TTY=pts/33 ; PWD=/home/<username>/tmp ; USER=root ; COMMAND=/bin/bash -c dnf system-upgrade reboot
Jul 25 17:22:07 <hostname> system-python[20668]: Rebooting to perform upgrade.
Jul 25 17:22:07 <hostname> systemd-logind[1756]: System is rebooting.
Jul 25 17:22:07 <hostname> systemd[1]: Unmounted /home/<username>/mnt/psudrive.
Jul 25 17:22:07 <hostname> systemd[1]: Starting Show Plymouth Reboot Screen...
Jul 25 17:22:08 <hostname> blkdeactivate[20707]:   [UMOUNT]: unmounting vgNew-LVhome (dm-3) mounted on /home... skipping
Jul 25 17:22:08 <hostname> systemd[1]: Started Show Plymouth Reboot Screen.
Jul 25 17:22:39 <hostname> systemd[1]: Starting Console System Reboot Logging...
Jul 25 17:22:39 <hostname> systemd[1]: Stopping Update UTMP about System Boot/Shutdown...
Jul 25 17:22:39 <hostname> systemd[1]: Started Console System Reboot Logging.
Jul 25 17:22:39 <hostname> systemd[1]: Stopped Update UTMP about System Boot/Shutdown.
Jul 25 17:22:39 <hostname> systemd[1]: Unmounting /home...
Jul 25 17:22:39 <hostname> systemd[1]: Unmounting /boot/efi...
Jul 25 17:22:39 <hostname> systemd[1]: Unmounted /boot/efi.
Jul 25 17:22:39 <hostname> systemd[1]: Unmounting /boot...
Jul 25 17:22:39 <hostname> systemd[1]: Unmounted /boot.
Jul 25 17:22:39 <hostname> systemd[1]: Unmounted /home.
Jul 25 17:22:39 <hostname> systemd[1]: Stopped File System Check on /dev/mapper/vgNew-LVhome.
Jul 25 17:22:44 <hostname> systemd[1]: Starting Reboot...
-- Reboot --
Jul 25 17:23:16 <hostname> kernel: Command line: BOOT_IMAGE=/vmlinuz-4.11.11-200.fc25.x86_64 root=/dev/mapper/vgNew-LVroot ro rhgb rd.lvm.lv=vgNew/LVswap rd.dm=0 rd.md.uuid=325c9c27:b8fec7c0:bef88fc4:b4411970 rd.luks.uuid=luks-828643ee-e4a8-49a0-84a7-d2735d10e583 rd.lvm.lv=vgNew/LVroot SYSFONT=latarcyrheb-sun16 KEYMAP=us quiet LANG=en_US.UTF-8
Jul 25 17:23:16 <hostname> kernel: Secure boot could not be determined
Jul 25 17:23:16 <hostname> kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-4.11.11-200.fc25.x86_64 root=/dev/mapper/vgNew-LVroot ro rhgb rd.lvm.lv=vgNew/LVswap rd.dm=0 rd.md.uuid=325c9c27:b8fec7c0:bef88fc4:b4411970 rd.luks.uuid=luks-828643ee-e4a8-49a0-84a7-d2735d10e583 rd.lvm.lv=vgNew/LVroot SYSFONT=latarcyrheb-sun16 KEYMAP=us quiet LANG=en_US.UTF-8
Jul 25 17:23:16 <hostname> kernel: ata1: SATA max UDMA/133 abar m131072@0xfe780000 port 0xfe780100 irq 40
Jul 25 17:23:16 <hostname> kernel: ata2: SATA max UDMA/133 abar m131072@0xfe780000 port 0xfe780180 irq 40
Jul 25 17:23:16 <hostname> kernel: ata3: SATA max UDMA/133 abar m131072@0xfe780000 port 0xfe780200 irq 40
Jul 25 17:23:16 <hostname> kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 25 17:23:16 <hostname> kernel: ata1.00: ATA-8: KINGSTON SH103S3120G, 507KC4, max UDMA/133
Jul 25 17:23:16 <hostname> kernel: ata1.00: 234441648 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Jul 25 17:23:16 <hostname> kernel: ata1.00: configured for UDMA/133
Jul 25 17:23:16 <hostname> kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 25 17:23:16 <hostname> kernel: ata2.00: NCQ Send/Recv Log not supported
Jul 25 17:23:16 <hostname> kernel: ata2.00: ATA-9: ST1000DX002-2DV162, CC41, max UDMA/133
Jul 25 17:23:16 <hostname> kernel: ata2.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Jul 25 17:23:16 <hostname> kernel: ata2.00: NCQ Send/Recv Log not supported
Jul 25 17:23:16 <hostname> kernel: ata2.00: configured for UDMA/133
Jul 25 17:23:16 <hostname> kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jul 25 17:23:16 <hostname> dracut-cmdline[340]: Using kernel command line parameters: BOOT_IMAGE=/vmlinuz-4.11.11-200.fc25.x86_64 root=/dev/mapper/vgNew-LVroot ro rhgb rd.lvm.lv=vgNew/LVswap rd.dm=0 rd.md.uuid=325c9c27:b8fec7c0:bef88fc4:b4411970 rd.luks.uuid=luks-828643ee-e4a8-49a0-84a7-d2735d10e583 rd.lvm.lv=vgNew/LVroot SYSFONT=latarcyrheb-sun16 KEYMAP=us quiet LANG=en_US.UTF-8
Jul 25 17:23:16 <hostname> systemd[1]: Starting Show Plymouth Boot Screen...
Jul 25 17:23:16 <hostname> systemd[1]: Started Show Plymouth Boot Screen.
Jul 25 17:29:06 <hostname> dracut-initqueue[572]: inactive '/dev/vgNew/LVhome' [624.36 GiB] inherit
Jul 25 17:29:10 <hostname> systemd[1]: Found device KINGSTON_SH103S3120G Linux\x20\x2fboot.
Jul 25 17:29:10 <hostname> systemd[1]: Mounting /boot...
Jul 25 17:29:10 <hostname> systemd[1]: Mounted /boot.
Jul 25 17:29:10 <hostname> systemd[1]: Mounting /boot/efi...
Jul 25 17:29:10 <hostname> systemd[1]: Mounted /boot/efi.
Jul 25 17:29:12 <hostname> systemd[1]: Found device /dev/mapper/vgNew-LVhome.
Jul 25 17:29:12 <hostname> systemd[1]: Starting File System Check on /dev/mapper/vgNew-LVhome...
Jul 25 17:29:12 <hostname> systemd-fsck[5473]: /dev/mapper/vgNew-LVhome: clean, 527766/40919040 files, 145483965/163671040 blocks
Jul 25 17:29:12 <hostname> systemd[1]: Started File System Check on /dev/mapper/vgNew-LVhome.
Jul 25 17:29:12 <hostname> systemd[1]: Mounting /home...
Jul 25 17:29:12 <hostname> kernel: EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
Jul 25 17:29:12 <hostname> systemd[1]: Mounted /home.
Jul 25 17:29:14 <hostname> systemd[1]: Starting Update UTMP about System Boot/Shutdown...
Jul 25 17:29:14 <hostname> systemd[1]: Started Update UTMP about System Boot/Shutdown.
Jul 25 17:30:02 <hostname> dnf[5544]:  ghc-ghc-boot-th           x86_64 8.0.2-57.fc26              @commandline  68 k
Jul 25 17:30:02 <hostname> dnf[5544]:  efibootmgr                x86_64 15-1.fc26                  @commandline  43 k
Jul 25 17:34:00 <hostname> dnf[5544]:   Installing  : ghc-ghc-boot-th-8.0.2-57.fc26.x86_64                   146/9797
Jul 25 17:37:48 <hostname> dnf[5544]:   Upgrading   : perl-File-HomeDir-1.00-13.fc26.noarch                  484/9797
Jul 25 18:19:48 <hostname> systemd[1]: Started Updates mlocate database every day.
Jul 25 18:19:48 <hostname> systemd[1]: Started Update a database for mlocate.
Jul 25 18:19:49 <hostname> dnf[5544]:   Upgrading   : mlocate-0.26-16.fc26.x86_64                           2719/9797
Jul 25 18:19:50 <hostname> kernel: EXT4-fs error (device dm-3): ext4_iget:4776: inode #8265359: block 11: comm updatedb: invalid block
Jul 25 18:19:51 <hostname> kernel: EXT4-fs error (device dm-3): ext4_iget:4776: inode #8466118: block 11: comm updatedb: invalid block
Jul 25 20:32:48 <hostname> dracut[18516]: Executing: /usr/bin/dracut -f /boot/initramfs-4.11.11-300.fc26.x86_64.img 4.11.11-300.fc26.x86_64
Jul 25 20:33:04 <hostname> dracut[18516]: *** Creating image file '/boot/initramfs-4.11.11-300.fc26.x86_64.img' ***
Jul 25 20:33:10 <hostname> dracut[18516]: *** Creating initramfs image file '/boot/initramfs-4.11.11-300.fc26.x86_64.img' done ***
Jul 25 20:51:23 <hostname> systemd-logind[5558]: System is rebooting.
Jul 25 20:51:23 <hostname> systemd[1]: Starting Show Plymouth Reboot Screen...
Jul 25 20:51:23 <hostname> systemd[1]: Started Show Plymouth Reboot Screen.
Jul 25 20:51:23 <hostname> systemd[1]: Unmounted /home.
Jul 25 20:51:23 <hostname> systemd[1]: Stopped File System Check on /dev/mapper/vgNew-LVhome.
Jul 25 20:51:23 <hostname> blkdeactivate[30270]:   [UMOUNT]: unmounting vgNew-LVhome (dm-3) mounted on /home... done
Jul 25 20:51:24 <hostname> systemd[1]: Starting Console System Reboot Logging...
Jul 25 20:51:24 <hostname> systemd[1]: Stopping Update UTMP about System Boot/Shutdown...
Jul 25 20:51:24 <hostname> systemd[1]: Stopped Update UTMP about System Boot/Shutdown.
Jul 25 20:51:24 <hostname> systemd[1]: Unmounting /boot/efi...
Jul 25 20:51:24 <hostname> systemd[1]: Started Console System Reboot Logging.
Jul 25 20:51:24 <hostname> systemd[1]: Unmounted /boot/efi.
Jul 25 20:51:24 <hostname> systemd[1]: Unmounting /boot...
Jul 25 20:51:24 <hostname> systemd[1]: Unmounted /boot.
Jul 25 20:51:29 <hostname> systemd[1]: Starting Reboot...
-- Reboot --
Jul 25 20:52:00 <hostname> kernel: Command line: BOOT_IMAGE=/vmlinuz-4.11.11-300.fc26.x86_64 root=/dev/mapper/vgNew-LVroot ro rhgb rd.lvm.lv=vgNew/LVswap rd.dm=0 rd.md.uuid=325c9c27:b8fec7c0:bef88fc4:b4411970 rd.luks.uuid=luks-828643ee-e4a8-49a0-84a7-d2735d10e583 rd.lvm.lv=vgNew/LVroot SYSFONT=latarcyrheb-sun16 KEYMAP=us quiet LANG=en_US.UTF-8
Jul 25 20:52:00 <hostname> kernel: Secure boot disabled
Jul 25 20:52:00 <hostname> kernel: smpboot: Allowing 16 CPUs, 0 hotplug CPUs
Jul 25 20:52:00 <hostname> kernel: Booting paravirtualized kernel on bare hardware
Jul 25 20:52:00 <hostname> kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-4.11.11-300.fc26.x86_64 root=/dev/mapper/vgNew-LVroot ro rhgb rd.lvm.lv=vgNew/LVswap rd.dm=0 rd.md.uuid=325c9c27:b8fec7c0:bef88fc4:b4411970 rd.luks.uuid=luks-828643ee-e4a8-49a0-84a7-d2735d10e583 rd.lvm.lv=vgNew/LVroot SYSFONT=latarcyrheb-sun16 KEYMAP=us quiet LANG=en_US.UTF-8
Jul 25 20:52:00 <hostname> kernel: smpboot: Max logical packages: 2
Jul 25 20:52:00 <hostname> kernel: smpboot: CPU0: AMD Ryzen 7 1800X Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
Jul 25 20:52:00 <hostname> kernel: x86: Booting SMP configuration:
Jul 25 20:52:00 <hostname> kernel: smpboot: Total of 16 processors activated (122744.68 BogoMIPS)
Jul 25 20:52:00 <hostname> kernel: pci 0000:28:00.0: vgaarb: setting as boot VGA device
Jul 25 20:52:00 <hostname> kernel: ata1: SATA max UDMA/133 abar m131072@0xfe780000 port 0xfe780100 irq 40
Jul 25 20:52:00 <hostname> kernel: ata2: SATA max UDMA/133 abar m131072@0xfe780000 port 0xfe780180 irq 40
Jul 25 20:52:00 <hostname> kernel: ata3: SATA max UDMA/133 abar m131072@0xfe780000 port 0xfe780200 irq 40
Jul 25 20:52:00 <hostname> kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 25 20:52:00 <hostname> kernel: ata1.00: ATA-8: KINGSTON SH103S3120G, 507KC4, max UDMA/133
Jul 25 20:52:00 <hostname> kernel: ata1.00: 234441648 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Jul 25 20:52:00 <hostname> kernel: ata1.00: configured for UDMA/133
Jul 25 20:52:00 <hostname> kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 25 20:52:00 <hostname> kernel: ata2.00: NCQ Send/Recv Log not supported
Jul 25 20:52:00 <hostname> kernel: ata2.00: ATA-9: ST1000DX002-2DV162, CC41, max UDMA/133
Jul 25 20:52:00 <hostname> kernel: ata2.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Jul 25 20:52:00 <hostname> kernel: ata2.00: NCQ Send/Recv Log not supported
Jul 25 20:52:00 <hostname> kernel: ata2.00: configured for UDMA/133
Jul 25 20:52:00 <hostname> kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jul 25 20:52:00 <hostname> dracut-cmdline[339]: Using kernel command line parameters: BOOT_IMAGE=/vmlinuz-4.11.11-300.fc26.x86_64 root=/dev/mapper/vgNew-LVroot ro rhgb rd.lvm.lv=vgNew/LVswap rd.dm=0 rd.md.uuid=325c9c27:b8fec7c0:bef88fc4:b4411970 rd.luks.uuid=luks-828643ee-e4a8-49a0-84a7-d2735d10e583 rd.lvm.lv=vgNew/LVroot SYSFONT=latarcyrheb-sun16 KEYMAP=us quiet LANG=en_US.UTF-8
Jul 25 20:52:00 <hostname> systemd[1]: Starting Show Plymouth Boot Screen...
Jul 25 20:52:00 <hostname> systemd[1]: Started Show Plymouth Boot Screen.
Jul 25 20:52:15 <hostname> dracut-initqueue[553]: inactive '/dev/vgNew/LVhome' [624.36 GiB] inherit
Jul 25 20:52:26 <hostname> systemd[1]: Found device KINGSTON_SH103S3120G Linux\x20\x2fboot.
Jul 25 20:52:27 <hostname> systemd[1]: Mounting /boot...
Jul 25 20:52:27 <hostname> systemd[1]: Mounted /boot.
Jul 25 20:52:27 <hostname> systemd[1]: Mounting /boot/efi...
Jul 25 20:52:27 <hostname> systemd[1]: Mounted /boot/efi.
Jul 25 20:52:29 <hostname> systemd[1]: Found device /dev/mapper/vgNew-LVhome.
Jul 25 20:52:29 <hostname> systemd[1]: Starting File System Check on /dev/mapper/vgNew-LVhome...
Jul 25 20:52:34 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome contains a file system with errors, check forced.
Jul 25 20:52:34 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 5792 extent tree (at level 1) could be shorter.  IGNORED.
Jul 25 20:52:35 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 658735 extent tree (at level 1) could be shorter.  IGNORED.
Jul 25 20:52:35 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 658821 extent tree (at level 1) could be shorter.  IGNORED.
Jul 25 20:52:36 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 667084 extent tree (at level 1) could be shorter.  IGNORED.
Jul 25 20:52:36 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 667090 extent tree (at level 1) could be shorter.  IGNORED.
Jul 25 20:52:36 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 667094 extent tree (at level 1) could be shorter.  IGNORED.
Jul 25 20:52:42 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 1447928 extent tree (at level 2) could be narrower.  IGNORED.
Jul 25 20:52:46 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 3801781 extent tree (at level 2) could be narrower.  IGNORED.
Jul 25 20:52:46 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 3804647 extent tree (at level 2) could be narrower.  IGNORED.
Jul 25 20:52:46 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 3804814 extent tree (at level 2) could be narrower.  IGNORED.
Jul 25 20:52:47 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: Inode 8265359 has INLINE_DATA_FL flag but extended attribute not found.
Jul 25 20:52:47 <hostname> systemd-fsck[1429]: /dev/mapper/vgNew-LVhome: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
Jul 25 20:52:48 <hostname> systemd[1]: systemd-fsck@dev-mapper-vgNew\x2dLVhome.service: Main process exited, code=exited, status=1/FAILURE
Jul 25 20:52:48 <hostname> systemd[1]: Failed to start File System Check on /dev/mapper/vgNew-LVhome.
Jul 25 20:52:48 <hostname> systemd[1]: Dependency failed for /home.
Jul 25 20:52:48 <hostname> systemd[1]: Dependency failed for Mark the need to relabel after reboot.
Jul 25 20:52:48 <hostname> systemd[1]: home.mount: Job home.mount/start failed with result 'dependency'.
Jul 25 20:52:48 <hostname> systemd[1]: systemd-fsck@dev-mapper-vgNew\x2dLVhome.service: Unit entered failed state.
Jul 25 20:52:48 <hostname> systemd[1]: systemd-fsck@dev-mapper-vgNew\x2dLVhome.service: Failed with result 'exit-code'.
Comment 3 Will Woods 2017-07-31 13:00:35 EDT
(In reply to Paul DeStefano from comment #2)
> Thanks for your help!

Thanks for taking the time to actually look for the root cause of the problem!

> (In reply to Will Woods from comment #1)
> > Is there anything in the logs about disk errors during the upgrade?
> 
> Hmm, I assume so, let's see... Yes.  As I suspected, home did not unmount
> properly.  But the errors occur during the upgrade, not before.  That's very
> confusing.  I just assumed /home didn't even get mounted during the
> autoinstall process, but it clearly does.  But what would be writing there? 

All local disks should be mounted during the upgrade process, so /home is probably mounted. But no Fedora packages write files to /home, so the upgrade generally doesn't need to do much I/O on that filesystem. (But see below about mlocate..)

Also: we're talking about the underlying hardware here. It does look like your root partition and /home are in the same VG, so they are sharing the same disk(s). So, in theory, disk troubles during the upgrade could cause problems with /home.

> These errors correlate with mlocate, although, I cannot find this unit in
> systemd.  

It looks like mlocate ships two systemd unit files:

    /usr/lib/systemd/system/mlocate-updatedb.service
    /usr/lib/systemd/system/mlocate-updatedb.timer

(I use "rpm -ql mlocate | grep /lib/systemd" as a quick way to see what units are in a given package - or "dnf repoquery -l PACKAGENAME | grep ..." if the package isn't installed..)

> Could that be the straw that broke the camel's back?

Well, I can't tell for sure, but mlocate-updatedb definitely involves doing a lot of I/O, across all mounted filesystems (including /home!), as it reads metadata for every file that's changed on disk since the last update. Which (if it was started during an upgrade, as it seems to have done here) would be nearly everything.

That log really does look like running updatedb during the system upgrade resulted in filesystem errors:

> Jul 25 18:19:48 <hostname> systemd[1]: Started Updates mlocate database
> every day.
> Jul 25 18:19:48 <hostname> systemd[1]: Started Update a database for mlocate.
> Jul 25 18:19:49 <hostname> dnf[5544]:   Upgrading   :
> mlocate-0.26-16.fc26.x86_64                           2719/9797
> Jul 25 18:19:50 <hostname> kernel: EXT4-fs error (device dm-3):
> ext4_iget:4776: inode #8265359: block 11: comm updatedb: invalid block
> Jul 25 18:19:51 <hostname> kernel: EXT4-fs error (device dm-3):
> ext4_iget:4776: inode #8466118: block 11: comm updatedb: invalid block

I don't think updatedb *caused* the filesystem errors, but at the very least it's probably better for upgrade speed and disk reliability if we can avoid running it in the middle of a system upgrade. (At the very least it'll make the upgrade finish quicker...)

So maybe mlocate-updatedb.timer should have:

  Conflicts=system-update.target
  After=system-update.target

to prevent it from running during system updates/upgrades?

Or maybe, more generally, *all* timers should default to that? (That's really a question for the systemd maintainers though..)

Reassigning to mlocate to get opinions from its maintainer(s).
Comment 4 Paul DeStefano 2017-07-31 13:47:22 EDT
(In reply to Will Woods from comment #3)
> Thanks for taking the time to actually look for the root cause of the
> problem!

Of course!  I just need a little guidance, but always willing to troubleshoot.

> Also: we're talking about the underlying hardware here. It does look like
> your root partition and /home are in the same VG, so they are sharing the
> same disk(s). So, in theory, disk troubles during the upgrade could cause
> problems with /home.

Ah, yes, good point.  That makes complete sense.

> It looks like mlocate ships two systemd unit files:
> 
>     /usr/lib/systemd/system/mlocate-updatedb.service
>     /usr/lib/systemd/system/mlocate-updatedb.timer
> 
> (I use "rpm -ql mlocate | grep /lib/systemd" as a quick way to see what
> units are in a given package - or "dnf repoquery -l PACKAGENAME | grep ..."
> if the package isn't installed..)

Thanks, that's useful.  Side note, why don't these units show up in systemctl list-units?  Is still don't understand systemd.

> I don't think updatedb *caused* the filesystem errors, but at the very least
> it's probably better for upgrade speed and disk reliability if we can avoid
> running it in the middle of a system upgrade. (At the very least it'll make
> the upgrade finish quicker...)

Sure, I follow.  That seems reasonable to me, but you guys know better.

Awesome!  Thanks for your help.  Let me know if you need any more information.

Note You need to log in before you can comment on or make changes to this bug.