Bug 1548399 - XFS file systems created on VDO volumes are not coming up after reboot
Summary: XFS file systems created on VDO volumes are not coming up after reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gdeploy
Version: rhgs-3.3
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.3.1 Async
Assignee: Ramakrishna Reddy Yekulla
QA Contact: bipin
URL:
Whiteboard:
Depends On:
Blocks: 1548393 1581561
 
Reported: 2018-02-23 12:12 UTC by bipin
Modified: 2018-06-21 03:34 UTC
CC List: 13 users

Fixed In Version: gdeploy-2.0.2-27
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1548393
Environment:
Last Closed: 2018-06-21 03:33:15 UTC
Embargoed:




Links:
Red Hat Product Errata RHEA-2018:1958 (last updated 2018-06-21 03:34:48 UTC)

Description bipin 2018-02-23 12:12:50 UTC
+++ This bug was initially created as a clone of Bug #1548393 +++

Description of problem:

Deployed HE RHV 4.2 with ansible on RHEL-7.5, with VDO volumes created. After rebooting one of the hosts, the host went down. After logging in to the mm console, there were no device mapper paths for the gluster volumes that had been created. After commenting out the mount paths under /etc/fstab and rebooting, the host came up and the device mapper paths were present.


Version-Release number of selected component (if applicable):
kmod-kvdo-6.1.0.146-13.el7.x86_64
vdo-6.1.0.146-16.x86_64
glusterfs-3.12.2-4.el7rhgs.x86_64
glusterfs-fuse-3.12.2-4.el7rhgs.x86_64
rhvm-appliance-4.2-20180202.0.el7.noarch
rhv-release-4.2.2-1-001.noarch
gdeploy-2.0.2-22.el7rhgs.noarch
cockpit-ovirt-dashboard-0.11.11-0.1.el7ev.noarch

How reproducible:


Steps to Reproduce:
1. Deploy hosted engine RHV 4.2 on RHEL 7.5
2. Reboot one of the 3 POD hosts (in my case, a non-SPM host)
3. Observe the host going down

Actual results:
The host goes down after a reboot

Expected results:
The host should come up after a reboot 


Additional info:

Comment 4 SATHEESARAN 2018-03-02 14:55:12 UTC
The actual problem here is that when the server with a VDO volume is rebooted, systemd mounts the local filesystems and thus tries to mount the gluster XFS bricks listed in /etc/fstab.

But these filesystems are not yet available because the VDO volume has not been started yet, which causes the boot to fail and the host is dropped into the maintenance shell.

Here is VDO systemd config file.
# cat /usr/lib/systemd/system/vdo.service 
[Unit]
Description=VDO volume services
After=systemd-remount-fs.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/vdo start --all --confFile /etc/vdoconf.yml
ExecStop=/usr/bin/vdo stop --all --confFile /etc/vdoconf.yml

[Install]
WantedBy=multi-user.target

So the VDO service is only ordered after the filesystem remount service (systemd-remount-fs.service); nothing makes the XFS brick mounts from /etc/fstab wait for vdo.service to start.
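
As a quick sanity check, this ordering can be inspected with systemd's own tooling: 'systemctl show' prints the effective After=/Requires= lists for vdo.service, and the other two commands show which units the brick mount unit waits for at boot; without the fix, vdo.service does not appear among them. The mount unit name below assumes a brick mounted at /gluster_bricks/engine and is only illustrative.

# systemctl show -p After,Requires vdo.service
# systemd-analyze critical-chain gluster_bricks-engine.mount
# systemctl list-dependencies --after gluster_bricks-engine.mount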

Comment 5 SATHEESARAN 2018-03-02 15:03:54 UTC
The simple fix here is to add the mount option 'x-systemd.requires=vdo.service' to the XFS fstab entries corresponding to the gluster bricks.

The fstab entry then looks like this:

'/dev/mapper/vg1-lv1 /home/bricks xfs defaults,x-systemd.requires=vdo.service 0 0'

gdeploy should add this option after the XFS brick is created, whenever the VDO service is enabled on the underlying physical disk. This means the [lv] section has to carry that intelligence, i.e. know whether the volume is created on a plain physical disk or on a VDO volume.
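
With that option in place, a VDO-backed brick layout like the one in this deployment (engine/data/vmstore LVs on gluster_vg_sdd; paths shown here purely as an example) would end up with fstab entries like:

/dev/gluster_vg_sdd/gluster_lv_engine  /gluster_bricks/engine  xfs inode64,noatime,nodiratime,x-systemd.requires=vdo.service 0 0
/dev/gluster_vg_sdd/gluster_lv_data    /gluster_bricks/data    xfs inode64,noatime,nodiratime,x-systemd.requires=vdo.service 0 0
/dev/gluster_vg_sdd/gluster_lv_vmstore /gluster_bricks/vmstore xfs inode64,noatime,nodiratime,x-systemd.requires=vdo.service 0 0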

Comment 6 Sahina Bose 2018-04-11 06:08:52 UTC
Please also look at https://github.com/dm-vdo/vdo/issues/7

Dennis, can you confirm this is the preferred way to solve this issue?

Comment 7 Dennis Keefe 2018-04-11 13:40:38 UTC
The vdo.service unit will not mount filesystems on VDO volumes; it only starts the volumes.

You can use either the "x-systemd.requires=vdo.service" mount option or the mount unit file described in https://github.com/dm-vdo/vdo/issues/7.

This will need to be tested in conjunction with other service dependencies like
LVM and Gluster for the VDO volume to make sure that everything starts up correctly.
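
For reference, a minimal sketch of the mount-unit approach mentioned above, using the engine brick as an example; the unit name, device and mount paths are illustrative and not taken from the upstream issue. The unit replaces the corresponding fstab line and is enabled like any other unit:

# cat /etc/systemd/system/gluster_bricks-engine.mount
[Unit]
Description=Gluster brick on a VDO-backed LV
Requires=vdo.service
After=vdo.service

[Mount]
What=/dev/gluster_vg_sdd/gluster_lv_engine
Where=/gluster_bricks/engine
Type=xfs
Options=inode64,noatime,nodiratime

[Install]
WantedBy=multi-user.target

# systemctl daemon-reload
# systemctl enable --now gluster_bricks-engine.mount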

Comment 8 Devyani Kota 2018-04-20 07:47:41 UTC
This commit[1] fixes this issue.

[1]https://github.com/gluster/gdeploy/pull/504/commits/b89286197b37db7a21b037d46932ee31f4fb58c1

Thanks.

Comment 9 Sahina Bose 2018-04-30 05:09:29 UTC
The mount entries do not have the option as requested. I still see the same problem while rebooting. Can you check?

[root@rhsdev-grafton3 ~]# rpm -qa | grep gdeploy
gdeploy-2.0.2-26.el7rhgs.noarch
[root@rhsdev-grafton3 ~]# cat /etc/fstab

#
# /etc/fstab
# Created by anaconda on Thu Apr 26 14:22:06 2018
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/rhel_rhsdev--grafton3-root /                       xfs     defaults        0 0
UUID=2756b572-c6b0-4bd2-9be3-8a9232d48adc /boot                   xfs     defaults        0 0
/dev/mapper/rhel_rhsdev--grafton3-home /home                   xfs     defaults        0 0
/dev/mapper/rhel_rhsdev--grafton3-swap swap                    swap    defaults        0 0
/dev/gluster_vg_sdd/gluster_lv_engine /gluster_bricks/engine xfs inode64,noatime,nodiratime 0 0
/dev/gluster_vg_sdd/gluster_lv_data /gluster_bricks/data xfs inode64,noatime,nodiratime 0 0
/dev/gluster_vg_sdd/gluster_lv_vmstore /gluster_bricks/vmstore xfs inode64,noatime,nodiratime 0 0

#lsblk
NAME                                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                               8:0    0   931G  0 disk 
├─sda1                                            8:1    0     1G  0 part /boot
└─sda2                                            8:2    0   930G  0 part 
  ├─rhel_rhsdev--grafton2-root                  253:0    0    50G  0 lvm  /
  ├─rhel_rhsdev--grafton2-swap                  253:1    0     4G  0 lvm  [SWAP]
  └─rhel_rhsdev--grafton2-home                  253:2    0   876G  0 lvm  /home
sdb                                               8:16   0 185.8G  0 disk 
sdc                                               8:32   0 185.8G  0 disk 
sdd                                               8:48   0  18.2T  0 disk 
└─vdo_sdd                                       253:3    0 137.7T  0 vdo  
  ├─gluster_vg_sdd-gluster_thinpool_sdd_tmeta   253:4    0  15.8G  0 lvm  
  │ └─gluster_vg_sdd-gluster_thinpool_sdd-tpool 253:6    0  13.7T  0 lvm  
  │   ├─gluster_vg_sdd-gluster_thinpool_sdd     253:7    0  13.7T  0 lvm  
  │   ├─gluster_vg_sdd-gluster_lv_data          253:9    0   4.9T  0 lvm  /gluster_bricks/data
  │   └─gluster_vg_sdd-gluster_lv_vmstore       253:10   0   8.8T  0 lvm  /gluster_bricks/vmstore
  ├─gluster_vg_sdd-gluster_thinpool_sdd_tdata   253:5    0  13.7T  0 lvm  
  │ └─gluster_vg_sdd-gluster_thinpool_sdd-tpool 253:6    0  13.7T  0 lvm  
  │   ├─gluster_vg_sdd-gluster_thinpool_sdd     253:7    0  13.7T  0 lvm  
  │   ├─gluster_vg_sdd-gluster_lv_data          253:9    0   4.9T  0 lvm  /gluster_bricks/data
  │   └─gluster_vg_sdd-gluster_lv_vmstore       253:10   0   8.8T  0 lvm  /gluster_bricks/vmstore
  └─gluster_vg_sdd-gluster_lv_engine            253:8    0   100G  0 lvm  /gluster_bricks/engine

Comment 10 SATHEESARAN 2018-05-02 10:23:29 UTC
I see the same problem as described in comment 9.

Tested with gdeploy-2.0.2-26 and still hit the same problem.

Checked with Ramky on this, and he is working on a fix for the issue.

Comment 11 Sachidananda Urs 2018-05-22 09:15:41 UTC
The commit: https://github.com/gluster/gdeploy/pull/510/commits/a51b6fc149c should fix the issue. 

* Reboots work fine now.
* Tested with VDO and non-VDO disks

Comment 15 errata-xmlrpc 2018-06-21 03:33:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1958

