1638922 – [AIO] standalone deployment cinder-volume storage does not survive a machine reboot


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-tripleo-heat-templates
Sub Component:
Version:	14.0 (Rocky)
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	beta
Target Release:	14.0 (Rocky)
Assignee:	Alan Bishop
QA Contact:	Gurenko Alex
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-10-12 20:14 UTC by Fabio Massimo Di Nitto
Modified:	2019-01-11 11:54 UTC
CC List:	6 users
Fixed In Version:	openstack-tripleo-heat-templates-9.0.1-0.20181013060868.ffbe879.el7ost
Doc Type:	Bug Fix
Doc Text:	Previously, the loopback device for Cinder iSCSI/LVM backend was not recreated after a system restart, which prevented the cinder-volume service from restarting. This fix adds a systemd service that recreates the loopback device and therefore persists the Cinder iSCSI/LVM backend after a restart.
Clone Of:
Environment:
Last Closed:	2019-01-11 11:53:55 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
OpenStack gerrit	597202	0	'None'	MERGED	Recreate cinder LVM loopback device on startup	2020-11-17 12:58:15 UTC
Red Hat Product Errata	RHEA-2019:0045	0	None	None	None	2019-01-11 11:54:03 UTC

Description Fabio Massimo Di Nitto 2018-10-12 20:14:41 UTC

Deploying AIO without ceph with the following parameters:

parameter_defaults:
  CloudName: osp
  # default gateway
  ControlPlaneStaticRoutes:
    - ip_netmask: 0.0.0.0/0
      next_hop: 192.168.0.1
      default: true
  Debug: true
  DeploymentUser: stack
  DnsServers:
    - 192.168.0.1
  NtpServer:
    - 192.168.0.1
  # needed for vip & pacemaker
  KernelIpNonLocalBind: 1
  DockerInsecureRegistryAddress:
    - osp.int.fabbione.net:8787
    - docker-registry.engineering.redhat.com
  NeutronPublicInterface: eth0
  # domain name used by the host
  NeutronDnsDomain: stoca
  # i'm just adding random flags pretending i know what i'm doing
  # stop pretending you all-mighty
  NeutronEnableInternalDNS: true
  # re-use ctlplane bridge for public net, defined in the standalone
  # net config (do not change unless you know what you're doing)
  NeutronBridgeMappings: datacentre:br-ctlplane
  NeutronPhysicalBridge: br-ctlplane
  # enable to force metadata for public net
  #NeutronEnableForceMetadata: true
  StandaloneEnableRoutedNetworks: false
  StandaloneHomeDir: /home/stack
  StandaloneLocalMtu: 1500
  # Needed if running in a VM, not needed if on baremetal
  #StandaloneExtraConfig:
  #  nova::compute::libvirt::services::libvirt_virt_type: qemu
  #  nova::compute::libvirt::libvirt_virt_type: qemu
  HeatEngineOptVolumes:
    - /usr/lib/heat:/usr/lib/heat:ro

resource_registry:
  OS::TripleO::Services::HeatApi: /usr/share/openstack-tripleo-heat-templates/docker/services/heat-api.yaml
  OS::TripleO::Services::HeatApiCfn: /usr/share/openstack-tripleo-heat-templates/docker/services/heat-api-cfn.yaml
  OS::TripleO::Services::HeatEngine: /usr/share/openstack-tripleo-heat-templates/docker/services/heat-engine.yaml

sudo openstack tripleo deploy \
 --templates \
 --local-ip=192.168.0.202/22 \
 -e /usr/share/openstack-tripleo-heat-templates/environments/standalone.yaml \
 -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
 -e $HOME/containers-prepare-parameters.yaml \
 -e $HOME/standalone_parameters.yaml \
 --output-dir $HOME/workdir \
 --standalone

The default cinder-volume storage is configured to use iSCSI, and the iSCSI backend currently uses a loopback device backed by a file:

[root@osp ~]# losetup -a
/dev/loop2: [64768]:904031 (/var/lib/cinder/cinder-volumes)

The problem is that there are no default facilities in Linux (not just RHEL) to losetup a loopback file at boot.

Upon reboot, loop2 is not configured and the iscsi/cinder-volume services stop functioning properly:

2018-10-12 18:52:49.691 64 INFO cinder.volume.manager [req-8ce372ac-9886-457f-9c8d-36916252f401 - - - - -] Initializing RPC dependent components of volume driver LVMVolumeDriver (3.0.0)
2018-10-12 18:52:49.691 64 ERROR cinder.utils [req-8ce372ac-9886-457f-9c8d-36916252f401 - - - - -] Volume driver LVMVolumeDriver not initialized
2018-10-12 18:52:49.691 64 ERROR cinder.volume.manager [req-8ce372ac-9886-457f-9c8d-36916252f401 - - - - -] Cannot complete RPC initialization because driver isn't initialized properly.: DriverNotInitialized: Volume driver not ready.
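
One way to confirm the symptom after a reboot is to check whether the backing file is still attached to a loop device. The following is a minimal sketch that parses output in the `losetup -a` format shown earlier; the helper name loop_device_for is hypothetical, and the parsing assumes the backing path contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper: read `losetup -a` style output on stdin and print the
# loop device attached to the given backing file. Exits non-zero if none.
loop_device_for() {
    backing_file=$1
    # Expected line format (taken from the report):
    #   /dev/loop2: [64768]:904031 (/var/lib/cinder/cinder-volumes)
    awk -v f="($backing_file)" '
        $NF == f { sub(/:$/, "", $1); print $1; found = 1 }
        END      { exit found ? 0 : 1 }
    '
}

# On a live system (as root):
#   losetup -a | loop_device_for /var/lib/cinder/cinder-volumes
```

If the function prints nothing and returns a non-zero status, the loop device was not recreated, which matches the DriverNotInitialized errors above.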

That said, after some searching and testing, the only viable option I found was to introduce a custom systemd unit file (http://www.anthonyldechiaro.com/blog/2010/12/19/lvm-loopback-how-to/).

[root@osp ~]# cat /etc/systemd/system/cinder-volume-loopback.service 
[Unit]
Description=Activate Cinder Volume Loopback device
DefaultDependencies=no
After=systemd-udev-settle.service
Before=lvm2-activation-early.service
Wants=systemd-udev-settle.service

[Service]
ExecStart=/sbin/losetup /dev/loop2 /var/lib/cinder/cinder-volumes
Type=oneshot

[Install]
WantedBy=local-fs.target

systemctl enable cinder-volume-loopback

Enabling the unit returns the system to the correct state after a reboot.
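
The unit above runs losetup unconditionally, so starting it while /dev/loop2 is already attached would fail. The following is a minimal sketch of an idempotent variant of that step. It is an illustration only, not the fix that was merged upstream; the function name ensure_loop and the LOSETUP override are hypothetical, introduced so the logic can be tried without root.

```shell
#!/bin/sh
# LOSETUP can be overridden (assumption for testing); defaults to the
# binary used in the unit file from the report.
LOSETUP=${LOSETUP:-/sbin/losetup}

# Attach backing_file to device only if the device is not already set up.
ensure_loop() {
    device=$1
    backing_file=$2
    # `losetup DEVICE` exits 0 only when the device is already configured.
    if "$LOSETUP" "$device" >/dev/null 2>&1; then
        echo "already attached: $device"
    else
        "$LOSETUP" "$device" "$backing_file" && echo "attached: $device"
    fi
}

# On a live system (as root):
#   ensure_loop /dev/loop2 /var/lib/cinder/cinder-volumes
```

A script like this could be called from ExecStart in place of the bare losetup invocation, so that restarting the unit on a running system is harmless.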

Side notes:
1) Adding both DFG:DF and DFG:Storage since it affects both.
2) I don't know if this storage configuration is supported or not. If it's not, then this BZ should turn into an RFE to change the default backend. I am no storage expert; I just notice when my volumes disappear :P
3) Severity: High due to the impact of the problem; Priority: Medium since AIO is still TP.

Comment 1 Michele Baldessari 2018-10-12 20:25:44 UTC

Note that this has been closed as WONTFIX in the past as LVM on loopback is not considered for production use (https://bugzilla.redhat.com/show_bug.cgi?id=1241644)

Comment 2 Fabio Massimo Di Nitto 2018-10-13 04:47:48 UTC

(In reply to Michele Baldessari from comment #1)
> Note that this has been closed as WONTFIX in the past as LVM on loopback is
> not considered for production use
> (https://bugzilla.redhat.com/show_bug.cgi?id=1241644)

Noted, but then the default storage should be changed (see also side note 2 in comment #0).

That said, since this is a single-node deployment that doesn't suffer from HA complexity, it might be treated differently.

Comment 3 Alan Bishop 2018-10-17 15:26:23 UTC

Upstream patch has (nearly) merged, and I'll propose it for Rocky so we can get this fixed for OSP-14.

Comment 4 Alan Bishop 2018-10-21 15:25:48 UTC

Patch has merged on master, and has been proposed for stable/rocky.

Comment 13 errata-xmlrpc 2019-01-11 11:53:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045