Bug 1326232

Summary: [RFE] Validate basic Node status during boot
Product: [oVirt] ovirt-node
Reporter: Fabian Deutsch <fdeutsch>
Component: General
Assignee: Fabian Deutsch <fdeutsch>
Status: CLOSED CURRENTRELEASE
QA Contact: cshao <cshao>
Severity: high
Docs Contact:
Priority: high
Version: 4.0
CC: bugs, dfediuck, mgoldboi, tlitovsk, ycui, ylavi
Target Milestone: ovirt-4.0.1
Keywords: FutureFeature
Target Release: 4.0
Flags: rule-engine: ovirt-4.0.z+, rule-engine: exception+, mgoldboi: planning_ack+, fdeutsch: devel_ack+, ycui: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-node-ng-installer-ovirt-4.0-snapshot-2016061710.iso
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-04 13:32:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1356611    
Bug Blocks: 1140646, 1374562    

Description Fabian Deutsch 2016-04-12 08:27:23 UTC
Description of problem:
Some problems with Node can be detected at boot.
If such a problem is detected, a message should be written to issues/motd to notify the user.

We need to raise problems early so that nothing goes wrong later on.

Such problems are:
- Installation failed, no correct layout (imgbase layout fails)
- The VG/pool is running out of space
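The second problem in the list could be detected with a check along these lines. This is only a sketch: the function name, the 80% threshold, and the assumption that the fill level arrives as the text of a single percentage value (for example from something like `lvs --noheadings -o data_percent`) are illustrative and not taken from imgbased.

```python
def pool_running_out_of_space(data_percent_text, threshold=80.0):
    """Return True if the thin pool fill level reaches the threshold.

    data_percent_text is assumed to be the raw text of one percentage
    value, e.g. "  42.17\n" (hypothetical input format). The 80% default
    is an illustrative choice, not a value from the bug report.
    """
    used = float(data_percent_text.strip())
    return used >= threshold


if __name__ == "__main__":
    print(pool_running_out_of_space("  42.17\n"))
    print(pool_running_out_of_space("  93.50\n"))
```

Keeping the check a pure function of the reported value makes it easy to exercise without a real volume group.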

Version-Release number of selected component (if applicable):
4.0

How reproducible:
Always

Comment 1 Sandro Bonazzola 2016-05-02 09:53:34 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

Comment 2 Fabian Deutsch 2016-05-31 09:19:21 UTC
How to test:

1. Install
2. Reboot host
3. Log in on a terminal

After step 3, there should be a status message "imgbased status: OK" or "imgbased status: DEGRADED".
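How the status message described above might be derived from individual checks can be sketched as follows. The function name, the shape of the input mapping, and the rule that every check must pass for "OK" are assumptions; only the wording of the message comes from this comment.

```python
def status_line(check_results):
    """Build the login message from a mapping of check name -> passed.

    Assumption: the status is OK only if every check passed,
    otherwise DEGRADED. The message wording follows the comment above.
    """
    status = "OK" if all(check_results.values()) else "DEGRADED"
    return "imgbased status: " + status


if __name__ == "__main__":
    # Hypothetical check names, for illustration only.
    print(status_line({"layout": True, "thinpool space": True}))
    print(status_line({"layout": True, "thinpool space": False}))
```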

Comment 3 Fabian Deutsch 2016-06-15 09:20:24 UTC
A patch was missed.

Comment 4 Red Hat Bugzilla Rules Engine 2016-06-15 14:17:59 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 5 Fabian Deutsch 2016-06-15 18:46:26 UTC
The remaining problem is that systemd presets don't work nicely, and systemctl enable doesn't work nicely either, because it only changes /etc, which should actually only be touched by the user.

This is an attempt to fix it:
https://gerrit.ovirt.org/59280 node: Make some units enabled by default [DRAFT]

Another idea is to let imgbased move /etc/systemd to some vendor-specific directory in the postprocessing.
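To illustrate the difference between touching /etc and using a vendor location, here is a minimal sketch that enables a unit by creating the ".wants" symlink under usr/lib/systemd/system instead of etc/systemd/system. It is not what the gerrit patch or imgbased does; the function name, the `root` parameter, the unit name "example.service", and the multi-user.target default are all hypothetical.

```python
import os
import tempfile


def enable_in_vendor_dir(root, unit, target="multi-user.target"):
    """Create <root>/usr/lib/systemd/system/<target>.wants/<unit>.

    root lets the sketch run against a scratch tree instead of "/".
    Unit and target names are illustrative assumptions.
    """
    wants_dir = os.path.join(root, "usr/lib/systemd/system", target + ".wants")
    os.makedirs(wants_dir, exist_ok=True)
    link = os.path.join(wants_dir, unit)
    if not os.path.islink(link):
        # Relative link to the unit file one directory up.
        os.symlink(os.path.join("..", unit), link)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(enable_in_vendor_dir(scratch, "example.service"))
```

Because nothing is written below etc/, the user's own configuration stays untouched, which is the concern raised in this comment.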

Comment 6 cshao 2016-07-13 08:49:51 UTC
Test version:
redhat-virtualization-host-4.0-20160708.0 
imgbased-0.7.2-0.1.el7ev.noarch
redhat-release-virtualization-host-4.0-0.13.el7.x86_64
cockpit-0.108-1.el7.x86_64

Test steps:
1. Install RHVH
2. Reboot host
3. Log in on a terminal
4. imgbase check
5. Log in to Cockpit -> Dashboard page.

Test result:
1. After step 4,
# imgbase check
Status: OK
Mount points ... OK
  Separate /var ... OK
  Discard is used ... OK
Basic storage ... OK
  Initialized VG ... OK
  Initialized Thin Pool ... OK
  Initialized LVs ... OK
Thin storage ... OK
  Checking available space in thinpool ... OK
  Checking thinpool auto-extend ... OK
2. After step 5,
Status
⊖ Thin storage
Checking available space in thinpool
Checking thinpool auto-extend
⊖ Basic storage
Initialized thin pool
Initialized lvs
Initialized vg
vdsmd
⊖ Mount points
Discard is used
Separate var

The basic Node status can be detected, so the bug is fixed.
Changing bug status to VERIFIED.

Comment 7 Ying Cui 2016-07-13 13:00:28 UTC
This is an RFE bug, so we need to consider the following to completely verify it. If new issues are found, we need to open new bugs to track them, but at least we need to cover these scenarios.

1. Check the imgbase-motd.service default status; it should be running.
2. We also need negative test cases, like
   - umount some LV, then check the Node status for mount points. Any exception or error?
   - installation without a thin pool: what happens to the Node status? Any exception or error?
   - installation without LVM: what happens to the Node status? Any exception or error?
    ...
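Item 1 above could be scripted along these lines. This is a sketch, not part of any oVirt tool: it relies only on `systemctl is-active --quiet <unit>` returning exit code 0 for an active unit, and the injectable `run` parameter is a hypothetical convenience so the function can be exercised without systemd.

```python
import subprocess


def unit_is_active(unit, run=subprocess.run):
    """Return True if `systemctl is-active --quiet <unit>` exits with 0.

    run defaults to subprocess.run; a stand-in can be passed for testing.
    """
    result = run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


if __name__ == "__main__":
    print(unit_is_active("imgbase-motd.service"))
```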

Comment 8 cshao 2016-07-14 13:45:19 UTC
(In reply to Ying Cui from comment #7)
> This is an RFE bug, so we need to consider the following to completely verify it.
> If new issues are found, we need to open new bugs to track them, but at least we
> need to cover these scenarios.
> 
> 1. Check the imgbase-motd.service default status; it should be running.
The imgbase-motd.service is inactive by default.
Bug 1356611 tracks this issue.

> 2. We also need negative test cases, like
>    - umount some LV, then check the Node status for mount points. Any
> exception or error?
umount lv, check node status - pass

# imgbase check
Status: OK
Mount points ... OK
  Separate /var ... OK
  Discard is used ... OK
Basic storage ... OK
  Initialized VG ... OK
  Initialized Thin Pool ... OK
  Initialized LVs ... OK
Thin storage ... OK
  Checking available space in thinpool ... OK
  Checking thinpool auto-extend ... OK


No exception after reboot.

>    - installation without a thin pool: what happens to the Node status? Any
> exception or error?

Pass: no exception.

>    - installation without LVM: what happens to the Node status? Any exception
> or error?
>     ...

Pass: no exception.


However, because imgbase-motd.service is inactive by default, I will verify this bug after bug 1356611 is fixed.

Thanks to ycui for the note.

Comment 9 Ryan Barry 2016-07-14 14:22:59 UTC
(In reply to Ying Cui from comment #7)
> This is an RFE bug, so we need to consider the following to completely verify it.
> If new issues are found, we need to open new bugs to track them, but at least we
> need to cover these scenarios.
> 
> 2. We also need negative test cases, like
>    - umount some LV, then check the Node status for mount points. Any
> exception or error?
>    - installation without a thin pool: what happens to the Node status? Any
> exception or error?
>    - installation without LVM: what happens to the Node status? Any exception
> or error?
>     ...

As a request, for these scenarios, please use "imgbase check" -- the dashboard is polling this (through nodectl), and any failures are most easily traced from here.

Comment 10 cshao 2016-07-14 14:26:48 UTC
(In reply to Ryan Barry from comment #9)
> (In reply to Ying Cui from comment #7)
> > This is an RFE bug, so we need to consider the following to completely verify it.
> > If new issues are found, we need to open new bugs to track them, but at least we
> > need to cover these scenarios.
> > 
> > 2. We also need negative test cases, like
> >    - umount some LV, then check the Node status for mount points. Any
> > exception or error?
> >    - installation without a thin pool: what happens to the Node status? Any
> > exception or error?
> >    - installation without LVM: what happens to the Node status? Any exception
> > or error?
> >     ...
> 
> As a request, for these scenarios, please use "imgbase check" -- the
> dashboard is polling this (through nodectl), and any failures are most
> easily traced from here.

OK, thanks for your advice.

Comment 11 cshao 2016-07-26 10:05:05 UTC
The dependency, bug 1356611, is fixed, and according to #c6 and #c8 this bug is fixed. Changing bug status to VERIFIED.