Bug 1624731

Summary: Include smartmontools in oVirt Node
Product: [oVirt] ovirt-node Reporter: Hesham <hsahmed>
Component: RFEs Assignee: Yuval Turgeman <yturgema>
Status: CLOSED WONTFIX QA Contact: cshao <cshao>
Severity: medium Docs Contact:
Priority: unspecified    
Version: master CC: andreas.elvers+redhat.bugzilla, bugs, cshao, huzhao, jiaczhan, qiyuan, rbarry, sbonazzo, weiwang, yaniwang, ycui, yturgema, yzhao
Target Milestone: --- Keywords: FutureFeature
Target Release: --- Flags: rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1933245 Environment:
Last Closed: 2018-09-03 13:37:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Node RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1933245    

Description Hesham 2018-09-03 07:36:08 UTC
Description of problem:
smartmontools is required to monitor the health of SMART-capable hard disks. It should be included in the oVirt Node image to allow monitoring of the node's hard disk health.

Comment 1 Sandro Bonazzola 2018-09-03 13:37:44 UTC
Thanks for reporting, but I think the smartmontools package is not needed on oVirt Node for monitoring.
oVirt Node ships cockpit-storaged-172-2.el7.centos.noarch, which uses libatasmart-0.19-6.el7.x86_64 to access S.M.A.R.T. data and show it in the Cockpit UI.

We need to trade off the number of installed packages against image size, so unless there's a critical need for smartmontools I would prefer not to have it included in the ISO.

Note that if for some reason you need it to investigate a bad status reported by Cockpit, you can just install it after enabling the repository that ships it. Its installation will be persisted on upgrade.
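
For illustration only: once smartmontools has been installed manually as described above, a health check along these lines might be scripted around `smartctl -H`. This is a minimal sketch, not something shipped with oVirt Node. The helper names `parse_health` and `check_device` are hypothetical, `/dev/sda` is a placeholder device, and the parser assumes the usual health lines that smartctl prints for ATA and SCSI devices.

```python
import subprocess

# Health lines as commonly printed by `smartctl -H` (assumed formats).
ATA_PREFIX = "SMART overall-health self-assessment test result:"
SCSI_PREFIX = "SMART Health Status:"


def parse_health(output):
    """Return True if healthy, False if failing, None if no health line is found."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(ATA_PREFIX):
            return line[len(ATA_PREFIX):].strip() == "PASSED"
        if line.startswith(SCSI_PREFIX):
            return line[len(SCSI_PREFIX):].strip() == "OK"
    return None


def check_device(device):
    """Run `smartctl -H` on one device (hypothetical helper; usually needs root)."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_health(result.stdout)


# Example call on a node (placeholder device name):
#   check_device("/dev/sda")
```

A result of `None` means no health line was recognised, for example when the device does not support S.M.A.R.T. or sits behind a RAID controller that needs an extra `-d` option.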

Comment 2 Andreas Elvers 2019-05-22 21:25:56 UTC
I had a node failure because of a failing SSD. It was a Node NG with a RAID 1 made of two SSDs. Shouldn't the engine pick up and report those S.M.A.R.T. errors? I only noticed because of the errors printed on the server console.