Bug 1121145
| Summary: | vdsm does not report stats for vlan devices due to a sloppy backport | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Martin Pavlik <mpavlik> | ||||||
| Component: | vdsm | Assignee: | Dan Kenigsberg <danken> | ||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Martin Pavlik <mpavlik> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 3.4.0 | CC: | bazulay, danken, ecohen, gklein, iheim, lpeer, lvernia, mburman, mpavlik, nyechiel, yeylon | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | 3.4.1 | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | network | ||||||||
| Fixed In Version: | vdsm-4.14.11-5.el6ev.x86_64 | Doc Type: | Bug Fix | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2014-08-05 08:25:38 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1117634 | ||||||||
| Attachments: |
Description
Martin Pavlik
2014-07-18 13:33:40 UTC
Description of problem:
After a self-hosted engine deployment over a VLAN interface of a physical host, the host is marked as Non-Operational for the following reason:
Host hosted_engine_1 moved to Non-Operational state because interfaces 'em1.172' are down but are needed by networks 'rhevm' in the current cluster
It seems that the engine incorrectly detects the state of the host VLAN interface em1.172, which is in fact UP:
[root@dell-r210ii-07 ~]# ip a l em1.172
16: em1.172@em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether d0:67:e5:f0:82:44 brd ff:ff:ff:ff:ff:ff
inet6 fe80::d267:e5ff:fef0:8244/64 scope link
valid_lft forever preferred_lft forever
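The link state shown by `ip` above can also be cross-checked from sysfs, where the kernel exposes it as `/sys/class/net/<dev>/operstate`. The following is a minimal sketch of such a check; the function name `read_operstate` and its `sysfs_root` parameter are hypothetical and are not part of vdsm or the engine.

```python
import os


def read_operstate(dev, sysfs_root="/sys/class/net"):
    """Return the kernel-reported operational state of a network device.

    Reads <sysfs_root>/<dev>/operstate, which holds values such as
    'up', 'down' or 'unknown'. Returns None if the device is absent.
    """
    path = os.path.join(sysfs_root, dev, "operstate")
    try:
        with open(path) as f:
            return f.read().strip()
    except IOError:  # IOError is an alias of OSError on Python 3
        return None


if __name__ == "__main__":
    # 'em1.172' is the VLAN device from this report.
    print(read_operstate("em1.172"))
```

If this reports the device as up while the engine shows it as down, the discrepancy lies in how the state is collected or reported rather than in the link itself.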
Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Manager Version: 3.4.1-0.28.el6ev
ovirt-hosted-engine-setup-1.1.3-2.el6ev.noarch
Steps to Reproduce:
1. configure VLAN on host
cat > ifcfg-em1 << EOF
DEVICE=em1
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no
EOF
cat > ifcfg-em1.172 << EOF
DEVICE=em1.172
VLAN=yes
BOOTPROTO=dhcp
ONBOOT=yes
NM_CONTROLLED=no
EOF
2. yum install ovirt-hosted-engine-setup
3. screen hosted-engine --deploy
4. and choose interface em1.172 for deployment
5. finish self hosted engine deployment
6. log in webadmin
Actual results:
incorrect detection of the physical host's VLAN interface state
Expected results:
correct detection of the physical host's VLAN interface state
Created attachment 919094 [details]
screenshot1
Created attachment 919096 [details]
logs
What does "vdsClient -s 0 getVdsStats" print?
[root@dell-r210ii-07 ~]# vdsClient -s 0 getVdsStats
cpuIdle = '100.00'
cpuSys = '0.00'
cpuSysVdsmd = '0.00'
cpuUser = '0.00'
cpuUserVdsmd = '0.00'
dateTime = '2014-07-21T06:31:06 GMT'
elapsedTime = '237389'
generationID = '1d7a067c-3927-4d2c-9fe7-3a6ee79c7dc1'
haScore = 2400
haStats = {'active': True,
'configured': True,
'globalMaintenance': False,
'localMaintenance': False,
'score': 2400}
ksmCpu = 2
ksmPages = 64
ksmState = True
memAvailable = 3030
memCommitted = 4161
memFree = 4544
memShared = 538
momStatus = 'active'
netConfigDirty = 'True'
rxRate = '0.00'
statsAge = '237389.63'
storageDomains = {'05910f18-1eba-42ac-b6d1-3fcd59e81904': {'acquired': True,
'code': 0,
'delay': '0.000371408',
'lastCheck': '9.2',
'valid': True,
'version': 3}}
swapFree = 2047
swapTotal = 2047
txRate = '0.00'
vmActive = 1
vmCount = 1
vmMigrating = 0
Interesting, there seems to be no interface-related data. Dan, any idea why that could be?
I see a couple of errors in the vdsm and supervdsm logs; I am not sure which of them are pertinent (one is specifically about stat-fetching).
There's no info regarding networking AT ALL.
Thread-12::ERROR::2014-07-18 14:10:46,387::sampling::438::vds::(run) Error while sampling stats
Traceback (most recent call last):
File "/usr/share/vdsm/sampling.py", line 420, in run
sample = self.sample()
File "/usr/share/vdsm/sampling.py", line 410, in sample
hs = HostSample(self._pid)
File "/usr/share/vdsm/sampling.py", line 186, in __init__
(link.name, InterfaceSample(link)) for link in getLinks())
File "/usr/share/vdsm/sampling.py", line 186, in <genexpr>
(link.name, InterfaceSample(link)) for link in getLinks())
File "/usr/share/vdsm/sampling.py", line 109, in __init__
self.speed = _getLinkSpeed(link)
File "/usr/share/vdsm/sampling.py", line 547, in _getLinkSpeed
speed = netinfo.vlanSpeed(dev.name)
AttributeError: 'module' object has no attribute 'vlanSpeed'
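The traceback also suggests why every interface disappears from the stats, not only the VLAN device: the per-interface samples are built in a single generator expression, so one failing speed lookup aborts the whole host sample. The sketch below reproduces that shape with stand-in names. The `netinfo` object, `get_link_speed`, `sample_all` and `sample_tolerant` are all hypothetical, the dot-in-name test for VLAN devices is an assumption made for the example, and the tolerant variant is only an illustration, not what vdsm does.

```python
import types

# Stand-in for a netinfo module that lacks vlanSpeed (hypothetical).
netinfo = types.ModuleType("netinfo")


def get_link_speed(name):
    # Assumption for this sketch: a dot in the name marks a VLAN device,
    # e.g. 'em1.172'.
    if "." in name:
        return netinfo.vlanSpeed(name)  # raises AttributeError
    return 1000


def sample_all(names):
    # Mirrors the dict-from-generator shape in the traceback:
    # one failing device aborts the sample for every device.
    return dict((name, get_link_speed(name)) for name in names)


def sample_tolerant(names):
    # Illustrative alternative: record 0 when a speed lookup fails,
    # so the remaining devices are still reported.
    result = {}
    for name in names:
        try:
            result[name] = get_link_speed(name)
        except AttributeError:
            result[name] = 0
    return result
```

Calling `sample_all(["em1", "em1.172"])` raises `AttributeError` and returns nothing, while `sample_tolerant` on the same input still yields an entry for `em1`.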
I suspect that vdsm and vdsm-python have mismatched versions, despite an explicit requirement. What is `rpm -qa vdsm*` on the host?
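Besides comparing package versions, the mismatch can be probed directly by asking whether the installed module exposes the name the traceback complains about. This is a small sketch under stated assumptions: `module_exposes` is a hypothetical helper, the module name `vdsm.netinfo` is an assumption about how the module is imported on the host, and the code targets a current Python (the host's Python 2.6 does not ship `importlib`).

```python
import importlib


def module_exposes(module_name, attr):
    """Return True if the module can be imported and has the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # 'vdsm.netinfo' is an assumed module name; 'vlanSpeed' is the
    # attribute named in the AttributeError above.
    print(module_exposes("vdsm.netinfo", "vlanSpeed"))
```

A `False` result on a host where `vdsm.netinfo` imports correctly would mean that the calling code and the library disagree about the available functions.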
[root@dell-r210ii-07 ~]# rpm -qa vdsm*
vdsm-4.14.7-7.el6ev.x86_64
vdsm-python-zombiereaper-4.14.7-7.el6ev.noarch
vdsm-python-4.14.7-7.el6ev.x86_64
vdsm-cli-4.14.7-7.el6ev.noarch
vdsm-xmlrpc-4.14.7-7.el6ev.noarch
And what's `rpm -V vdsm-python`? Can you see vlanSpeed in /usr/lib64/python2.6/site-packages/vdsm/netinfo.py ?
*** This bug has been marked as a duplicate of bug 1121643 ***
This is a 3.4.1-only glitch, and it was solved by reverting the offending patch.
*** Bug 1122483 has been marked as a duplicate of this bug. ***
verified
[root@dell-r210ii-07 ~]# rpm -q vdsm
vdsm-4.14.11-5.el6ev.x86_64
[root@dell-r210ii-07 ~]# vdsClient -s 0 getVdsStats
.
.
.
'em2': {'name': 'em2',
'rxDropped': '0',
'rxErrors': '0',
'rxRate': '0.0',
'speed': '1000',
'state': 'up',
'txDropped': '0',
'txErrors': '0',
'txRate': '0.0'},
'em2.162': {'name': 'em2.162',
'rxDropped': '0',
'rxErrors': '0',
'rxRate': '0.0',
'speed': '1000',
'state': 'up',
'txDropped': '0',
'txErrors': '0',
'txRate': '0.0'},
.
.
.
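A verification like the one above can be scripted once the per-interface stats are available as a dictionary. The following is a minimal sketch: `interfaces_not_up` is a hypothetical helper, the sample data is abbreviated from the output in this report, and how the dictionary is obtained from vdsm is left out because the surrounding output is elided here.

```python
def interfaces_not_up(network_stats):
    """Return the sorted names of interfaces whose 'state' is not 'up'."""
    return sorted(
        name for name, stats in network_stats.items()
        if stats.get("state") != "up"
    )


# Entries abbreviated from the verification output in this report.
sample = {
    "em2": {"name": "em2", "speed": "1000", "state": "up"},
    "em2.162": {"name": "em2.162", "speed": "1000", "state": "up"},
}

if __name__ == "__main__":
    print(interfaces_not_up(sample))
```

An interface that is missing from the dictionary altogether, as in the original failure, would not be caught by this check and needs a separate test for the presence of the expected device names.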
All these bugs were fixed and released as part of 3.4.1, and weren't closed since they weren't included in any errata. Closing as current release.