Bug 1395617 - rhcert-listener doesn't start after reboot in KDUMP nfs test on a machine with 2 NICs
Keywords:
Status: NEW
Alias: None
Product: Red Hat Certification Program
Classification: Red Hat
Component: redhat-certification-hardware
Version: 1.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Nobody
QA Contact: rhcert qe
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-11-16 10:00 UTC by Rainer Koenig
Modified: 2022-09-07 04:19 UTC (History)

Description Rainer Koenig 2016-11-16 10:00:55 UTC
Description of problem:
Running the KDUMP nfs test from the web UI on a SUT that has 2 NICs.
After the crash is triggered, the machine reboots, but the web UI still
reports that it is waiting for a response.
Checking the rhcert-backend server status shows that the rhcert-listener is not running on port 8009.
Running "rhcert-backend server start" resolves the problem immediately.
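
The check described above can be sketched as a small script to run on the SUT after the reboot. It only tests whether anything accepts TCP connections on port 8009 (the port named in this report). The function name `listener_is_up` and the default host 127.0.0.1 are assumptions made for illustration; the listener may be bound to a specific interface address instead, in which case that address has to be passed in.

```python
import socket


def listener_is_up(host: str = "127.0.0.1", port: int = 8009,
                   timeout: float = 2.0) -> bool:
    """Return True if something accepts a TCP connection on host:port.

    The default host is an assumption; pass the SUT's interface address
    if the listener is not reachable via loopback.
    """
    try:
        # create_connection raises OSError (refused, timeout, unreachable)
        # when nothing is listening or the address cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if listener_is_up():
        print("listener reachable on port 8009")
    else:
        # The workaround from this report; the script does not run it itself.
        print('no listener on port 8009; try "rhcert-backend server start"')
```

On a machine with two NICs it may be worth calling the function once per address (here 192.168.2.138 and 192.168.2.168) to see whether the listener is missing entirely or only unreachable on one of them.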

Version-Release number of selected component (if applicable):
redhat-certification-hardware-4.1-20161019.el7.noarch

How reproducible:
On the machine with 2 NICs: always
On a machine with just one NIC: never

Steps to Reproduce:
1. Install RHEL 7.3 & redhat-certification-4.1... on SUT
2. Register the machine with the web UI
3. Run the kdump nfs test

Actual results:
Kdump gets triggered, the machine crashes and reboots, and the web UI keeps waiting for a response.

Expected results:
The machine should reboot and then the test should proceed, meaning the web UI reconnects to the SUT.

Additional info:
I wanted to find out whether this depends on the NIC that I use when registering the system. My machine has 2 IP addresses:
192.168.2.138 and 192.168.2.168. I tried the second one (.168), and the first attempt to register didn't produce any entry in the list on the web UI. On the second attempt my machine was listed, but with the first address (.138). I guess that this surprise is caused by the different metric values that I see when looking at the IP routes:

default via 192.168.2.1 dev enp0s25  proto static  metric 100 
default via 192.168.2.1 dev enp4s0  proto static  metric 101 
192.168.2.0/24 dev enp0s25  proto kernel  scope link  src 192.168.2.138  metric 100 
192.168.2.0/24 dev enp4s0  proto kernel  scope link  src 192.168.2.168  metric 101 
192.168.122.0/24 dev virbr0  proto kernel  scope link  src 192.168.122.1 
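
To illustrate the guess about metrics: when several routes have the same destination, Linux prefers the one with the lowest metric. The following is a minimal sketch that parses the route listing above and picks the preferred entry per destination. The helper names `parse_routes` and `preferred` are hypothetical, and the sketch only compares routes with an identical destination; it does not model longest-prefix matching or anything rhcert itself does.

```python
# Route listing copied from this report.
ROUTES = """\
default via 192.168.2.1 dev enp0s25  proto static  metric 100
default via 192.168.2.1 dev enp4s0  proto static  metric 101
192.168.2.0/24 dev enp0s25  proto kernel  scope link  src 192.168.2.138  metric 100
192.168.2.0/24 dev enp4s0  proto kernel  scope link  src 192.168.2.168  metric 101
192.168.122.0/24 dev virbr0  proto kernel  scope link  src 192.168.122.1
"""


def parse_routes(text):
    """Turn each route line into a dict: destination plus key/value pairs."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # A route printed without a metric has metric 0.
        route = {"dest": fields[0], "metric": 0}
        # After the destination, the fields come in pairs: via X dev Y ...
        for key, value in zip(fields[1::2], fields[2::2]):
            route[key] = int(value) if key == "metric" else value
        routes.append(route)
    return routes


def preferred(routes, dest):
    """Among routes to the same destination, return the lowest-metric one."""
    candidates = [r for r in routes if r["dest"] == dest]
    return min(candidates, key=lambda r: r["metric"]) if candidates else None


if __name__ == "__main__":
    routes = parse_routes(ROUTES)
    for dest in ("default", "192.168.2.0/24"):
        best = preferred(routes, dest)
        print(dest, "->", best["dev"], best.get("src", ""))
```

For both the default route and the 192.168.2.0/24 route the lowest metric belongs to enp0s25, whose source address is 192.168.2.138. That would be consistent with the machine showing up under .138 even when .168 was used for registration.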

A workaround for this problem is to run "rhcert-backend server start" so that the rhcert-listener gets started manually.

Comment 1 Rainer Koenig 2016-12-13 13:33:59 UTC
The problem also occurs in RHCert 4.2. It seems to happen every time the SUT has more than one NIC.

