Bug 1035604

Summary: virt-who won't start if pidfile exists but the process is no longer running
Product: Red Hat Enterprise Linux 7 Reporter: Liushihui <shihliu>
Component: virt-who Assignee: Radek Novacek <rnovacek>
Status: CLOSED CURRENTRELEASE QA Contact: gaoshang <sgao>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.0 CC: liliu, ovasik, qianzhan, sgao
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: virt-who-0.8-10.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 10:56:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Liushihui 2013-11-28 07:54:54 UTC
Description of problem:
After virt-who is started from the command line in a terminal, systemctl reports its status as "inactive".

Version-Release number of selected component (if applicable):
virt-who-0.8-8.el7.noarch
python-rhsm-1.10.6-1.el7.x86_64
subscription-manager-1.10.6-1.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Start virt-who from the command line in a terminal
[root@dhcp-65-83 /]# virt-who --esx --esx-owner=ACME_Corporation --esx-env=Library --esx-server=10.66.79.63 --esx-username=Administrator --esx-password=qwer1234P! -b -d

2. Check the log in /var/log/rhsm/rhsm.log; virt-who has been started.
2013-11-28 11:06:39,776 [WARNING]  @virt-who.py:462 - Listening for events is not available in VDSM, ESX, RHEV-M or Hyper-V mode
2013-11-28 11:06:44,959 [DEBUG]  @virt-who.py:475 - Virt-who is running in esx mode
2013-11-28 11:06:44,960 [DEBUG]  @virt-who.py:482 - Starting infinite loop with 3600 seconds interval and event handling
2013-11-28 11:06:45,791 [DEBUG]  @subscriptionmanager.py:89 - Sending update in hosts-to-guests mapping: {44454c4c-4c00-1031-8053-b8c04f4e3258: [42264b6e-e140-4354-a8e8-9efff7b99e06], 44454c4c-4200-1034-8039-b8c04f503258: [564db05b-769f-69ac-642a-8b7916f7be7a, 42268a03-b68f-52e6-805f-9126c3b006dd, 4226082e-2746-17dc-6f01-dc91c6fc7fbb, 4226a1bf-d361-fd50-9d0c-50b5b56c982a]}

3. Check the virt-who status with systemctl
[root@dhcp-65-83 run]# systemctl status virt-who
virt-who.service - Daemon for reporting virtual guest IDs to subscription-manager
   Loaded: loaded (/usr/lib/systemd/system/virt-who.service; disabled)
   Active: inactive (dead)
Nov 28 11:06:04 dhcp-65-83.nay.redhat.com systemd[1]: Starting Daemon for reporting virtual guest IDs to subscription-manager...
Nov 28 11:06:04 dhcp-65-83.nay.redhat.com systemd[1]: PID 7722 read from file /var/run/virt-who.pid does not exist.
Nov 28 11:06:04 dhcp-65-83.nay.redhat.com python[7728]: Listening for events is not available in VDSM, ESX, RHEV-M or Hyper-V mode
Nov 28 11:06:04 dhcp-65-83.nay.redhat.com systemd[1]: Started Daemon for reporting virtual guest IDs to subscription-manager.
Nov 28 11:06:09 dhcp-65-83.nay.redhat.com python[7728]: Virt-who is running in esx mode
Nov 28 11:06:09 dhcp-65-83.nay.redhat.com python[7728]: Starting infinite loop with 10 seconds interval and event handling
Nov 28 11:06:10 dhcp-65-83.nay.redhat.com python[7728]: Sending update in hosts-to-guests mapping: {44454c4c-4c00-1031-8053-b8c04f4e3258:...
Nov 28 11:06:22 dhcp-65-83.nay.redhat.com systemd[1]: Stopping Daemon for reporting virtual guest IDs to subscription-manager...
Nov 28 11:06:22 dhcp-65-83.nay.redhat.com systemd[1]: Stopped Daemon for reporting virtual guest IDs to subscription-manager.
Hint: Some lines were ellipsized, use -l to show in full.

4. Check the virt-who process
[root@dhcp-65-83 /]# ps -ef|grep virt-who
root      1476     1 37 18:06 ?        00:00:05 /usr/bin/python /usr/share/virt-who/virt-who.py --esx --esx-owner=ACME_Corporation --esx-env=Library --esx-server=10.66.79.63 --esx-username=Administrator --esx-password=qwer1234P! -b -d
root      1489 10632  0 18:06 pts/1    00:00:00 grep --color=auto virt-who

5. Kill the virt-who process and restart virt-who with systemctl
[root@dhcp-65-83 /]# kill -9 1476
[root@dhcp-65-83 /]# ps -ef|grep virt-who
root      1492 10632  0 18:06 pts/1    00:00:00 grep --color=auto virt-who
[root@dhcp-65-83 run]# systemctl start virt-who
Job for virt-who.service failed. See 'systemctl status virt-who.service' and 'journalctl -xn' for details.
[root@dhcp-65-83 run]# systemctl status virt-who
virt-who.service - Daemon for reporting virtual guest IDs to subscription-manager
   Loaded: loaded (/usr/lib/systemd/system/virt-who.service; disabled)
   Active: failed (Result: exit-code) since Thu 2013-11-28 14:27:21 CST; 6s ago
  Process: 8920 ExecStart=/usr/bin/virt-who (code=exited, status=1/FAILURE)
Nov 28 14:27:21 dhcp-65-83.nay.redhat.com systemd[1]: Starting Daemon for reporting virtual guest IDs to subscription-manager...
Nov 28 14:27:21 dhcp-65-83.nay.redhat.com virt-who[8920]: virt-who seems to be already running. If not, remove /var/run/virt-who.pid
Nov 28 14:27:21 dhcp-65-83.nay.redhat.com systemd[1]: virt-who.service: control process exited, code=exited status=1
Nov 28 14:27:21 dhcp-65-83.nay.redhat.com systemd[1]: Failed to start Daemon for reporting virtual guest IDs to subscription-manager.
Nov 28 14:27:21 dhcp-65-83.nay.redhat.com systemd[1]: Unit virt-who.service entered failed state.

Actual results:
After step 2: Although virt-who has been started, systemctl still shows its status as "inactive (dead)".
After step 5: Although virt-who is not running, it still cannot be started with systemctl.

Expected results:
Once virt-who has been started or stopped, systemctl can monitor its status.

Additional info:

Comment 1 Radek Novacek 2013-11-28 08:31:12 UTC
This is kind of expected behaviour. When virt-who is run from the command line and not via systemctl, systemd doesn't know about it and reports it as not running.

Then you killed virt-who via kill -9, so it didn't have a chance to clean up the PID file; therefore it can't start, even through systemd. Please try removing /var/run/virt-who.pid manually and then starting virt-who via systemctl.

On the other hand, virt-who should check whether the process stated in the PID file exists and remove the file if it does not.

Comment 3 Radek Novacek 2013-12-04 09:31:08 UTC
I'll use this bug to add a check to virt-who that the process named in the pidfile exists.

Virt-who will check whether the process referred to by the pidfile exists. If not, virt-who will delete the file and start normally.
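
For illustration only, the check described above could look roughly like the following. This is a minimal sketch and not the code that went into virt-who-0.8-10.el7; the function names pid_is_running and claim_pidfile are made up for this example, and only the pidfile path comes from the log messages in this bug. It assumes a POSIX system, where sending signal 0 with os.kill tests whether a PID exists without affecting the process.

```python
import errno
import os

# Path as reported in the systemd log above.
PIDFILE = "/var/run/virt-who.pid"


def pid_is_running(pid):
    """Return True if a process with the given PID currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only
    except OSError as e:
        # EPERM: the process exists but belongs to another user.
        # ESRCH (or anything else): treat as not running.
        return e.errno == errno.EPERM
    return True


def claim_pidfile(path=PIDFILE):
    """Write our PID to `path` unless a live process already owns it.

    Returns False if the pidfile names a running process, True otherwise.
    A missing, malformed or stale pidfile is simply overwritten.
    """
    try:
        with open(path) as f:
            old_pid = int(f.read().strip())
    except (IOError, OSError, ValueError):
        old_pid = None  # no pidfile, unreadable, or not a number

    if old_pid is not None and pid_is_running(old_pid):
        return False

    with open(path, "w") as f:
        f.write("%d\n" % os.getpid())
    return True
```

A caller would exit with the "already running" message only when claim_pidfile() returns False. Note that this sketch is not atomic (two instances starting at the same moment could both pass the check), and a recycled PID belonging to an unrelated process would be mistaken for a running virt-who.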

Comment 4 Radek Novacek 2013-12-12 08:36:54 UTC
Fixed in virt-who-0.8-10.el7

Comment 6 Liushihui 2013-12-20 10:14:40 UTC
Verified on Rhel7.0-20131219.0 (virt-who-0.8-10.el7)

Steps to verify:
1. Start virt-who from the command line in a terminal
[root@dhcp-65-83 /]# virt-who -b -d

2. Check the log in /var/log/rhsm/rhsm.log; virt-who has been started.
2013-12-20 18:06:28,762 [DEBUG]  @virt-who.py:512 - Starting event loop
2013-12-20 18:06:28,943 [DEBUG]  @virt-who.py:528 - Virt-who is running in libvirt mode
2013-12-20 18:06:28,943 [DEBUG]  @virt-who.py:535 - Starting infinite loop with 3600 seconds interval and event handling
2013-12-20 18:06:28,944 [DEBUG]  @virt.py:56 - Virtual machine found: rhel7.0: d68be3e7-956d-429a-a2a6-745e70ccafb9
2013-12-20 18:06:28,945 [DEBUG]  @subscriptionmanager.py:81 - Sending list of uuids: ['d68be3e7-956d-429a-a2a6-745e70ccafb9']

3. Check the virt-who status with systemctl
[root@hp-z220-04 ~]# systemctl status virt-who
virt-who.service - Daemon for reporting virtual guest IDs to subscription-manager
   Loaded: loaded (/usr/lib/systemd/system/virt-who.service; disabled)
   Active: inactive (dead)
Dec 20 18:05:56 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Virtual machine found: rhel7.0: d68be3e7-956d-429a-a2a6-745e70ccafb9
Dec 20 18:05:56 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Sending list of uuids: ['d68be3e7-956d-429a-a2a6-745e70ccafb9']
Dec 20 18:06:02 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Virtual machine found: rhel7.0: d68be3e7-956d-429a-a2a6-745e70ccafb9
Dec 20 18:06:02 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Sending list of uuids: ['d68be3e7-956d-429a-a2a6-745e70ccafb9']
Dec 20 18:06:07 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Virtual machine found: rhel7.0: d68be3e7-956d-429a-a2a6-745e70ccafb9
Dec 20 18:06:07 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Sending list of uuids: ['d68be3e7-956d-429a-a2a6-745e70ccafb9']
Dec 20 18:06:13 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Virtual machine found: rhel7.0: d68be3e7-956d-429a-a2a6-745e70ccafb9
Dec 20 18:06:13 hp-z220-04.qe.lab.eng.nay.redhat.com python[14867]: Sending list of uuids: ['d68be3e7-956d-429a-a2a6-745e70ccafb9']
Dec 20 18:06:16 hp-z220-04.qe.lab.eng.nay.redhat.com systemd[1]: Stopping Daemon for reporting virtual guest IDs to subscription-manager...
Dec 20 18:06:16 hp-z220-04.qe.lab.eng.nay.redhat.com systemd[1]: Stopped Daemon for reporting virtual guest IDs to subscription-manager.
Hint: Some lines were ellipsized, use -l to show in full.

4. Check the virt-who process
[root@hp-z220-04 ~]#  ps -ef|grep virt-who
root     19279     1  0 18:06 ?        00:00:00 /usr/bin/python /usr/share/virt-who/virt-who.py -b -d
root     19344 13044  0 18:07 pts/2    00:00:00 grep --color=auto virt-who

5. Check virt-who.pid under /var/run

6. Kill the virt-who process and restart virt-who with systemctl
[root@dhcp-65-83 /]# kill -9 19279
[root@dhcp-65-83 /]# ps -ef|grep virt-who
root     19344 13044  0 18:07 pts/2    00:00:00 grep --color=auto virt-who
[root@dhcp-65-83 run]# systemctl start virt-who

7. Check the virt-who status; virt-who runs normally
[root@hp-z220-04 run]# systemctl status virt-who
virt-who.service - Daemon for reporting virtual guest IDs to subscription-manager
   Loaded: loaded (/usr/lib/systemd/system/virt-who.service; disabled)
   Active: active (running) since Fri 2013-12-20 18:08:10 CST; 23s ago
  Process: 19420 ExecStart=/usr/bin/virt-who -b (code=exited, status=0/SUCCESS)
 Main PID: 19426 (python)
   CGroup: /system.slice/virt-who.service
           └─19426 /usr/bin/python /usr/share/virt-who/virt-who.py -b
Assertion 'n_pids > 0' failed at src/shared/cgroup-show.c:47, function show_pid_array(). Aborting.
Aborted (core dumped)

Comment 7 Ludek Smid 2014-06-13 10:56:03 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.