Bug 1438096 - os-collect-config does not start after node hard reset due to xfs
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-collect-config
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: Upstream M2
Target Release: 12.0 (Pike)
Assigned To: Ben Nemec
QA Contact: Gurenko Alex
Keywords: Triaged
Depends On:
Blocks: 1441393 1442801
Reported: 2017-03-31 18:19 EDT by Lukas Bezdicka
Modified: 2018-02-05 14:07 EST
Fixed In Version: os-collect-config-7.0.1-0.20170612052603.5870ed6.el7ost
Clones: 1441393 1442801
Last Closed: 2017-12-13 16:22:29 EST
Type: Bug
External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Launchpad | 1678328 | None | None | None | 2017-03-31 18:45 EDT
OpenStack gerrit | 444898 | None | None | None | 2017-03-31 18:47 EDT
Red Hat Product Errata | RHEA-2017:3462 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 12.0 Enhancement Advisory | 2018-02-15 20:43:25 EST

Description Lukas Bezdicka 2017-03-31 18:19:09 EDT
Description of problem:
I noticed an issue with os-collect-config: when I snapshot my nested VM hypervisor, the nodes have to be restarted. The root cause is most likely XFS dropping its last transaction and reverting the filesystem to the point where os-collect-config was copying files around, which leaves those files truncated to 0 bytes. Alternatively, it is a pure XFS bug (we might want to investigate). Either way, os-collect-config fails to start afterwards.

Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: Traceback (most recent call last):
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/bin/os-collect-config", line 10, in <module>
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: sys.exit(__main__())
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/lib/python2.7/site-packages/os_collect_config/collect.py", line 262, in __main__
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: collector_kwargs_map=collector_kwargs_map)
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/lib/python2.7/site-packages/os_collect_config/collect.py", line 166, in collect_all
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: content = module.Collector(**collector_kwargs).collect()
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/lib/python2.7/site-packages/os_collect_config/ec2.py", line 71, in collect
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: metadata = json.load(f)
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/lib64/python2.7/json/__init__.py", line 290, in load
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: **kw)
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: return _default_decoder.decode(s)
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: obj, end = self.raw_decode(s, idx=_w(s, 0).end())
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: raise ValueError("No JSON object could be decoded")
Mar 28 13:28:28 overcloud-controller-0.localdomain os-collect-config[5224]: ValueError: No JSON object could be decoded
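
The traceback shows json.load raising ValueError inside the ec2 collector when it reads a cached metadata file that has no decodable content. The following is a minimal sketch of a loader that treats an empty or undecodable cache file as if it were absent, so the caller can fall back to fetching fresh data. It is written for Python 3, the name load_cached_metadata is hypothetical, and it is not the fix that was actually merged into os-collect-config.

```python
import json
import os


def load_cached_metadata(path):
    """Return the parsed JSON from path, or None if the cache is unusable.

    Hypothetical helper for illustration only.
    """
    try:
        # A zero-size file is what a truncated cache looks like after a hard reset.
        if os.path.getsize(path) == 0:
            return None
        with open(path) as f:
            return json.load(f)
    except (OSError, ValueError):
        # OSError: the file is missing or unreadable.
        # ValueError: the content is not valid JSON
        # (json.JSONDecodeError is a subclass of ValueError).
        return None
```

A caller would check for None and re-collect the metadata instead of letting the exception terminate the service.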


Steps to Reproduce:
1. openstack baremetal node list | grep -v UUID | awk '{print $2}' | grep -v '^$'| while read i; do openstack baremetal node power off $i ; done
2. openstack baremetal node list | grep -v UUID | awk '{print $2}' | grep -v '^$'| while read i; do openstack baremetal node power on $i ; done
3. for i in `nova list|awk '/ACTIVE/ {print $(NF-1)}' |awk -F"=" '{print $NF}'`; do echo $i; ssh -o StrictHostKeyChecking=no heat-admin@$i "sudo systemctl status os-collect-config "; done

Actual results:
os-collect-config fails to start

Expected results:
os-collect-config handles zero-size cache files and still starts.
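
The description suspects that cache files were left truncated to 0 bytes by the hard reset. Independently of how the reader copes with that, a writer can narrow the window in which a truncated file can exist by writing to a temporary file, syncing it to disk, and only then renaming it over the target. The sketch below assumes a POSIX system and Python 3; write_json_durably is a hypothetical name, and nothing here is taken from the os-collect-config source.

```python
import json
import os
import tempfile


def write_json_durably(path, data):
    """Write data as JSON to path via a synced temporary file and a rename.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            # Force the file contents to disk before the rename.
            os.fsync(tmp.fileno())
        # Atomically replace the target with the fully written file.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With this pattern a reader should find either the previous file or the complete new one after a crash, although the exact guarantees depend on the filesystem and its mount options.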
Comment 1 Jaromir Coufal 2017-04-07 00:01:16 EDT
Doc text is needed if this misses OSP 11.
Comment 4 Artem Hrechanychenko 2017-11-08 09:19:27 EST
VERIFIED
os-collect-config-7.2.0-1.el7ost.noarch


(undercloud) [stack@undercloud-0 ~]$ ironic node-list
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| 5b16bdf9-fcb6-485d-9c80-90d3a0aa9216 | compute-0    | 95889ef3-c0b3-477a-9c72-e08e41e30cc3 | power on    | active             | False       |
| a1b0248b-2606-41a2-820f-09bcdfe06ccd | controller-3 | 2252bf28-4188-44c8-9dde-066fca8c8e2d | power on    | active             | False       |
| d1b46f07-1d11-4b03-b0a3-71390ea2678c | controller-2 | None                                 | power on    | available          | False       |
| aaddb57c-b260-40e2-b8d7-f772c52151f1 | controller-1 | 5727599a-4310-42b9-a7b1-228804f4aa52 | power on    | active             | False       |
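| 6d056b86-a94c-4bb7-9df8-3ac5274a67ae | controller-0 | 55cd6a06-43f9-4fa0-a7a1-4789599ea354 | power on    | active             | False       |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
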
(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list | grep -v UUID | awk '{print $2}' | grep -v '^$'| while read i; do openstack baremetal node power off $i ; done
(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list | grep -v UUID | awk '{print $2}' | grep -v '^$'| while read i; do openstack baremetal node power on $i ; done
(undercloud) [stack@undercloud-0 ~]$ for i in `nova list|awk '/ACTIVE/ {print $(NF-1)}' |awk -F"=" '{print $NF}'`; do echo $i; ssh -o StrictHostKeyChecking=no heat-admin@$i "sudo systemctl status os-collect-config "; done
192.168.24.9
Warning: Permanently added '192.168.24.9' (ECDSA) to the list of known hosts.
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-11-08 14:16:25 UTC; 55s ago
 Main PID: 2183 (os-collect-conf)
   Memory: 6.4M
   CGroup: /system.slice/os-collect-config.service
           └─2183 /usr/bin/python /usr/bin/os-collect-config

Nov 08 14:16:50 compute-0 os-collect-config[2183]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:16:50 compute-0 os-collect-config[2183]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:16:52 compute-0 os-collect-config[2183]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:16:52 compute-0 os-collect-config[2183]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:16:56 compute-0 os-collect-config[2183]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:16:56 compute-0 os-collect-config[2183]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:04 compute-0 os-collect-config[2183]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:04 compute-0 os-collect-config[2183]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:20 compute-0 os-collect-config[2183]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:20 compute-0 os-collect-config[2183]: No local metadata found (['/var/lib/os-collect-config/local-data'])
192.168.24.10
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-11-08 14:16:43 UTC; 49s ago
 Main PID: 2681 (os-collect-conf)
   Memory: 8.1M
   CGroup: /system.slice/os-collect-config.service
           └─2681 /usr/bin/python /usr/bin/os-collect-config

Nov 08 14:17:04 controller-0 os-collect-config[2681]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:04 controller-0 os-collect-config[2681]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:05 controller-0 os-collect-config[2681]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:05 controller-0 os-collect-config[2681]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:08 controller-0 os-collect-config[2681]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:08 controller-0 os-collect-config[2681]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:12 controller-0 os-collect-config[2681]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:12 controller-0 os-collect-config[2681]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:20 controller-0 os-collect-config[2681]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:20 controller-0 os-collect-config[2681]: No local metadata found (['/var/lib/os-collect-config/local-data'])
192.168.24.14
Warning: Permanently added '192.168.24.14' (ECDSA) to the list of known hosts.
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-11-08 14:16:34 UTC; 57s ago
 Main PID: 2615 (os-collect-conf)
   Memory: 1.8M
   CGroup: /system.slice/os-collect-config.service
           └─2615 /usr/bin/python /usr/bin/os-collect-config

Nov 08 14:16:45 controller-1 os-collect-config[2615]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:16:45 controller-1 os-collect-config[2615]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:16:47 controller-1 os-collect-config[2615]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:16:47 controller-1 os-collect-config[2615]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:16:52 controller-1 os-collect-config[2615]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:16:52 controller-1 os-collect-config[2615]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:00 controller-1 os-collect-config[2615]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:00 controller-1 os-collect-config[2615]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:16 controller-1 os-collect-config[2615]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:16 controller-1 os-collect-config[2615]: No local metadata found (['/var/lib/os-collect-config/local-data'])
192.168.24.6
Warning: Permanently added '192.168.24.6' (ECDSA) to the list of known hosts.
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-11-08 14:16:42 UTC; 51s ago
 Main PID: 2680 (os-collect-conf)
   Memory: 332.0K
   CGroup: /system.slice/os-collect-config.service
           └─2680 /usr/bin/python /usr/bin/os-collect-config

Nov 08 14:16:58 controller-2 os-collect-config[2680]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:16:58 controller-2 os-collect-config[2680]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:00 controller-2 os-collect-config[2680]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:00 controller-2 os-collect-config[2680]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:04 controller-2 os-collect-config[2680]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:04 controller-2 os-collect-config[2680]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:12 controller-2 os-collect-config[2680]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:12 controller-2 os-collect-config[2680]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Nov 08 14:17:28 controller-2 os-collect-config[2680]: /var/lib/os-collect-config/local-data not found. Skipping
Nov 08 14:17:28 controller-2 os-collect-config[2680]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Comment 7 errata-xmlrpc 2017-12-13 16:22:29 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462
