Bug 1562787
Summary: | Node 0 SHE deployment fails over tagged interface. | |
---|---|---|---
Product: | [oVirt] ovirt-hosted-engine-setup | Reporter: | Nikolai Sednev <nsednev>
Component: | General | Assignee: | Ido Rosenzwig <irosenzw>
Status: | CLOSED WORKSFORME | QA Contact: | Meni Yakove <myakove>
Severity: | urgent | Docs Contact: |
Priority: | unspecified | |
Version: | 2.2.14 | CC: | bugs, danken, nsednev, ratamir
Target Milestone: | --- | Flags: | ratamir: blocker?, nsednev: planning_ack?, nsednev: devel_ack?, nsednev: testing_ack?
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-04-03 15:55:16 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: |
|
Created attachment 1416269 [details]
sosreport from the engine
Which ovirt-engine was tested here? Do you have a clue why it failed? Why did you place it on the Network team?

(In reply to Dan Kenigsberg from comment #2)
> Which ovirt-engine was tested here?
Engine inside the appliance: ovirt-engine-setup-base-4.2.2.6-0.1.el7.noarch
> Do you have a clue why it failed?
Probably a network-related issue, as it happens only when deploying over a tagged interface and not over an untagged one. I tried the same over NFS using tagged and untagged interfaces; the untagged deployment passed just fine.
> Why did you place it on the Network team?
Please see my previous answer.

Nikolai,
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
suggests where things broke. Can you attach the ansible logs? Can you find out which ansible task failed?

AFAIK it should be ovirt-hosted-engine-setup-ansible-initial_clean-20180402155647-74b5jx.log, which is inside the /var/log/ovirt-hosted-engine-setup directory.

In the vdsm log I clearly see that there was a problem establishing SPM and connectivity with the storage:
2018-04-02 16:38:13,834+0300 INFO (jsonrpc/7) [vdsm.api] FINISH getAllTasksInfo error=Not SPM: () from=::1,47774, task_id=b14decc1-564a-4b4d-b39d-388e9410095b (api:50)
2018-04-02 16:38:13,906+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='b14decc1-564a-4b4d-b39d-388e9410095b') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getAllTasksInfo
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2218, in getAllTasksInfo
    raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
2018-04-02 16:38:13,906+0300 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='b14decc1-564a-4b4d-b39d-388e9410095b') aborting: Task is aborted: 'Not SPM: ()' - code 654 (task:1181)
2018-04-02 16:38:13,907+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH getAllTasksInfo error=Not SPM: () (dispatcher:82)
2018-04-02 16:38:13,907+0300 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllTasksInfo failed (error 654) in 0.08 seconds (__init__:573)
2018-04-02 16:38:13,948+0300 INFO (jsonrpc/0) [vdsm.api] START getAllTasksStatuses(spUUID=None, options=None) from=::1,47774, task_id=44b90186-a560-436d-a709-6b656b651beb (api:46)
2018-04-02 16:38:13,948+0300 INFO (jsonrpc/0) [vdsm.api] FINISH getAllTasksStatuses error=Not SPM: () from=::1,47774, task_id=44b90186-a560-436d-a709-6b656b651beb (api:50)
2018-04-02 16:38:13,948+0300 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='44b90186-a560-436d-a709-6b656b651beb') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getAllTasksStatuses
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2178, in getAllTasksStatuses
    raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
2018-04-02 16:38:13,949+0300 INFO (jsonrpc/0) [storage.TaskManager.Task] (Task='44b90186-a560-436d-a709-6b656b651beb') aborting: Task is aborted: 'Not SPM: ()' - code 654 (task:1181)
2018-04-02 16:38:13,949+0300 ERROR (jsonrpc/0) [storage.Dispatcher] FINISH getAllTasksStatuses error=Not SPM: () (dispatcher:82)
2018-04-02 16:38:13,949+0300 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.getAllTasksStatuses failed (error 654) in 0.00 seconds (__init__:573)

I'm still investigating the issue; it looks like some network disorder happened during deployment over the tagged VLAN. Checking the infrastructure...

Reproduced again. The network infrastructure is not involved in storage access here, since the host is connected to the SAN over direct FC links.
[ INFO ] changed: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Obtain SSO token using username/password credentials]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Check for the local bootstrap VM]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Make the engine aware that the external VM is stopped]
[ INFO ] TASK [Wait for the local bootstrap VM to be down at engine eyes]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_vms": [{"affinity_labels": [], "applications": [], "bios": {"boot_menu": {"enabled": false}}, "cdroms": [], "cluster": {"href": "/ovirt-engine/api/clusters/8ad5d0f2-374c-11e8-b5a6-00163eeeeee1", "id": "8ad5d0f2-374c-11e8-b5a6-00163eeeeee1"}, "cpu": {"architecture": "x86_64", "topology": {"cores": 1, "sockets": 4, "threads": 1}}, "cpu_profile": {"href": "/ovirt-engine/api/cpuprofiles/58ca604e-01a7-003f-01de-000000000250", "id": "58ca604e-01a7-003f-01de-000000000250"}, "cpu_shares": 0, "creation_time": "2018-04-03 17:41:22.135000+03:00", "delete_protected": false, "disk_attachments": [], "display": {"address": "127.0.0.1", "allow_override": false, "copy_paste_enabled": true, "disconnect_action": "LOCK_SCREEN", "file_transfer_enabled": true, "monitors": 1, "port": 5900, "single_qxl_pci": false, "smartcard_enabled": false, "type": "vnc"}, "graphics_consoles": [], "high_availability": {"enabled": false, "priority": 0}, "host": {"href": "/ovirt-engine/api/hosts/ac45012e-8613-489a-8e91-be32743125b6", "id": "ac45012e-8613-489a-8e91-be32743125b6"}, "host_devices": [], "href": "/ovirt-engine/api/vms/b9cd99f2-f703-4dd9-94b0-b5aa29eecbc1", "id": "b9cd99f2-f703-4dd9-94b0-b5aa29eecbc1", "io": {"threads": 0}, "katello_errata": [], "large_icon": {"href": "/ovirt-engine/api/icons/21b0241c-e1eb-c9e8-42ae-7e01aca5ea1d", "id": "21b0241c-e1eb-c9e8-42ae-7e01aca5ea1d"}, "memory": 17179869184, "memory_policy": {"guaranteed": 17179869184, "max": 17179869184}, "migration": {"auto_converge": "inherit", "compressed": "inherit"}, "migration_downtime": -1, "name": "external-HostedEngineLocal", "next_run_configuration_exists": false, "nics": [], "numa_nodes": [], "numa_tune_mode": "interleave", "origin": "external", "original_template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "os": {"boot": {"devices": ["hd"]}, "type": "other"}, "permissions": [], "placement_policy": {"affinity": "migratable"}, "quota": {"id": "9cd0b006-374c-11e8-b0f0-00163eeeeee1"}, "reported_devices": [], "run_once": false, "sessions": [], "small_icon": {"href": "/ovirt-engine/api/icons/8f625bad-06e9-b023-40bd-bbef504d58a1", "id": "8f625bad-06e9-b023-40bd-bbef504d58a1"}, "snapshots": [], "sso": {"methods": [{"id": "guest_agent"}]}, "start_paused": false, "stateless": false, "statistics": [], "status": "unknown", "storage_error_resume_behaviour": "auto_resume", "tags": [], "template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "time_zone": {"name": "Etc/GMT"}, "type": "desktop", "usb": {"enabled": false}, "watchdogs": []}]}, "attempts": 24, "changed": false}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] changed: [localhost]
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180403175704.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180403172943-sse552.log

Environmental issue: DNS had not been properly updated, so it pointed the engine to the IP address of a host interface in an untagged VLAN. That VLAN was not used during deployment and is not reachable from the tagged VLAN from which the deployment was initiated. Moving to closed. |
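The root cause described above (a DNS record resolving to an address outside the VLAN used for deployment) can be checked before running the deployment. The following is only a minimal sketch, not part of ovirt-hosted-engine-setup: the helper names are hypothetical, and the FQDN and the subnet of the tagged VLAN are assumptions that the operator has to supply.

```python
import ipaddress
import socket
import sys


def resolved_addresses(fqdn):
    """Return every address the local resolver reports for fqdn."""
    infos = socket.getaddrinfo(fqdn, None)
    # The sockaddr is the last tuple element; its first item is the address.
    # Any IPv6 zone suffix such as "%eth0" is dropped.
    return {info[4][0].split("%")[0] for info in infos}


def addresses_outside_network(addresses, cidr):
    """Return, sorted, the addresses that do not belong to the given subnet."""
    network = ipaddress.ip_network(cidr, strict=False)
    return sorted(
        addr for addr in addresses
        if ipaddress.ip_address(addr) not in network
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: check_dns.py <host-or-engine-fqdn> <tagged-vlan-cidr>
    fqdn, cidr = sys.argv[1], sys.argv[2]
    stray = addresses_outside_network(resolved_addresses(fqdn), cidr)
    if stray:
        print("%s resolves outside %s: %s" % (fqdn, cidr, ", ".join(stray)))
    else:
        print("%s resolves only inside %s" % (fqdn, cidr))
```

Any address reported as outside the subnet would be a candidate for the kind of stale record that caused this failure; an empty result does not prove that the address is actually reachable.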
Created attachment 1416268 [details]
sosreport from puma18

Description of problem:
Node 0 SHE deployment fails over tagged interface, using CLI.

[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_vms": [{"affinity_labels": [], "applications": [], "bios": {"boot_menu": {"enabled": false}}, "cdroms": [], "cluster": {"href": "/ovirt-engine/api/clusters/1748a294-3676-11e8-b3bd-00163eeeeee1", "id": "1748a294-3676-11e8-b3bd-00163eeeeee1"}, "cpu": {"architecture": "x86_64", "topology": {"cores": 1, "sockets": 4, "threads": 1}}, "cpu_profile": {"href": "/ovirt-engine/api/cpuprofiles/58ca604e-01a7-003f-01de-000000000250", "id": "58ca604e-01a7-003f-01de-000000000250"}, "cpu_shares": 0, "creation_time": "2018-04-02 16:06:05.791000+03:00", "delete_protected": false, "disk_attachments": [], "display": {"address": "127.0.0.1", "allow_override": false, "copy_paste_enabled": true, "disconnect_action": "LOCK_SCREEN", "file_transfer_enabled": true, "monitors": 1, "port": 5900, "single_qxl_pci": false, "smartcard_enabled": false, "type": "vnc"}, "graphics_consoles": [], "high_availability": {"enabled": false, "priority": 0}, "host": {"href": "/ovirt-engine/api/hosts/043bc7e9-5580-48d8-8953-6d84a33ed596", "id": "043bc7e9-5580-48d8-8953-6d84a33ed596"}, "host_devices": [], "href": "/ovirt-engine/api/vms/b537b175-8545-426d-8571-d6652d56d2ef", "id": "b537b175-8545-426d-8571-d6652d56d2ef", "io": {"threads": 0}, "katello_errata": [], "large_icon": {"href": "/ovirt-engine/api/icons/e57746a0-a95b-019c-4355-27b4eac77170", "id": "e57746a0-a95b-019c-4355-27b4eac77170"}, "memory": 17179869184, "memory_policy": {"guaranteed": 17179869184, "max": 17179869184}, "migration": {"auto_converge": "inherit", "compressed": "inherit"}, "migration_downtime": -1, "name": "external-HostedEngineLocal", "next_run_configuration_exists": false, "nics": [], "numa_nodes": [], "numa_tune_mode": "interleave", "origin": "external", "original_template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "os": {"boot": {"devices": ["hd"]}, "type": "other"}, "permissions": [], "placement_policy": {"affinity": "migratable"}, "quota": {"id": "2996557c-3676-11e8-8971-00163eeeeee1"}, "reported_devices": [], "run_once": false, "sessions": [], "small_icon": {"href": "/ovirt-engine/api/icons/5ba0b8a7-51c6-5ef5-7ed0-495a62737d13", "id": "5ba0b8a7-51c6-5ef5-7ed0-495a62737d13"}, "snapshots": [], "sso": {"methods": [{"id": "guest_agent"}]}, "start_paused": false, "stateless": false, "statistics": [], "status": "unknown", "storage_error_resume_behaviour": "auto_resume", "tags": [], "template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "time_zone": {"name": "Etc/GMT"}, "type": "desktop", "usb": {"enabled": false}, "watchdogs": []}]}, "attempts": 24, "changed": false}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] changed: [localhost]
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180402162515.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180402155548-4n0to5.log
You have new mail in /var/spool/mail/root

Version-Release number of selected component (if applicable):
rhvm-appliance-4.2-20180401.0.el7.noarch.rpm
ovirt-hosted-engine-ha-2.2.9-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.15-1.el7ev.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

How reproducible:
100%

Steps to Reproduce:
1. Deploy SHE over VLAN tagged interface.

Actual results:
Deployment fails.

Expected results:
Deployment should succeed.

Additional info:
Logs from host and engine attached.
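The question raised in the comments, which Ansible task failed, can be answered directly from console output like the one quoted in this report: the failing task is the last "TASK [...]" entry printed before a "fatal:" entry. The sketch below assumes the "[ INFO ] TASK [name]" and "[ ERROR ] fatal:" markers exactly as they appear above; the function name is hypothetical and this is not a tool shipped with ovirt-hosted-engine-setup.

```python
import re

# Markers as they appear in the console output quoted in this report.
# \s+ tolerates the extra padding the installer may print inside the brackets.
TASK_RE = re.compile(r"\[ INFO\s+\] TASK \[(?P<name>[^\]]+)\]")
FATAL_RE = re.compile(r"\[ ERROR\s+\] fatal:")


def failed_tasks(output):
    """Return the name of the task that precedes each 'fatal:' entry."""
    events = [(m.start(), "task", m.group("name")) for m in TASK_RE.finditer(output)]
    events += [(m.start(), "fatal", "") for m in FATAL_RE.finditer(output)]
    events.sort()  # order by position, so flattened single-line output also works

    failed = []
    last_task = None
    for _, kind, name in events:
        if kind == "task":
            last_task = name
        elif last_task is not None:
            failed.append(last_task)
    return failed


sample = (
    "[ INFO ] TASK [Make the engine aware that the external VM is stopped] "
    "[ INFO ] TASK [Wait for the local bootstrap VM to be down at engine eyes] "
    "[ ERROR ] fatal: [localhost]: FAILED! => {\"attempts\": 24, \"changed\": false}"
)
print(failed_tasks(sample))
```

For the output in this report, the task it points at is "Wait for the local bootstrap VM to be down at engine eyes", which gave up after 24 attempts while the VM status was still "unknown".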