Created attachment 569307
Info_after_removing_ESXHOST

Description of problem:
After the ESX system is removed from SAM, SAM reports "Candlepin::CandlepinResource: 410 Gone ......". After that, the other registered systems are no longer shown in the SAM Web UI. See the attached screenshot for more info.

Version-Release number of selected component (if applicable):
katello-cli-common-0.1.103-1.el6.noarch
katello-candlepin-cert-key-pair-1.0-1.noarch
katello-glue-candlepin-0.1.303-1.el6.noarch
katello-selinux-0.1.8-1.el6.noarch
katello-headpin-all-0.1.142-1.el6.noarch
katello-common-0.1.303-1.el6.noarch
katello-configure-0.1.106-1.el6.noarch
katello-cli-headpin-0.1.16-1.el6.noarch
katello-certs-tools-1.0.4-1.el6.noarch
katello-headpin-0.1.142-1.el6.noarch
candlepin-tomcat6-0.5.24-1.el6.noarch
candlepin-0.5.24-1.el6.noarch
thumbslug-0.0.21-1.el6.noarch
virt-who-0.6-1.el6
ESX4.1

How reproducible:
always

Steps to Reproduce:
1. Prepare one RHEL6.3 host and register it to SAM:
# subscription-manager register --org=ACME_Corporation --environment=env1 --username=$username --password=$password

2. Deploy the ESX environment as follows:
Set the vCenter username/password
Vcenter username : Administrator
Vcenter password : 1q2w3e4rP
Vcenter server IP : 10.66.0.148
Vsphere client IP : 10.66.6.128
ESX IP: 10.66.6.66
ESX username=$username
ESX password=$password
Install some guests on ESX.

3. Set /etc/sysconfig/virt-who on the RHEL6.3 host:
# vim /etc/sysconfig/virt-who
VIRTWHO_BACKGROUND=1
VIRTWHO_DEBUG=1
VIRTWHO_ESX=1
VIRTWHO_ESX_OWNER=ACME_Corporation
VIRTWHO_ESX_ENV=env1
VIRTWHO_ESX_SERVER=10.66.0.148
VIRTWHO_ESX_USERNAME=Administrator
VIRTWHO_ESX_PASSWORD=1q2w3e4rP
VIRTWHO_INTERVAL=1

4. Restart the virt-who service:
# service virt-who restart
...
Updated host: 44454c4c-4c00-1031-8053-b8c04f4e3258 with guests: [42372e7b-20e5-03fb-9435-e45f2fe5fb43, 4237e7a3-8a69-ce97-a07b-6b6a40a6e00c]
Sending updates in hosts-to-guests mapping: {44454c4c-4c00-1031-8053-b8c04f4e3258: [42372e7b-20e5-03fb-9435-e45f2fe5fb43, 4237e7a3-8a69-ce97-a07b-6b6a40a6e00c]}

5. Log in to the SAM WebUI and look for the ESX host under the "Systems" tab. Then remove the ESX system and refresh the SAM web UI.

Actual results:
SAM reports "Candlepin::CandlepinResource: 410 Gone....".

Expected results:
After the ESX system is removed from SAM, SAM does not report "Candlepin::CandlepinResource: 410 Gone....".

Additional info:
After the ESX system is removed, the issue reproduces as soon as virt-who sends its next update to SAM.
After "Candlepin::CandlepinResource: 410 Gone ..." is reported, all systems registered to SAM disappear from the "Systems" tab, even after logging in to SAM again.
Worked with Chris Duryee and was able to replicate it with his and Hui's help. It does not require ESX; it can be replicated with a normal KVM hypervisor setup. Chris thinks it is a Candlepin bug, so assigning to him and changing the component.
Hui,

The issue here is that ESX hypervisors are auto-detected. Thus, if you delete one via the SAM UI without actually shutting down the hypervisor itself, it will get re-created later, causing the error.

Candlepin was handling this condition incorrectly. I'll make a code fix so that hypervisors are not recreated once they have been deleted. Additionally, a change will be made to headpin's CLI to re-enable a deleted hypervisor, in case someone accidentally deletes one and wants it to come back.
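For readers following the thread, the intended behavior can be sketched as a small model. This is only an illustration under stated assumptions, not Candlepin's actual code: the class `HypervisorRegistry` and its methods `delete`, `remove_deletion_record`, and `check_in` are hypothetical names, and the host/guest identifiers are placeholders. The idea is that deleting a hypervisor leaves a deletion record behind, a later check-in skips any host that has such a record instead of recreating it, and removing the record lets the host come back.

```python
class HypervisorRegistry:
    """Toy model of hypervisor check-ins with deletion records (hypothetical)."""

    def __init__(self):
        self.hosts = {}        # hypervisor UUID -> list of guest UUIDs
        self.deleted = set()   # deletion records for removed hypervisors

    def delete(self, host_uuid):
        # Removing a hypervisor drops it and leaves a deletion record behind.
        self.hosts.pop(host_uuid, None)
        self.deleted.add(host_uuid)

    def remove_deletion_record(self, host_uuid):
        # Undo an accidental delete: the next check-in may recreate the host.
        self.deleted.discard(host_uuid)

    def check_in(self, mapping):
        """Apply a hosts-to-guests mapping; return (updated, rejected) UUIDs."""
        updated, rejected = [], []
        for host_uuid, guests in mapping.items():
            if host_uuid in self.deleted:
                # Previously deleted: report it, but do not recreate it.
                rejected.append(host_uuid)
                continue
            self.hosts[host_uuid] = list(guests)
            updated.append(host_uuid)
        return updated, rejected


registry = HypervisorRegistry()
mapping = {"host-a": ["guest-1"], "host-b": ["guest-2"]}
registry.check_in(mapping)      # both hosts are created
registry.delete("host-a")       # e.g. removed through the web UI
updated, rejected = registry.check_in(mapping)
print(updated, rejected)        # ['host-b'] ['host-a']
```

In this model the rest of the mapping is still applied when one host is rejected, which matches the later virt-who output in this report where one host is refused and the other is updated.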
fb61b11 master 0.5.28+
Wondering if re-enabling couldn't be an option of the virt-who script. I'm not sure when the user finds out that the hypervisor is not being created, but I assume it's when calling the virt-who script. In that case the faster way, if they still want to register the hypervisor, would be to give virt-who an option such as --enable_host {{UUID}}, or something like that.

If that's not an option, what is the best place to put it in the Headpin/Katello CLI? What about:

system enable_host --uuid {{uuid}}

But I'm not happy with this, because we don't provide a command to show "disabled/deleted" hosts.
(In reply to comment #7)
> Hui,
>
> The issue here is that ESX hypervisors are auto-detected. Thus, if you delete
> one via the sam UI without actually shutting down the hypervisor itself, it
> will get re-created later, causing the error.
>
> Candlepin was handling this condition incorrectly, I'll make a code fix so that
> hypervisors will not be recreated if they are deleted. Additionally, a change
> will be made to headpin's CLI in order to re-enable a deleted hypervisor, in
> case someone accidentally deletes one and wants it to come back.

Hi Chris,

Now, after removing the registered system (ESX) from the SAM WebUI, SAM no longer reports the "Candlepin::CandlepinResource: 410 Gone...." info, but virt-who reports an error and we can't re-register the system to SAM.

Do you mean that after removing the registered system (ESX) from the SAM WebUI, we then need to delete the system with headpin-cli? Is the following right? If anything is wrong, please correct me. Thanks.

1. Remove the registered system (ESX) from the SAM WebUI (or is deleting the registered ESX system supported only through headpin-cli?).
2. Delete the system (ESX) using headpin-cli.
3. Re-register to SAM when virt-who reports the system UUID.
The details of the verification are as follows:

version:
katello-configure-0.3.3-2.el6_2.noarch
katello-headpin-all-0.2.6-4.el6_2.noarch
katello-common-0.3.1-1.el6_2.noarch
katello-cli-common-0.3.2-3.el6_2.noarch
katello-headpin-0.2.6-4.el6_2.noarch
katello-glue-candlepin-0.3.1-1.el6_2.noarch
katello-cli-headpin-0.2.0-1.el6_2.noarch
katello-selinux-0.2.4-1.el6_2.noarch
katello-candlepin-cert-key-pair-1.0-1.noarch
katello-certs-tools-1.1.5-1.el6_2.noarch
virt-who-0.6-6.el6.noarch

1. Remove system 44454c4c-4c00-1031-8053-b8c04f4e3258 from the SAM WebUI.

2. # service virt-who restart
Virt-who is running in esx mode
Starting infinite loop with 3600 seconds interval and event handling
Sending update in hosts-to-guests mapping: {44454c4c-4c00-1031-8053-b8c04f4e3258: [420a6910-75c5-ec7d-8890-186bed705daa, 421f875f-f7f1-4743-ad34-cffda0abc4a2], 44454c4c-4200-1034-8039-b8c04f503258: [421f931a-9067-d179-8a19-6f7a7c9993ef, 421fa5ae-4157-4628-77a7-b315fbc298df, 42372e7b-20e5-03fb-9435-e45f2fe5fb43]}
Error during update list of guests: 44454c4c-4c00-1031-8053-b8c04f4e3258: Hypervisor 44454c4c-4c00-1031-8053-b8c04f4e3258 has been deleted previously
Updated host: 44454c4c-4200-1034-8039-b8c04f503258 with guests: [421f931a-9067-d179-8a19-6f7a7c9993ef, 421fa5ae-4157-4628-77a7-b315fbc298df, 42372e7b-20e5-03fb-9435-e45f2fe5fb43]

3. # headpin -u admin -p admin system remove_deletion --uuid=44454c4c-4200-1034-8039-b8c04f503258
Usage: headpin <options> system <action> <options>
Supported Actions:
  list            list systems within an organization
  unregister      unregister a system
  subscriptions   list subscriptions for a system
  subscribe       subscribe a system to certificate
  unsubscribe     unsubscribe a system from certificate
  info            display a system within an organization
  facts           display a the hardware facts of a system
  update          update a system
  report          systems report
  releases        list releases available for the system
headpin: error: invalid action: please see --help
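When checking a longer virt-who debug log for this condition, a small helper can pull out which hypervisors were refused as previously deleted and which were updated. This is a rough sketch that assumes the two message formats shown in the output above ("Hypervisor <uuid> has been deleted previously" and "Updated host: <uuid> with guests"); the function name `summarize` is made up for illustration, and other virt-who versions may word these messages differently.

```python
import re

# UUID as it appears in the virt-who output above (lowercase hex, 8-4-4-4-12).
UUID = r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"

# Message formats are assumed from the log excerpt in this report.
DELETED_RE = re.compile(rf"Hypervisor ({UUID}) has been deleted previously")
UPDATED_RE = re.compile(rf"Updated host: ({UUID}) with guests")


def summarize(log_text):
    """Return the hypervisor UUIDs that were rejected as deleted or updated."""
    return {
        "deleted": sorted(set(DELETED_RE.findall(log_text))),
        "updated": sorted(set(UPDATED_RE.findall(log_text))),
    }
```

Run against the output in step 2, this would list 44454c4c-4c00-1031-8053-b8c04f4e3258 under "deleted" and 44454c4c-4200-1034-8039-b8c04f503258 under "updated".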
Hui, Once step #1 is complete, the system is deleted. virt-who is correctly reporting an error, because it's trying to re-register a deleted system. Steps #2 and #3 are only used in case the user wants to allow the deleted system to come back, by removing the deletion record for the system. In most cases, the user will not want the system to come back, so they would just run step #1. However, if they mistakenly deleted the hypervisor, they would do steps #2 and #3 to fix the mistake. Let me know if you need additional info.
(In reply to comment #12)
> Hui,
>
> Once step #1 is complete, the system is deleted. virt-who is correctly
> reporting an error, because it's trying to re-register a deleted system.
>
> Steps #2 and #3 are only used in case the user wants to allow the deleted
> system to come back, by removing the deletion record for the system. In most
> cases, the user will not want the system to come back, so they would just
> run step #1. However, if they mistakenly deleted the hypervisor, they would
> do steps #2 and #3 to fix the mistake.
>
> Let me know if you need additional info.

Chris,

Thanks very much. According to comment 11 and comment 12, the status of this bug should be changed to VERIFIED. The follow-up issue should be fixed in BZ 812891.

Chris, please change the status to ON_QA. Then I will change the status to VERIFIED.