Description of problem:
Login fails with this error message:
    2017-02-22 20:01:27,575-08 ERROR [org.ovirt.engine.core.aaa.servlet.SsoPostLoginServlet] (default task-169) [] server_error: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null') at [Source: java.io.StringReader@6946a9d2; line: 1, column: 2]
Cannot log in to oVirt to manage VMs.
Version-Release number of selected component (if applicable): 4.1.0.4
How reproducible: Happens every time I try to log in.
Steps to Reproduce:
1. Upgrade from 4.0.6 to 4.1.0.4.
2. Log in.
Actual results: Login fails.
Expected results: Login succeeds.
Additional info:
Can you attach the logs (engine, UI, server)? Did you try clearing the browser cache?
Created attachment 1257879 [details] ovirt-engine log
Created attachment 1257880 [details] server log
Created attachment 1257881 [details] boot log
Created attachment 1257882 [details] ui log
Created attachment 1257884 [details] newer ovirt engine log with error in it..
I've completely cleared my browser cache and even tried different browsers (Chrome vs. Firefox). I have not rebooted the machine; I need to migrate a few VMs off before I can do that.
Here's the error in the engine log:
    [root@d8-r13-c1-n1 ovirt-engine]# tail -100 engine.log | grep -v glus
    2017-02-26 16:08:51,840-08 INFO [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistributionBalancePolicyUnit] (DefaultQuartzScheduler5) [4e56b8e7] There is no host with more than 10 running guests, no balancing is needed
    2017-02-26 16:08:51,842-08 INFO [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistributionBalancePolicyUnit] (DefaultQuartzScheduler5) [4e56b8e7] There is no host with more than 10 running guests, no balancing is needed
    2017-02-26 16:09:51,882-08 INFO [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistributionBalancePolicyUnit] (DefaultQuartzScheduler6) [b59c2d5] There is no host with more than 10 running guests, no balancing is needed
    2017-02-26 16:09:51,884-08 INFO [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistributionBalancePolicyUnit] (DefaultQuartzScheduler6) [b59c2d5] There is no host with more than 10 running guests, no balancing is needed
    2017-02-26 16:10:33,352-08 INFO [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-194) [] User tdavis@ldap successfully logged in with scopes: ovirt-app-admin ovirt-app-api ovirt-app-portal ovirt-ext=auth:sequence-priority=~ ovirt-ext=revoke:revoke-all ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
    2017-02-26 16:10:33,681-08 ERROR [org.ovirt.engine.core.aaa.servlet.SsoPostLoginServlet] (default task-196) [] server_error: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null') at [Source: java.io.StringReader@21b5ee11; line: 1, column: 2]
    2017-02-26 16:10:51,925-08 INFO [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistributionBalancePolicyUnit] (DefaultQuartzScheduler5) [4e56b8e7] There is no host with more than 10 running guests, no balancing is needed
    2017-02-26 16:10:51,927-08 INFO [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistributionBalancePolicyUnit] (DefaultQuartzScheduler5) [4e56b8e7] There is no host with more than 10 running guests, no balancing is needed
Here is the OS version:
    [root@d8-r13-c1-n1 ovirt-engine]# cat /etc/centos-release
    CentOS Linux release 7.3.1611 (Core)
Here is what is installed:
    [root@d8-r13-c1-n1 ovirt-engine]# !yum
    yum list ovirt-engin*
    Loaded plugins: fastestmirror, versionlock
    Repository virtio-win-stable is listed more than once in the configuration
    Loading mirror speeds from cached hostfile
     * elrepo: mirrors.evowise.com
     * elrepo-kernel: mirrors.evowise.com
     * ovirt-4.0: resources.ovirt.org
     * ovirt-4.0-epel: mirror.sfo12.us.leaseweb.net
     * ovirt-4.1: resources.ovirt.org
     * ovirt-4.1-epel: mirror.sfo12.us.leaseweb.net
    Installed Packages
    ovirt-engine.noarch                                      4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-backend.noarch                              4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-cli.noarch                                  3.6.9.2-1.el7.centos       @ovirt-4.1
    ovirt-engine-dashboard.noarch                            1.1.0-1.el7.centos         @ovirt-4.1
    ovirt-engine-dbscripts.noarch                            4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-dwh.noarch                                  4.1.0-1.el7.centos         @ovirt-4.1
    ovirt-engine-dwh-setup.noarch                            4.1.0-1.el7.centos         @ovirt-4.1
    ovirt-engine-extension-aaa-jdbc.noarch                   1.1.2-1.el7                @ovirt-4.0
    ovirt-engine-extension-aaa-ldap.noarch                   1.3.0-1.el7                @ovirt-4.1
    ovirt-engine-extension-aaa-ldap-setup.noarch             1.3.0-1.el7                @ovirt-4.1
    ovirt-engine-extension-aaa-misc.noarch                   1.0.1-1.el7                @ovirt-4.0
    ovirt-engine-extensions-api-impl.noarch                  4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-hosts-ansible-inventory.noarch              4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-lib.noarch                                  4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-restapi.noarch                              4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-sdk-java.noarch                             3.6.10.0-1.el7             @epel
    ovirt-engine-sdk-python.noarch                           3.6.9.1-1.el7.centos       @ovirt-4.0
    ovirt-engine-setup.noarch                                4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-setup-base.noarch                           4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-setup-plugin-ovirt-engine.noarch            4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-setup-plugin-ovirt-engine-common.noarch     4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-setup-plugin-vmconsole-proxy-helper.noarch  4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-setup-plugin-websocket-proxy.noarch         4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-tools.noarch                                4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-tools-backup.noarch                         4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-userportal.noarch                           4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-vmconsole-proxy-helper.noarch               4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-webadmin-portal.noarch                      4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-websocket-proxy.noarch                      4.1.0.4-1.el7.centos       @ovirt-4.1
    ovirt-engine-wildfly.x86_64                              10.1.0-1.el7               @ovirt-4.0
    ovirt-engine-wildfly-overlay.noarch                      10.0.0-1.el7               @ovirt-4.0
    Available Packages
    ovirt-engine-appliance.noarch                            4.1-20170201.1.el7.centos  ovirt-4.1
    ovirt-engine-extension-logger-log4j.noarch               1.0.2-1.el7                ovirt-4.0
    ovirt-engine-extensions-api-impl-javadoc.noarch          4.1.0.4-1.el7.centos       ovirt-4.1
    ovirt-engine-nodejs.x86_64                               6.9.1-1.el7                ovirt-4.1
    ovirt-engine-nodejs-modules.x86_64                       0.0.17-1.el7               ovirt-4.1
    ovirt-engine-sdk-java-javadoc.noarch                     3.6.10.0-1.el7             epel
    ovirt-engine-setup-plugin-dockerc.noarch                 4.1.0.4-1.el7.centos       ovirt-4.1
    ovirt-engine-setup-plugin-live.noarch                    4.1.0-1.el7.centos         ovirt-4.1
    ovirt-engine-userportal-debuginfo.noarch                 4.1.0.4-1.el7.centos       ovirt-4.1
    ovirt-engine-webadmin-portal-debuginfo.noarch            4.1.0.4-1.el7.centos       ovirt-4.1
I am unable to reproduce this on CentOS7. I upgraded from 4.0.6 to 4.1.0.4 and everything is working fine. I did not have to reboot the machine.
The system was originally installed as a 3.6 system and went through all the upgrades into 4.0.x, followed by a jump to 4.1.0.4.
Could you please turn on debug logging for the AAA part using the following command:
    /usr/share/ovirt-engine-wildfly/bin/jboss-cli.sh --controller=127.0.0.1:8706 --connect --user=admin@internal
and enter the following at the jboss-cli command prompt:
    /subsystem=logging/logger=org.ovirt.engine.core.aaa:add
    /subsystem=logging/logger=org.ovirt.engine.core.sso:add
    /subsystem=logging/logger=org.ovirt.engine.extension:add
    /subsystem=logging/logger=org.ovirt.engineextensions.aaa:add
    /subsystem=logging/logger=org.ovirt.engine.core.aaa:write-attribute(name=level,value=DEBUG)
    /subsystem=logging/logger=org.ovirt.engine.core.sso:write-attribute(name=level,value=DEBUG)
    /subsystem=logging/logger=org.ovirt.engine.extension:write-attribute(name=level,value=DEBUG)
    /subsystem=logging/logger=org.ovirt.engineextensions.aaa:write-attribute(name=level,value=DEBUG)
    quit
When done, please try to log in and share the logs with us again.
Btw this change is not permanent; it will be lost when ovirt-engine restarts. If you don't want to restart, the following command will turn off debugging:
    /usr/share/ovirt-engine-wildfly/bin/jboss-cli.sh --controller=127.0.0.1:8706 --connect --user=admin@internal
and enter the following commands:
    /subsystem=logging/logger=org.ovirt.engine.core.aaa:remove
    /subsystem=logging/logger=org.ovirt.engine.core.sso:remove
    /subsystem=logging/logger=org.ovirt.engine.extension:remove
    /subsystem=logging/logger=org.ovirt.engineextensions.aaa:remove
    quit
The system will not let me in using the admin password.
    [root@d8-r13-c1-n1 ~]# /usr/share/ovirt-engine-wildfly/bin/jboss-cli.sh --controller=127.0.0.1:8706 --connect --user=admin@internal
    Password:
    Failed to connect to the controller: Unable to authenticate against controller at 127.0.0.1:8706: Authentication failed: all available authentication mechanisms failed:
    PLAIN: Server rejected authentication
I know the password is right; I get this in the engine log when I try via the web interface:
    2017-02-27 09:25:57,036-08 INFO [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-274) [] User admin@internal successfully logged in with scopes: ovirt-app-admin ovirt-app-api ovirt-app-portal ovirt-ext=auth:sequence-priority=~ ovirt-ext=revoke:revoke-all ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
    2017-02-27 09:25:57,114-08 ERROR [org.ovirt.engine.core.aaa.servlet.SsoPostLoginServlet] (default task-276) [] server_error: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null') at [Source: java.io.StringReader@311d68eb; line: 1, column: 2]
Ah, I finally found it. The web server is shared with other things, and somehow that is what is causing problems with the ovirt-engine portion. Removing that sharing fixed it for now; next I need to figure out how to make it share like a good citizen (I know, do not share it if I can avoid it).
Just verified that 3.6 -> 4.0.6 -> 4.1.0.4 works fine too.
(In reply to Thomas Davis from comment #12)
> Ah, I finally found it.
>
> The web server is shared with other things, and somehow that is what is
> causing problems with the ovirt-engine portion.
>
> removing that sharing fixed it for now; now to make figure out how to make
> it share like a good citzen.. ( I know, do not share it if I can..)
So was this caused by some Apache configuration of the other application that shares Apache with the engine? Could you please share some details so we will be better prepared if this error happens to another user?
So, the problem boiled down to this:
The original system was an oVirt 3.5 install that has been upgraded over time to 4.1 (3.5 -> 3.6 -> 4.0 -> 4.1).
The network layout: oVirt lives on a private IP network, with the ovirt-engine having one of the few public IPs. This IP has a different name than the host name. We do this to minimize the number of network-facing devices (this system has access to a building control system network and a major high-voltage switchboard control systems network). I.e., the external service was hatter.nersc.gov, and the internal system was d8-r13-c1-n1.crt.nersc.gov.
When 4.x came out, this broke; to fix it, we set up a CNAME that matched the internal host name, instead of changing the A record. I also set up an Apache virtual host for this CNAME, so it would point to the internal hostname and appear to be the correct URL to oVirt, along with SSL updates.
oVirt internally does not use the hostname; it uses "localhost" to talk to the engine. So we were getting an 'https://localhost:xxxx' redirect, which the virtual host does not match, but because there was just one virtual host, it worked fine for 4.0.x.
When I added the new, 2nd virtual host last week, it changed the order in which Apache interprets things:
1) The config file names were based on the virtual host names (i.e., c.conf (non-ovirt) and d.conf (ovirt)).
2) The new file name sorted before the ovirt host's file name (c.conf (non-ovirt) before d.conf (ovirt)).
3) Apache makes the virtual host from the first config file the default, which with the new virtual host was no longer the ovirt-engine virtual host.
Any request that comes in without a server name in the headers that matches a virtual host is sent to the first defined (aka default) virtual host, which in this case was not the ovirt-engine virtual host. The redirect comes in with 'localhost' as the server, not the system host name.
So I simply moved the old virtual host config to be first by prepending 'aaaa-' to the filename, restarted httpd, and now everything works.
The clue that gave this away was when I mistyped a partial URL and was redirected to the wrong site (it should have been ovirt, but it ended up on the non-ovirt site).
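The ordering effect described in this comment can be modelled with a short sketch. This is a simplified model of Apache's name-based virtual host selection, not Apache's own code. It assumes the vhost files are pulled in by a wildcard include (which Apache expands in alphabetical order) and that all vhosts listen on the same address and port. The file names c.conf and d.conf come from the comment; the server name other.example.org and the names VHost and select_vhost are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VHost:
    conf_file: str       # file that defines the <VirtualHost> block
    server_names: tuple  # ServerName plus any ServerAlias values


def select_vhost(vhosts, host_header):
    """Return the vhost that would serve a request with this Host header."""
    # Assumption: a wildcard include loads the files in alphabetical order.
    ordered = sorted(vhosts, key=lambda v: v.conf_file)
    for vhost in ordered:
        if host_header.lower() in vhost.server_names:
            return vhost
    # No name matched: the first vhost defined acts as the default.
    return ordered[0]


# Hypothetical non-oVirt site next to the oVirt vhost from the comment.
other = VHost("c.conf", ("other.example.org",))
ovirt = VHost("d.conf", ("d8-r13-c1-n1.nersc.gov",))

# The engine's internal request to https://localhost:443/ carries Host: localhost.
before = select_vhost([other, ovirt], "localhost")

# Workaround from the comment: prefix the oVirt file name with 'aaaa-'.
renamed = VHost("aaaa-d.conf", ovirt.server_names)
after = select_vhost([other, renamed], "localhost")

print(before.conf_file, after.conf_file)
```

In this model the localhost request lands on c.conf before the rename and on aaaa-d.conf after it, while requests that name the oVirt host explicitly reach the oVirt vhost in both cases.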
(In reply to Thomas Davis from comment #15)
> So, the problem boiled down to:
>
> The original system was an ovirt3.5, that has been upgraded over time to 4.1
> (3.5 -> 3.6 -> 4.0 -> 4.1)
>
> The network layout is ovirt exists inside on a private IP network, with the
> ovirt engine having one of the few public IP's. This ip has a different
> name than the host name. We do this to minimize the network facing devices
> (this system has access to a building control system network and major
> high-voltage switchboard control systems network)
>
> ie, the external service was hatter.nersc.gov, the internal system was
> d8-r13-c1-n1.crt.nersc.gov.
>
> When 4.x came out, this broke; to fix, we setup a cname that matched the
> internal host name, instead of changing the A record. I also set up an
> Apache virtual host for this cname, so it would point to the internal
> hostname, and appear to be the correct URL to ovirt, along with SSL updates..
Not sure which 4.0.z you upgraded to, but since 4.0.4 you can specify multiple FQDNs (or IP addresses) to access the oVirt engine, as described in the oVirt 4.0.z release notes and BZ1325746.
>
> Ovirt internally does not use the hostname, it uses "localhost" to talk to
> the engine.. so we was getting an 'https://localhost:xxxx' redirect, which
> the virtual host does not match, but because there was just one virtual
> host, it worked fine for 4.0.x
One correction here: you should specify the FQDN during engine-setup. It was not that important in 3.x times, so unfortunately many users used localhost instead of a proper FQDN. But in 4.0 we introduced a new SSO module which uses redirection to the engine FQDN (or another FQDN specified using the steps in BZ1325746), and of course if users used localhost as the engine FQDN, it stopped working for them.
You can check your engine FQDN using the following command:
    grep ENGINE_FQDN /etc/ovirt-engine/engine.conf.d/10-setup-protocols.conf
If this is localhost, then please use the ovirt-engine-rename tool to set up the correct engine FQDN; more information can be found at:
    https://www.ovirt.org/documentation/how-to/networking/changing-engine-hostname/
So if you have set the engine FQDN to the internally resolvable hostname and set up the externally resolvable hostname according to BZ1325746, you don't need any additional Apache configuration.
I had already done that for the 4.0 setup to fix the hostname.
    [root@d8-r13-c1-n1 ~]# grep ENGINE_FQDN /etc/ovirt-engine/engine.conf.d/10-setup-protocols.conf
    ENGINE_FQDN=d8-r13-c1-n1.nersc.gov
    [root@d8-r13-c1-n1 ~]#
(In reply to Thomas Davis from comment #17)
> I had already done that for the 4.0 setup to fix the hostname.
>
> [root@d8-r13-c1-n1 ~]# grep ENGINE_FQDN
> /etc/ovirt-engine/engine.conf.d/10-setup-protocols.conf
> ENGINE_FQDN=d8-r13-c1-n1.nersc.gov
> [root@d8-r13-c1-n1 ~]#
So in that case you shouldn't get redirected to localhost when trying to log in to webadmin/userportal, right? If you still get the error with redirection to localhost, could you please share the content of /etc/ovirt-engine/engine.conf.d/11-setup-sso.conf?
    [root@d8-r13-c1-n1 conf.d]# more /etc/ovirt-engine/engine.conf.d/11-setup-sso.conf
    ENGINE_SSO_CLIENT_ID="ovirt-engine-core"
    ENGINE_SSO_CLIENT_SECRET="mrqWaVxW05HF8a9cInlhvC8OHIfpmgVq"
    ENGINE_SSO_AUTH_URL="https://${ENGINE_FQDN}:443/ovirt-engine/sso"
    ENGINE_SSO_SERVICE_URL="https://localhost:443/ovirt-engine/sso"
    ENGINE_SSO_SERVICE_SSL_VERIFY_HOST=false
    ENGINE_SSO_SERVICE_SSL_VERIFY_CHAIN=true
    SSO_ENGINE_URL="https://${ENGINE_FQDN}:443/ovirt-engine/"
(In reply to Thomas Davis from comment #19)
> [root@d8-r13-c1-n1 conf.d]# more
> /etc/ovirt-engine/engine.conf.d/11-setup-sso.conf
> ENGINE_SSO_CLIENT_ID="ovirt-engine-core"
> ENGINE_SSO_CLIENT_SECRET="mrqWaVxW05HF8a9cInlhvC8OHIfpmgVq"
> ENGINE_SSO_AUTH_URL="https://${ENGINE_FQDN}:443/ovirt-engine/sso"
> ENGINE_SSO_SERVICE_URL="https://localhost:443/ovirt-engine/sso"
> ENGINE_SSO_SERVICE_SSL_VERIFY_HOST=false
> ENGINE_SSO_SERVICE_SSL_VERIFY_CHAIN=true
> SSO_ENGINE_URL="https://${ENGINE_FQDN}:443/ovirt-engine/"
This looks fine to me. So do you still get redirected to localhost when you use the engine FQDN mentioned in Comment 17 to access webadmin/userportal?
Created attachment 1258580 [details] apache access log for bad virtual host config
I've attached an Apache log that shows the error. To reproduce this problem:
1) Define two virtual hosts using Apache 2.4.
2) Give the virtual host that is not for the ovirt-engine a config file name that sorts before the ovirt-engine one (i.e., create 'ovirt-engine-vhost.conf' and 'error-vhost.conf').
3) In error-vhost.conf, enable authentication; in other words, require an account/password before allowing requests to pass.
Restart Apache, hit the ovirt-engine virtual host, get the error in the browser/engine, and find this line in the Apache access_log:
    127.0.0.1 - - [28/Feb/2017:21:51:08 -0800] "GET /ovirt-engine/sso/status HTTP/1.1" 401 381 "-" "Apache-HttpClient/4.5 (Java/1.8.0_121)"
This request matches none of the vhost names, so it is sent to the default vhost, which is not the ovirt-engine vhost. You get the 401 because that vhost requires a login, and ovirt-engine cannot provide one. This is what generates the error and borks ovirt-engine.
Moving the ovirt-engine vhost config to be first fixes this problem, because it becomes the default vhost, and it works because you normally do not define any accounts in the ovirt-engine vhost (that is done in the engine SSO, not in Apache/browser).
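The link between that 401 and the "Unexpected character ('<' (code 60))" message in engine.log can be illustrated with a small sketch. It uses Python's json module rather than the Java JSON parser the engine uses, so the wording and column number of the error differ, but the mechanism is the same: a client that expects JSON from /ovirt-engine/sso/status is handed an HTML error page. The HTML body, the sample JSON payload and the function name parse_sso_status are made up for illustration.

```python
import json

# Hypothetical stand-in for the HTML page Apache sends along with a 401.
HTML_401_BODY = (
    "<!DOCTYPE HTML><html><head><title>401 Unauthorized</title></head></html>"
)


def parse_sso_status(body):
    """Parse a status response body; return (data, error_text)."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as err:
        # The very first character '<' cannot start any JSON value.
        message = f"server_error: {err.msg} at line {err.lineno}, column {err.colno}"
        return None, message


status, error = parse_sso_status(HTML_401_BODY)
print(error)

# A JSON body parses without error (this payload is invented, not the real one).
ok_status, ok_error = parse_sso_status('{"status": "ok"}')
print(ok_status, ok_error)
```

Seeing a '<' reported at the very start of a response that should be JSON is therefore a hint that a web server or proxy answered with an HTML page instead of the service itself.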
Could you please try to create the file /etc/ovirt-engine/engine.conf.d/99-custom-sso-service.conf with the following content:
    ENGINE_SSO_SERVICE_URL="https://${ENGINE_FQDN}:443/ovirt-engine/sso"
and then restart the ovirt-engine service and try to log in? Please let me know if that solves your issue with the default virtual host not handling oVirt content ...
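A rough sketch of why a 99-* file can override the value from 11-setup-sso.conf follows. It assumes the engine reads the files in engine.conf.d in sorted file-name order, lets later assignments replace earlier ones, and expands ${NAME} references; that matches how the files are used in this thread, but the code is a simplified model and not the engine's own loader. The function names load_conf and expand are hypothetical; the values are the ones posted in the comments above.

```python
import re


def load_conf(files):
    """Merge KEY=value files in sorted name order; later files win."""
    merged = {}
    for name in sorted(files):
        for line in files[name].splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            merged[key.strip()] = value.strip().strip('"')
    return merged


def expand(value, conf):
    """Replace ${NAME} references with values from the merged configuration."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: conf.get(m.group(1), m.group(0)),
        value,
    )


files = {
    "10-setup-protocols.conf": "ENGINE_FQDN=d8-r13-c1-n1.nersc.gov",
    "11-setup-sso.conf": (
        'ENGINE_SSO_SERVICE_URL="https://localhost:443/ovirt-engine/sso"'
    ),
    "99-custom-sso-service.conf": (
        'ENGINE_SSO_SERVICE_URL="https://${ENGINE_FQDN}:443/ovirt-engine/sso"'
    ),
}

conf = load_conf(files)
service_url = expand(conf["ENGINE_SSO_SERVICE_URL"], conf)
print(service_url)
```

With the FQDN in the service URL, the engine's internal request presumably carries a Host header that matches the oVirt virtual host by name, so it no longer depends on which virtual host Apache treats as the default.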
This fixed the problem.
Verified with:
    ovirt-engine-extension-aaa-jdbc-1.1.4-1.el7ev.noarch
    ovirt-engine-extension-aaa-ldap-setup-1.3.1-1.el7ev.noarch
    ovirt-engine-extension-aaa-ldap-1.3.1-1.el7ev.noarch
    rhevm-4.1.2.1-0.1.el7.noarch
    # cat /etc/ovirt-engine/engine.conf.d/11-setup-sso.conf | grep ENGINE_SSO_SERVICE_URL
    ENGINE_SSO_SERVICE_URL="https://${ENGINE_FQDN}:443/ovirt-engine/sso"