Bug 1650557 - Grafana is not working after WA upgrade to BU2
Summary: Grafana is not working after WA upgrade to BU2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: web-admin-tendrl-ansible
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.4.z Batch Update 2
Assignee: gowtham
QA Contact: Filip Balák
URL:
Whiteboard:
Depends On:
Blocks: 1652561
Reported: 2018-11-16 13:49 UTC by Filip Balák
Modified: 2022-07-09 10:16 UTC (History)

Fixed In Version: tendrl-ansible-1.6.3-10.el7rhgs
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-17 17:06:56 UTC
Embargoed:


Attachments
grafana-server journal log (13.39 KB, text/x-vhdl)
2018-11-16 13:49 UTC, Filip Balák
Grafana screen with inspected element after update (120.73 KB, image/png)
2018-11-21 11:56 UTC, Filip Balák


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | Tendrl tendrl-ansible issues 130 | 0 | None | closed | Grafana root_url should updated with sub_path for reverse proxy when tendrl-ansible run | 2020-10-26 01:20:34 UTC
Red Hat Bugzilla | 1652590 | 0 | high | CLOSED | Configuration files are not properly marked in grafana RPM package | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHSA-2018:3829 | 0 | None | None | None | 2018-12-17 17:07:00 UTC

Internal Links: 1652590

Description Filip Balák 2018-11-16 13:49:43 UTC
Created attachment 1506414
grafana-server journal log

Description of problem:
The grafana-server service fails and tendrl-monitoring-integration is inactive after WA is upgraded according to [1] with repositories that contain BU2 packages.

Version-Release number of selected component (if applicable):
tendrl-ansible-1.6.3-9.el7rhgs.noarch
tendrl-api-1.6.3-8.el7rhgs.noarch
tendrl-api-httpd-1.6.3-8.el7rhgs.noarch
tendrl-commons-1.6.3-13.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-15.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-15.el7rhgs.noarch
tendrl-node-agent-1.6.3-11.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-12.el7rhgs.noarch
grafana-4.6.4-1.el7rhgs.x86_64

How reproducible:
100% (tested twice)

Steps to Reproduce:
1. Install WA 3.4.0.
2. Import cluster with some volumes.
3. Update WA according to [1]. Use repositories with the packages listed in the section "Version-Release number of selected component" of this BZ.

Actual results:
The UI works as expected, but none of the links to Grafana work and the grafana-server service is in a failed state.
The grafana-server journal log contains lines like:

(...)
Nov 16 14:04:53 wa-server systemd[1]: Started Grafana instance.
Nov 16 14:04:53 wa-server grafana-server[6500]: t=2018-11-16T14:04:53+0100 lvl=info msg="Starting Grafana" logger=server version=4.6.4 commit=unknown-dev compiled=2018-11-05T08:43:17+0100
Nov 16 14:04:53 wa-server grafana-server[6500]: t=2018-11-16T14:04:53+0100 lvl=info msg="Config loaded from" logger=settings file=/usr/share/grafana/conf/defaults.ini
Nov 16 14:04:53 wa-server grafana-server[6500]: t=2018-11-16T14:04:53+0100 lvl=info msg="Config loaded from" logger=settings file=/etc/grafana/grafana.ini
Nov 16 14:04:53 wa-server grafana-server[6500]: t=2018-11-16T14:04:53+0100 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.data=/var/lib/grafana"
Nov 16 14:04:53 wa-server grafana-server[6500]: t=2018-11-16T14:04:53+0100 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.logs=/var/log/grafana"
Nov 16 14:04:53 wa-server systemd[1]: grafana-server.service: main process exited, code=exited, status=1/FAILURE
Nov 16 14:04:53 wa-server systemd[1]: Unit grafana-server.service entered failed state.
(...)
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Starting Grafana" logger=server version=4.6.4 commit=unknown-dev compiled=2018-11-05T08:43:17+0100
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Config loaded from" logger=settings file=/usr/share/grafana/conf/defaults.ini
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Config loaded from" logger=settings file=/etc/grafana/grafana.ini
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.data=/var/lib/grafana"
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.logs=/var/log/grafana"
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.plugins=/var/lib/grafana/plug
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Path Home" logger=settings path=/usr/share/grafana
Nov 16 14:04:54 wa-server grafana-server[6548]: t=2018-11-16T14:04:54+0100 lvl=info msg="Path Data" logger=settings path=/usr/share/grafana/data
Nov 16 14:04:54 wa-server systemd[1]: grafana-server.service: main process exited, code=exited, status=1/FAILURE
Nov 16 14:04:54 wa-server systemd[1]: Unit grafana-server.service entered failed state.
Nov 16 14:04:54 wa-server systemd[1]: grafana-server.service failed.
Nov 16 14:04:54 wa-server systemd[1]: grafana-server.service holdoff time over, scheduling restart.
Nov 16 14:04:54 wa-server systemd[1]: Stopped Grafana instance.
Nov 16 14:04:54 wa-server systemd[1]: start request repeated too quickly for grafana-server.service
Nov 16 14:04:54 wa-server systemd[1]: Failed to start Grafana instance.
Nov 16 14:04:54 wa-server systemd[1]: Unit grafana-server.service entered failed state.
Nov 16 14:04:54 wa-server systemd[1]: grafana-server.service failed.
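
The log gives no error message before the process exits, so one quick thing to check is which configuration files each start attempt read. The helper below is a minimal sketch: the function name loaded_configs is made up for illustration, and it only parses the "Config loaded from" lines in the format shown above.

```shell
# Print the distinct config files named in Grafana's "Config loaded from"
# journal lines. Reads journal text on stdin.
# loaded_configs is an illustrative name, not an existing tool.
loaded_configs() {
    grep 'msg="Config loaded from"' \
        | sed -n 's/.* file=\([^ ]*\).*/\1/p' \
        | sort -u
}

# Usage (assumes the unit name grafana-server, as in the log above):
#   journalctl -u grafana-server --no-pager | loaded_configs
```

For the excerpt above this lists /etc/grafana/grafana.ini and /usr/share/grafana/conf/defaults.ini.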


The tendrl-monitoring-integration journal log contains only lines about the periodic starting and stopping of tendrl-monitoring-integration, except for these two lines:
(...)
Nov 16 14:04:54 wa-server systemd[1]: Dependency failed for Monitoring Integration.
Nov 16 14:04:54 wa-server systemd[1]: Job tendrl-monitoring-integration.service/start failed with result 'dependency'.
(...)


Expected results:
Grafana should work.

Additional info:

[1] https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html-single/installation_guide/index#Updating_Red_Hat_Storage_in_the_Offline_Mode

Comment 2 Shubhendu Tripathi 2018-11-19 13:18:38 UTC
Gowtham, as I understand it, this happened because the Grafana configuration was incorrect. Please add your findings here.

Comment 3 gowtham 2018-11-20 11:06:49 UTC
I debugged this on Filip's machine and found that the Grafana configuration file path was not updated. Grafana is referring to its default configuration file only.

In /etc/sysconfig/grafana-server:
  CONF_DIR=/etc/grafana
  CONF_FILE=/etc/grafana/grafana.ini
This is not the correct path.

The correct path is:
  CONF_DIR=/etc/tendrl/monitoring-integration/grafana/
  CONF_FILE=/etc/tendrl/monitoring-integration/grafana/grafana.ini
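
A check like this can be scripted. The following is only a sketch: conf_file_matches is a hypothetical helper name, and it assumes the sysconfig file uses plain KEY=value lines as shown above.

```shell
# Return 0 if the sysconfig file sets CONF_FILE to the expected path.
# $1 = sysconfig file, $2 = expected CONF_FILE value.
# conf_file_matches is an illustrative name, not part of any package.
conf_file_matches() {
    # Take the last CONF_FILE assignment, since later lines win when sourced.
    actual=$(sed -n 's/^[[:space:]]*CONF_FILE=//p' "$1" | tail -n 1)
    [ "$actual" = "$2" ]
}

# Usage:
#   conf_file_matches /etc/sysconfig/grafana-server \
#       /etc/tendrl/monitoring-integration/grafana/grafana.ini \
#       || echo "grafana-server points at a different grafana.ini"
```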

Comment 4 Filip Balák 2018-11-20 11:51:23 UTC
I tracked the file /etc/sysconfig/grafana-server through the whole upgrade process as described in the documentation and found that the file is changed by step 2 of the section `On Web Administration Server:` in [1], which is:
# yum update

Before this step:
# cat /etc/sysconfig/grafana-server| grep CONF
CONF_DIR=/etc/tendrl/monitoring-integration/grafana/
CONF_FILE=/etc/tendrl/monitoring-integration/grafana/grafana.ini

After updating all packages on WA server:
# cat /etc/sysconfig/grafana-server| grep CONF
CONF_DIR=/etc/grafana
CONF_FILE=/etc/grafana/grafana.ini

The next step, which calls:
# tendrl-upgrade
is therefore already executed with the changed configuration.
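
The before/after comparison can be made repeatable with a small snapshot helper. This is a sketch only; snapshot_conf and conf_unchanged are made-up names and the snapshot path in the usage lines is an arbitrary choice.

```shell
# Save the CONF_* lines of a sysconfig file so they can be compared later.
# $1 = sysconfig file, $2 = snapshot file. Names are illustrative.
snapshot_conf() {
    grep '^CONF_' "$1" > "$2"
}

# Return 0 if the CONF_* lines still match the snapshot.
# Otherwise print the differences and return non-zero.
conf_unchanged() {
    grep '^CONF_' "$1" | diff "$2" -
}

# Usage around the update step:
#   snapshot_conf /etc/sysconfig/grafana-server /tmp/grafana-server.before
#   yum update
#   conf_unchanged /etc/sysconfig/grafana-server /tmp/grafana-server.before \
#       || echo "the update changed the Grafana config paths"
```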


[1] https://access.qa.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html-single/quick_start_guide/#red_hat_gluster_storage_web_administration_3_4_x_to_3_4_y

Comment 6 Daniel Horák 2018-11-21 08:57:31 UTC
I haven't tried it, so I'm just guessing, but wouldn't it be sufficient to just
run tendrl-ansible after the `yum update`?

It shouldn't break anything (if there are no additional manual changes in the
configuration) and it might fix this issue (and potentially any similar issue
in the future).

If it works, I would consider it the best and most systematic solution
for this kind of issue (rather than manual steps to update the particular
config file).

Comment 7 Nishanth Thomas 2018-11-21 09:26:38 UTC
I agree with Daniel here. Can you please try this out and update here?
Then we need to update the documentation as well.

Comment 8 gowtham 2018-11-21 09:30:37 UTC
I agree with Daniel. It won't break anything and it will work fine.

Comment 9 Filip Balák 2018-11-21 11:56:50 UTC
Created attachment 1507647
Grafana screen with inspected element after update

Comment 10 Filip Balák 2018-11-21 12:07:22 UTC
I tried running tendrl-ansible after the upgrade was done. It fixed the Grafana configuration and started the grafana-server service, but Grafana for WA is still not working correctly. As seen in attachment 1507647, no JS or CSS libraries are loaded.

I inspected the source of this page on the updated machine and on a machine freshly installed with the current version without updating. The main difference is in the base element, which is currently set to href="/" but probably should be href="/grafana/". The URL also seems wrong when linked from WA.
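
The base element can also be checked from the command line instead of the browser inspector. The snippet below is a sketch: base_href is a hypothetical helper, it assumes the base tag sits on a single line, and the URL in the usage comment is an assumption about how Grafana is reached through WA.

```shell
# Print the href of the <base> element found in HTML read from stdin.
# Assumes the whole <base ...> tag is on one line.
# base_href is an illustrative name.
base_href() {
    grep -o '<base[^>]*>' \
        | sed -n 's/.*href="\([^"]*\)".*/\1/p' \
        | head -n 1
}

# Usage (the URL is an assumption, adjust it to your WA server):
#   curl -s http://wa-server/grafana/ | base_href
```

An output of / rather than /grafana/ would match the broken page described above.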

Tested with:
tendrl-ansible-1.6.3-9.el7rhgs.noarch
tendrl-api-1.6.3-8.el7rhgs.noarch
tendrl-api-httpd-1.6.3-8.el7rhgs.noarch
tendrl-commons-1.6.3-13.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-15.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-15.el7rhgs.noarch
tendrl-node-agent-1.6.3-11.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-13.el7rhgs.noarch

Comment 11 Daniel Horák 2018-11-21 12:10:55 UTC
The root cause of this issue seems to be that the grafana configuration files
are not properly marked as configuration files in the RPM.

There are three files in /etc/* which should probably be considered
"configuration" files in the grafana package:
  # rpm -ql grafana | grep /etc
  /etc/grafana
  /etc/grafana/grafana.ini
  /etc/grafana/ldap.toml
  /etc/sysconfig/grafana-server

But none of them is marked as a "configuration" file in the RPM:
  # rpm -qc grafana | grep /etc
  # 

This leads to a situation where a package update overwrites the changes in the
configuration files with the default values.

Without fixing this, anything else will be just a workaround with possible
additional consequences in the future.

So the main question now is:
Are we able to fix this problem in the grafana package spec file?
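
The same comparison of `rpm -ql` and `rpm -qc` output can be automated for any package. This is a rough sketch: unmarked_etc_files is a made-up name, and it works on two saved listings rather than calling rpm itself. Directories owned by the package, such as /etc/grafana, show up in the result as well.

```shell
# Print paths under /etc that are in the package file list but not in the
# package config-file list.
# $1 = saved output of `rpm -ql PKG`, $2 = saved output of `rpm -qc PKG`.
# unmarked_etc_files is an illustrative name.
unmarked_etc_files() {
    # FILENAME is compared with ARGV[1] (instead of the NR==FNR idiom) so
    # that an empty config list, as in this bug, is handled correctly.
    awk 'FILENAME == ARGV[1] { cfg[$0] = 1; next }
         /^\/etc\// && !($0 in cfg)' "$2" "$1"
}

# Usage (the /tmp paths are arbitrary):
#   rpm -ql grafana > /tmp/grafana.files
#   rpm -qc grafana > /tmp/grafana.config
#   unmarked_etc_files /tmp/grafana.files /tmp/grafana.config
```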

Comment 12 Daniel Horák 2018-11-21 12:47:28 UTC
(In reply to Filip Balák from comment #10)
> I tried running tendrl-ansible after upgrade is done. It fixed grafana
> configuration and started grafana-server service but grafana for WA is still
> not working correctly. As seen in attachment 1507647 [details] there are not
> loaded any js or css libraries.
> 
> I inspected source of this page that was updated and also of page that was
> freshly installed to current version without updating and main difference is
> in base element where is currently set href="/" but probably should be
> href="/grafana/". Url also seems wrong when linked from WA.

The root cause of this issue is that the existing configuration file:
  /etc/tendrl/monitoring-integration/grafana/grafana.ini
is not updated by the new version from the updated monitoring-integration
package (mainly the line "root_url = http://localhost:3000/grafana" is
missing).

This behaviour is correct, because:
1) the grafana.ini file is correctly marked as a configuration file in the
  tendrl-monitoring-integration package:
    # rpm -qc tendrl-monitoring-integration | grep grafana.ini
    /etc/tendrl/monitoring-integration/grafana/grafana.ini

2) the configuration file was changed compared to the version from
  the rpm package (the value of admin_password was changed by tendrl-ansible
  during the installation process).

Because of the two points above, when the tendrl-monitoring-integration package
is updated, the existing configuration file is not overwritten and the new
version is only saved with the .rpmnew suffix:
  # ls /etc/tendrl/monitoring-integration/grafana/grafana.ini*
  /etc/tendrl/monitoring-integration/grafana/grafana.ini
  /etc/tendrl/monitoring-integration/grafana/grafana.ini.rpmnew

  # diff /etc/tendrl/monitoring-integration/grafana/grafana.ini \
         /etc/tendrl/monitoring-integration/grafana/grafana.ini.rpmnew 
  47c47
  < ;root_url = http://localhost:3000
  ---
  > root_url = http://localhost:3000/grafana
  146c146
  < admin_password = IPLmZggPtIagGxlejRyqrhGXoPFlTb
  ---
  > admin_password = admin

Generally in this situation it is up to the admin to check the difference
between the two configuration files and update the existing configuration where
necessary.
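
To find every configuration file that is waiting for such a review, the .rpmnew siblings can be listed. A minimal sketch, where pending_rpmnew is a hypothetical helper name and the directory in the usage lines is just the one from this bug:

```shell
# List config files under directory $1 that have a pending .rpmnew sibling.
# pending_rpmnew is an illustrative name.
pending_rpmnew() {
    find "$1" -type f -name '*.rpmnew' | sed 's/\.rpmnew$//' | sort
}

# Usage: show what each pending update would change.
#   pending_rpmnew /etc/tendrl | while read -r f; do
#       diff -u "$f" "$f.rpmnew"
#   done
```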

In our specific case, I see two options:

a) document this as a known issue and let the user manually edit the
  configuration file and add the line:
    root_url = http://localhost:3000/grafana

b) add a new task to tendrl-ansible to ensure that this line is present in
  the configuration file.
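
For illustration, the effect that option b) aims for can be sketched in plain shell. This is not the tendrl-ansible task itself, only an idempotent stand-in under stated assumptions: ensure_root_url is a made-up name, and the function only rewrites an existing root_url line (active or commented out, as in the diff above). It fails instead of appending when no such line exists, because a blindly appended line could land outside the [server] section.

```shell
# Make ini file $1 contain exactly one active "root_url = $2" line.
# The first existing root_url line (active or commented out with ; or #)
# is rewritten in place and any further root_url lines are dropped.
# Returns non-zero and leaves the file untouched if no root_url line exists.
# ensure_root_url is an illustrative name, not part of tendrl-ansible.
ensure_root_url() {
    tmp=$(mktemp) || return 1
    if awk -v url="$2" '
        /^[ \t]*[;#]?[ \t]*root_url[ \t]*=/ {
            if (!found) { print "root_url = " url; found = 1 }
            next
        }
        { print }
        END { exit (found ? 0 : 1) }
    ' "$1" > "$tmp"; then
        # Write through the existing file to keep its owner and permissions.
        cat "$tmp" > "$1"
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Usage:
#   ensure_root_url /etc/tendrl/monitoring-integration/grafana/grafana.ini \
#       http://localhost:3000/grafana
```

Running it a second time leaves the file unchanged, which is the property a configuration-management task would need.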

Comment 14 Nishanth Thomas 2018-11-22 11:04:44 UTC
We will add a new task to tendrl-ansible to ensure the correct root_url is present in the configuration file (grafana.ini). We also need to add a step to the upgrade guide to run tendrl-ansible after the upgrade steps are completed.

@Daniel/Filip, please create a documentation bug for this.

Comment 18 Filip Balák 2018-11-28 08:35:00 UTC
I have tested the update with the solution described in comment 14. tendrl-ansible was executed right after the tendrl-upgrade script finished. Grafana works as expected after this, but I will wait for the documentation (bz 1652561) before verifying this bz.

Tested with:
tendrl-ansible-1.6.3-10.el7rhgs.noarch
tendrl-api-1.6.3-8.el7rhgs.noarch
tendrl-api-httpd-1.6.3-8.el7rhgs.noarch
tendrl-commons-1.6.3-13.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-15.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-15.el7rhgs.noarch
tendrl-node-agent-1.6.3-11.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-14.el7rhgs.noarch

Comment 19 Filip Balák 2018-12-06 12:12:42 UTC
I have retested the update to BU2 with the documentation from bz 1652561. Grafana is now configured correctly and seems to work OK. --> VERIFIED

During testing of this bz, issues with the web cache were found. One of them affects links to Grafana, as described in https://bugzilla.redhat.com/show_bug.cgi?id=1654331#c3, so the user needs to clear the web cache to see correct links to Grafana. This is tracked in bz 1654331.

Tested with:
tendrl-ansible-1.6.3-10.el7rhgs.noarch
tendrl-api-1.6.3-8.el7rhgs.noarch
tendrl-api-httpd-1.6.3-8.el7rhgs.noarch
tendrl-commons-1.6.3-13.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-15.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-15.el7rhgs.noarch
tendrl-node-agent-1.6.3-11.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-14.el7rhgs.noarch

Comment 20 errata-xmlrpc 2018-12-17 17:06:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3829

