Bug 1276186
Summary: | Use of syslog results in all log messages at priority "emerg" | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Peter Portante <pportant>
Component: | RADOS | Assignee: | Brad Hubbard <bhubbard>
Status: | CLOSED ERRATA | QA Contact: | ceph-qe-bugs <ceph-qe-bugs>
Severity: | high | |
Priority: | unspecified | |
Version: | 1.3.1 | CC: | amaredia, bhubbard, ceph-eng-bugs, charcrou, cstpierr, dmick, dzafman, flucifre, hnallurv, kchai, kdreyer, perfbz, pportant, shmohan, sweil, tganguly, vakulkar
Target Milestone: | rc | Keywords: | Patch
Target Release: | 1.3.3 | |
Hardware: | x86_64 | |
OS: | Linux | |
Fixed In Version: | RHEL: ceph-0.94.5-11.el7cp Ubuntu: ceph_0.94.5-6redhat1trusty | Doc Type: | Bug Fix
Last Closed: | 2016-05-06 18:39:49 UTC | Type: | Bug
Description
Peter Portante
2015-10-29 02:30:46 UTC
I'm looking into this. This is the call site responsible for those messages:

https://github.com/ceph/ceph/blob/master/src/log/Log.cc#L244

1- We're not specifying a priority. Perhaps if you don't it defaults to EMERG? That would be an easy fix.
2- We're not calling openlog()... not sure that matters, but I just noticed it on the man page.

syslog is a stupid, stupid interface. Indeed, the first arg "priority" is *really* "priority | facility", and since priority is "the low three bits" and facility is "the next three bits", LOG_USER alone implies LOG_USER | LOG_EMERG.

Also, it seems like we might be interested in encoding the actual priority in the log based on the ceph log level, but it's not available at Log::_flush(). That's not really this bug, but it might be good to fix both at the same time.

(In reply to Dan Mick from comment #4)
> syslog is a stupid, stupid interface. Indeed, the first arg "priority" is
> *really* "priority | facility", and since priority is "the low three bits"
> and facility is "the next three bits", LOG_USER alone implies LOG_USER |
> LOG_EMERG.

Right. Would you also want to fix the reference here:

https://github.com/ceph/ceph/blob/master/src/log/Log.cc#L266

All of those messages would also show up as "emerg", priority 0, so perhaps line 266 would want LOG_USER|LOG_DEBUG?

(In reply to Peter Portante from comment #6)
> Would you also want to fix the reference here:
>
> https://github.com/ceph/ceph/blob/master/src/log/Log.cc#L266
>
> All of those messages would also show up as "emerg", priority 0, so perhaps
> line 266 would want LOG_USER|LOG_DEBUG?

Sure Peter, I'm still reviewing the code and behavior but we will certainly try to catch all instances.

Peter, if I create a test package that includes a patch for this, would you want to test it out? What version would you want it based on? 0.94.1-13.el7cp? Thanks, Brad.

Yes, I would like to test that package if you create it, and 0.94.1-13.el7cp would be fine.
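To make the "priority | facility" point above concrete, here is a minimal sketch in C that decodes the first argument of syslog() with the LOG_PRI and LOG_FAC macros from <syslog.h>. It assumes a glibc-style header, where those macros exist, LOG_EMERG is 0, and the facility occupies the bits above the low three (the full facility mask there is wider than three bits). The helper names are hypothetical, and the level-to-severity thresholds in severity_for_level() are illustrative assumptions only, not Ceph's actual policy.

```c
#include <assert.h>
#include <stdio.h>
#include <syslog.h>

/* Severity part of a syslog() first argument: the low three bits. */
int severity_of(int priority)
{
    return LOG_PRI(priority);
}

/* Facility number of a syslog() first argument: the bits above the severity. */
int facility_of(int priority)
{
    return LOG_FAC(priority);
}

/*
 * Hypothetical mapping from an application log level to a syslog severity.
 * The thresholds are illustrative assumptions, not taken from Ceph.
 */
int severity_for_level(int level)
{
    if (level < 0)
        return LOG_ERR;
    if (level == 0)
        return LOG_WARNING;
    if (level <= 5)
        return LOG_INFO;
    return LOG_DEBUG;
}

/* Combine a facility with a severity, rejecting out-of-range severities. */
int make_priority(int facility, int severity)
{
    assert(severity >= LOG_EMERG && severity <= LOG_DEBUG);
    return facility | severity;
}

/* Print how syslog() would interpret a given first argument. */
void describe(const char *label, int priority)
{
    printf("%-22s facility=%d severity=%d%s\n", label,
           facility_of(priority), severity_of(priority),
           severity_of(priority) == LOG_EMERG ? " (emerg)" : "");
}
```

Calling describe("LOG_USER", LOG_USER) and describe("LOG_USER|LOG_DEBUG", LOG_USER | LOG_DEBUG) from a main() should show the same facility in both cases, with only the first decoding to the emerg severity. That matches the behavior reported in this bug: passing a bare facility leaves the severity bits at zero.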
Sorry about the delay here, I was traveling last week. I should have this built and tested tomorrow.

Created attachment 1092042
Patch to address syslog priority issue

Once I have your feedback I'll get a tracker opened upstream and submit my patch

Did you manage to test with the package I created Peter?

Sorry Brad, I have not had a chance to test the changes yet. Soon, hopefully, probably by the end of the second week of December, if all goes well.

Okay, no problem thanks.

Created http://tracker.ceph.com/issues/13993 and submitted a PR for this to save time.

The timing is sticky because upstream's test lab is still down, so upstream hasn't been able to verify the fix and merge it.

(In reply to Ken Dreyer (Red Hat) from comment #18)
> The timing is sticky because upstream's test lab is still down, so upstream
> hasn't been able to verify the fix and merge it.

Not urgent at the moment Ken.

Could I get a set of bits deployable on Fedora 23 for testing?

(In reply to Peter Portante from comment #20)
> Could I get a set of bits deployable on Fedora 23 for testing?

This bug is for RHCS, we don't ship RHCS for Fedora 23 so we can't provide you with RHCS bits for F23?

Right, but there's an upstream bug/fix/package production process as well...

(In reply to Dan Mick from comment #22)
> Right, but there's an upstream bug/fix/package production process as well...

Dan is right of course. The most appropriate place for a discussion about Fedora binaries is a Fedora bug, I can build a test package for F23 but I don't believe this is the appropriate place to discuss it, nor provide it, and doing so does not progress this Bugzilla at all.

Can we verify upstream first then? Did this fix make it all the way out to master yet? If so, can I just install that on F23 first? "ceph-deploy --dev master"?

If it is fixed there, then I can build a RHEL 7.2 box and try out that RHCS version explicitly after.

Does that sound reasonable?

(In reply to Peter Portante from comment #24)
> Can we verify upstream first then? Did this fix make it all the way out to
> master yet? If so, can I just install that on F23 first? "ceph-deploy
> --dev master"?
Upstream tracker is http://tracker.ceph.com/issues/13993 and the PR https://github.com/ceph/ceph/pull/6815 was merged into Jewel. There are PRs for backports to Hammer and Infernalis but they are not merged yet, possibly due to the season.

$ git show 0a4b7ab20c2979f1de97ac4b0d8bc5a78c5bce16
commit 0a4b7ab20c2979f1de97ac4b0d8bc5a78c5bce16
Merge: d7581cd 8e93f3f
Author: Sage Weil <sage>
Date: Sat Dec 19 13:58:37 2015 -0500

    Merge pull request #6815 from badone/wip-13993

    common: log: Assign LOG_DEBUG priority to syslog calls

    Reviewed-by: Sage Weil <sage>

$ git branch -r --contains 0a4b7ab20c2979f1de97ac4b0d8bc5a78c5bce16
  origin/HEAD -> origin/master
  origin/jewel
  origin/master
  origin/wip-cmake-reorg
  origin/wip-cmake-rocksdb
  origin/wip-doc-style
  origin/wip-rgw-new-multisite
  origin/wip-sage-testing

$ git branch
* master

$ git blame src/log/Log.cc|grep syslog\(
8e93f3f4 (Brad Hubbard 2015-12-07 11:31:28 +1000 259) syslog(LOG_USER|LOG_DEBUG, "%s", buf);
8e93f3f4 (Brad Hubbard 2015-12-07 11:31:28 +1000 287) syslog(LOG_USER|LOG_DEBUG, "%s", s);

> If it is fixed there, then I can build a RHEL 7.2 box and try out that RHCS
> version explicitly after.
>
> Does that sound reasonable?

Sure. Let me know via the upstream tracker if you need help with the upstream part.

Okay, I was able to hack and slash my way through an F22 install of upstream master to see that the logging fix worked there. So I can build a RHEL 7.2 box to try out RHCS packages. I'll do that now, and then if you could post the proper install instructions to try this fix out there, that would be great.

Sorry to be a bother, but for #11, it seems to assume I already have RHCS installed.

I'd like to use something very simple and straightforward to get an RHCS install up and running, like https://www.berrange.com/posts/2015/12/21/ceph-single-node-deployment-on-fedora-23/

Can I use ceph-deploy to install from your provided packages with the fix?
(In reply to Peter Portante from comment #28)
> Sorry to be a bother, but for #11, it seems to assume I already have RHCS
> installed.
>
> I'd like to use something very simple and straight-forward to get an RHCS
> install up and running, like
> https://www.berrange.com/posts/2015/12/21/ceph-single-node-deployment-on-fedora-23/
>
> Can I use ceph-deploy to install from your provided packages with the fix?

Just run the yum install command on each machine you want in your cluster and skip the "ceph-deploy install" step.

Verified. I do not see any logging at emerg level with these installed. Thanks!

dev-ack'ing since it's merged upstream; let's get this patch into RHCS 1.3.2 if we can.

(In reply to Ken Dreyer (Red Hat) from comment #32)
> dev-ack'ing since it's merged upstream; let's get this patch into RHCS 1.3.2
> if we can.

Unfortunately this missed the 1.3.2 cut-off :(

*** Bug 1312144 has been marked as a duplicate of this bug. ***

Can we get a hotfix for this? Thanks

This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

This issue appears to be resolved as we are not seeing any "emerg" message in syslog. Marking this BUG as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0721.html