Bug 1602156
Summary: | imfile symlink uses source filename in metadata filename instead of given name | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Rich Megginson <rmeggins> | ||||||
Component: | rsyslog | Assignee: | Rich Megginson <rmeggins> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> | ||||||
Severity: | high | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 7.6 | CC: | anli, dapospis, jvymazal, mgrepl, nhosoi, rmeggins | ||||||
Target Milestone: | rc | Keywords: | Triaged | ||||||
Target Release: | --- | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | rsyslog-8.24.0-28.el7 | Doc Type: | No Doc Update | ||||||
Doc Text: |
undefined
|
Story Points: | --- | ||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2018-10-30 10:17:55 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Rich Megginson
2018-07-17 21:54:11 UTC
Noriko, have you seen this in your testing? Created attachment 1459578 [details] patch for https://github.com/rsyslog/rsyslog/pull/2749 This is a patch to apply on top of https://github.com/rsyslog/rsyslog/pull/2749 Created attachment 1459579 [details] patch to apply to current rhel-7.6 dist-git This patch applies to the current rhel-7.6 dist-git code - it is a backport to rhel-7.6 of https://bugzilla.redhat.com/attachment.cgi?id=1459578&action=edit @jvymazal - I could not submit a PR against your github jvymazal:wildcard_symlink so please apply https://bugzilla.redhat.com/attachment.cgi?id=1459578 to your PR (In reply to Rich Megginson from comment #2) > Noriko, have you seen this in your testing? Yes, I do. The same result: { "metadata": { "filename": "\/var\/lib\/docker\/containers\/cbc3840aa3fe025d29b4452e46114045e1f4501d2e621e152eb9cd788ea7db39\/cbc3840aa3fe025d29b4452e46114045e1f4501d2e621e152eb9cd788ea7db39-json.log", ... The filename is the real path although I'm specifying the symlink to watch in the config: input(type="imfile" File="/var/log/containers/loader-*.log" Tag="docker" ruleset="docker_logs") hold on the patches - with the new code I'm getting crashes at shutdown - valgrind reports this, a double free: 123 ==13777== Invalid read of size 8 124 ==13777== at 0x8604542: act_obj_destroy (imfile.c:958) 125 ==13777== by 0x860482C: act_obj_destroy_all (imfile.c:1021) 126 ==13777== by 0x8604964: fs_node_destroy (imfile.c:1073) 127 ==13777== by 0x8604940: fs_node_destroy (imfile.c:1070) 128 ==13777== by 0x86080CB: freeCnf (imfile.c:2022) 129 ==13777== by 0x14EE25: freeCnf (rsconf.c:220) 130 ==13777== by 0x14EFFE: rsconfDestruct (rsconf.c:232) 131 ==13777== by 0x121A33: deinitAll (rsyslogd.c:1987) 132 ==13777== by 0x121C93: main (rsyslogd.c:2083) 133 ==13777== Address 0x7b86638 is 24 bytes inside a block of size 136 free'd 134 ==13777== at 0x4C2FDAC: free (vg_replace_malloc.c:530) 135 ==13777== by 0x8604789: act_obj_destroy (imfile.c:1005) 136 ==13777== by 0x860482C: act_obj_destroy_all (imfile.c:1021) 137 ==13777== by 0x8604964: fs_node_destroy (imfile.c:1073) 138 ==13777== by 0x8604940: fs_node_destroy (imfile.c:1070) 139 ==13777== by 0x86080CB: freeCnf (imfile.c:2022) 140 ==13777== by 0x14EE25: freeCnf (rsconf.c:220) 141 ==13777== by 0x14EFFE: rsconfDestruct (rsconf.c:232) 142 ==13777== by 0x121A33: deinitAll (rsyslogd.c:1987) 143 ==13777== by 0x121C93: main (rsyslogd.c:2083) 144 ==13777== Block was alloc'd at 145 ==13777== at 0x4C30B06: calloc (vg_replace_malloc.c:711) 146 ==13777== by 0x8603738: act_obj_add (imfile.c:705) 147 ==13777== by 0x8604328: poll_tree (imfile.c:911) 148 ==13777== by 0x8604A4B: fs_node_walk (imfile.c:1091) 149 ==13777== by 0x86088E5: in_processEvent (imfile.c:2204) 150 ==13777== by 0x8608BB3: do_inotify (imfile.c:2277) 151 ==13777== by 0x8608CF2: runInput (imfile.c:2426) 152 ==13777== by 0x1A1E1B: thrdStarter (threads.c:223) 153 ==13777== by 0x5059593: start_thread (in /usr/lib64/libpthread-2.27.so) 154 ==13777== by 0x624202E: clone (in /usr/lib64/libc-2.27.so) Comment on attachment 1459579 [details]
patch to apply to current rhel-7.6 dist-git
Reviewed and tested.
Hmmm, my test did not show the double free... :p Rich, is your crash observed in rsyslog-8.36 (rhlog-1.0-rhel-7) or 8.24 (rhel-7.6)? I tested with rhel-7.6 (https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17226456). Thanks you both for testing the new build, I was not aware of the requirement to list target file name anywhere, I will take a patch and add it to the next build and also update upstream PR with it. As for the possible double free I think I found it, there was one extra dereference which might cause it in some cases. Just one more question, I just properly re-red all the examples and affected code, so If I understood it correctly you want to list *symlink* name in the metadata, and not the actual file name from which data are coming from? If yes then this definitely needs some broader discussion because that is a major change of current behavior and goes directly against documentation of imfile which clearly states that: "Name of the file where the message originated from. This is most useful when using wildcards inside file monitors, because it then is the only way to know which file the message originated from. The value can be accessed using the %$!metadata!filename% property." (In reply to Jiří Vymazal from comment #12) > Just one more question, I just properly re-red all the examples and affected > code, so If I understood it correctly you want to list *symlink* name in the > metadata, and not the actual file name from which data are coming from? Correct. That is the way it works currently. > > If yes then this definitely needs some broader discussion because that is a > major change of current behavior and goes directly against documentation of > imfile which clearly states that: "Name of the file where the message > originated from. This is most useful when using wildcards inside file > monitors, because it then is the only way to know which file the message > originated from. The value can be accessed using the %$!metadata!filename% > property." I don't understand. Before the imfile wildcard symlink patch, imfile works fine with symlinks. That is, with https://bugzilla.redhat.com/show_bug.cgi?id=1602156#c0 I get the expected result: $!:{ "metadata": { "filename": "\/var\/log\/containers\/logging-kibana-1-ctwqh_openshift-logging_kibana-1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79.log", "fileoffset": "5955682" } } I know this for a fact because I relied on this developing mmkubernetes. mmkubernetes _will not work_ if $!metadata!filename _does not contain_ the name of the symlink. (In reply to Noriko Hosoi from comment #10) > Rich, is your crash observed in rsyslog-8.36 (rhlog-1.0-rhel-7) or 8.24 > (rhel-7.6)? I tested with rhel-7.6 > (https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17226456). crash is observed with latest upstream master 8.36 code. If I am gathering logs from /var/log/containers/*.log, and I configure imfile like this: input(type="imfile" file="/var/log/containers/*.log" tag="kubernetes" addmetadata="on") Doesn't this mean I want $!metadata!filename to contain filenames matching /var/log/containers/*.log? It seems very counter-intuitive that $!metadata!filename would contain filenames in /var/lib/docker/containers/uuid/uuid-json.log @jvymazal - looks like you decided to go in a different direction in your latest commit to https://github.com/jvymazal/rsyslog/tree/wildcard_symlink I have a complete implementation that works for the first step mentioned at https://github.com/rsyslog/rsyslog/pull/2749#issuecomment-406024642 my implementation is here: https://github.com/richm/rsyslog/commit/d9a2e869780cdcc7068b8160485f50309e29a86e basically, for every act_obj_s object, keep track of the link_source if the file has a link to it, and report the link_source as the $!metadata!filename. For example, if we have A -> B -> C The act_obj for C has a link_source of A since it is the "top" link, and $!metadata!filename will list A (if imfile is configured with file="A"). One wrinkle is that I'm using readlink without lstat - I'm using a buffer size of PATH_MAX+1, and truncating the buffer to the returned length. I don't know if it is possible for the returned buffer size to be greater than PATH_MAX - the man pages are unclear and I cannot find the source code for the implementation of readlink(2). If so, then the returned buffer will need to be truncated at the max of the returned length or PATH_MAX. The implementation in the man page using lstat seems needlessly complex when readlink will return -1 and EINVAL if the given path is not a symlink. There is an alternate way to call readlink, listed in the glibc filesystem manual: char * readlink_malloc (const char *filename) { int size = 100; char *buffer = NULL; while (1) { buffer = (char *) xrealloc (buffer, size); int nchars = readlink (filename, buffer, size); if (nchars < 0) { free (buffer); return NULL; } if (nchars < size) return buffer; size *= 2; } } W(In reply to Rich Megginson from comment #15) > If I am gathering logs from /var/log/containers/*.log, and I configure > imfile like this: > > input(type="imfile" file="/var/log/containers/*.log" tag="kubernetes" > addmetadata="on") > > Doesn't this mean I want $!metadata!filename to contain filenames matching > /var/log/containers/*.log? It seems very counter-intuitive that > $!metadata!filename would contain filenames in > /var/lib/docker/containers/uuid/uuid-json.log Well, my first though would be that filename should be actual file name from which the data came from (which is also what is in the doc and why I have originally written in that way). On the other hand one may also want to know why the data came from where it did. I will edit the PR to preserve current behavior (have not tested the code without my changes so I did not know it was that way). (In reply to Rich Megginson from comment #17) > There is an alternate way to call readlink, listed in the glibc filesystem > manual: > > char * > readlink_malloc (const char *filename) > { > int size = 100; > char *buffer = NULL; > > while (1) > { > buffer = (char *) xrealloc (buffer, size); > int nchars = readlink (filename, buffer, size); > if (nchars < 0) > { > free (buffer); > return NULL; > } > if (nchars < size) > return buffer; > size *= 2; > } > } Thanks for note, will use it, it will be more clear what is it doing this way. The feature rsyslog-mmkubernetes-8.24.0-33.el7.x86_64 works well when Openshift use docker json driver. The imfile symlink work as expected. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:3135 |