| Summary: | journal data gets truncated at the 65530th(ish) character | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | mulhern <amulhern> |
| Component: | systemd | Assignee: | systemd-maint |
| Status: | CLOSED NOTABUG | QA Contact: | qe-baseos-daemons |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.2 | CC: | jsynacek, systemd-maint-list |
| Target Milestone: | rc | Keywords: | Patch |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-11-29 11:28:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | |||
| Bug Blocks: | 1383699 | ||
Nice catch. After some debugging, I found out that this bug has nothing to do with the python bindings, as you have already written. Funnily enough, this bug got "fixed" by https://github.com/systemd/systemd/commit/1a2a0ac53bbe724caf41c2bcf22ab9ea32b63d4e. In my opinion, the real bug still stayed there even after the commit. I haven't verified that, though, but that should be trivial to do. Here's the reproducer: 1) Use this script to generate the data: $ cat test.py from systemd import journal # Generate 80k chars. msg = "hello!!!" * 10000 journal.send(msg) 2) Compile the following program: $ cat test.c #include <assert.h> #include <stdio.h> #include <systemd/sd-journal.h> /* Compile: gcc test.c -lsystemd -o test */ int main() { sd_journal *j; sd_id128_t boot_id; const void *data; size_t len; int r; r = sd_journal_open(&j, SD_JOURNAL_LOCAL_ONLY); assert(r >= 0); r = sd_id128_get_boot(&boot_id); assert(r >= 0); r = sd_journal_add_match(j, "_COMM=python", 0); assert(r >= 0); SD_JOURNAL_FOREACH(j) { sd_journal_get_data(j, "MESSAGE", &data, &len); if (len > 60000) printf("len:%d\n", len); } return 0; } 3) Run this reproducer script (WARNING: It will wipe the existing journal files!): $ cat reproduce.sh #!/bin/bash sudo systemctl daemon-reload sudo rm -rf /var/log/journal/* sudo systemctl restart systemd-journald python test.py sleep 0.5 ./test The reproducer could probably be even simpler, but this one works all the time. In my testing, I found out that to reproduce this problem, it is necessary to first write the 80k-char data in the journal and then read it. Only writing it once and then reading it with different builds of systemd didn't work. Also, even when the data is truncated to 65535 bytes, it can all be accessed from the journal using 'journalctl -o verbose', which is kind of weird. To sum up, in my opinion, we should definitely backport the patch mentioned above, because there seem to be some inconsistencies between the journal data and how its presented using the journal API and that, I think, could lead to very hard-to-find problems. (In reply to Jan Synacek from comment #2) > In my opinion, the real bug still stayed there even after the commit. And yes, it's still reproducible with the latest upstream master. https://github.com/systemd/systemd/issues/4757 ... and it's not a bug, see the upstream bug. |
Description of problem: It is possible to store messages with fields with values that exceed 65530 characters in length. It is possible to inspect the journal using journalctl and see all 65530+ characters in the field. But it is not possible to use python-systemd's [] operator for log entries (these are essentially dicts), to get all these characters, one only gets the first 65530 characters. Version-Release number of selected component (if applicable): systemd 219 How reproducible: Without fail. Steps to Reproduce: 1. Write an entry into the log with a field that contains more than 65530 chars. 2. Inspect the entry using journalctl, it is all there, to the last char. 3. Inspect the entry using systemd-journald, with code like: entry = journal_handle.get_previous() while entry: if graph_key in entry: return pydevDAG.StringUtils.from_string( entry[graph_key], pydevDAG.Reader.read ) entry = journal_handle.get_previous() return None Notice that using the [] operator to extract the field with more than 65530 chars from the entry gives but the first 65530 characters. Actual results: See steps to reproduce. Expected results: Get the entire contents of the field. Additional info: This may actually be a bug in systemd's own sd_journal_enumerate_data function since this function calculates the size of the field, and it is this size that the code in python-systemd seems to be relying on for the expected size of the field in Reader_get_all.