Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1194252

Summary: Remove binary data dumping on uploads in server_calls.log
Product: [Retired] Pulp Reporter: Irina Gulina <igulina>
Component: puppet-supportAssignee: pulp-bugs
Status: CLOSED UPSTREAM QA Contact: pulp-qe-list
Severity: high Docs Contact:
Priority: medium    
Version: unspecifiedCC: bmbouter, cduryee, igulina, rbarlow
Target Milestone: ---Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-02-28 23:25:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Irina Gulina 2015-02-19 12:10:07 UTC
Description of problem:

>> rpm -qa pulp-server
pulp-server-2.6.0-0.7.beta.el6.noarch
>> rpm -qa pulp-puppet*
pulp-puppet-plugins-2.6.0-0.7.beta.el6.noarch
pulp-puppet-consumer-extensions-2.6.0-0.7.beta.el6.noarch
pulp-puppet-admin-extensions-2.6.0-0.7.beta.el6.noarch
pulp-puppet-handlers-2.6.0-0.7.beta.el6.noarch



>> pulp-admin puppet repo create --repo-id "Test_puppet"
>> wget https://forgeapi.puppetlabs.com/v3/files/jasonc-rabbitmq-2.0.0.tar.gz
>> pulp-admin puppet repo uploads upload --repo-id Test_puppet --file ./jasonc-rabbitmq-2.0.0.tar.gz 
>>  wget https://forgeapi.puppetlabs.com/v3/files/stahnma-epel-1.0.2.tar.gz
>> pulp-admin puppet repo uploads upload --repo-id Test_puppet --file ./stahnma-epel-1.0.2.tar.gz 
>> pulp-admin puppet repo publish run --repo-id Test_puppet

Modules were successfully published in: 
>> ls -l /var/lib/pulp/published/puppet/http/repos/Test_puppet/system/releases/
total 8
drwxr-xr-x. 3 apache apache 4096 Feb 19 06:18 j
drwxr-xr-x. 3 apache apache 4096 Feb 19 06:18 s

>> ls -l /var/lib/pulp/published/puppet/http/repos/Test_puppet/system/releases/j/jasonc/
total 4
lrwxrwxrwx. 1 apache apache 64 Feb 19 06:18 jasonc-rabbitmq-2.0.0.tar.gz -> /var/lib/pulp/content/puppet_module/jasonc-rabbitmq-2.0.0.tar.gz

Open ./.pulp/server_calls.log and observe a binary code there:

2015-02-19 06:16:03,506 - INFO - PUT request to /pulp/api/v2/content/uploads/c68fcfd4-db08-440c-bd15-50c6afdaf0c0/0/ with parameters ^_<8b>^H^HÚÃ@Q^@^Cjasonc-rabbitmq-2.0.0.tar^@í}ks^[G<92> ?ׯ¨Àx<83>ä^E^H<82>^DA<8a><92>í8<8a><82>,ÌP$<87>¤ìQllÌ5Ð^E°- ^[Ó^ORX¯÷·_¾êÑ^M<80>ôC§<89>¹^Uì^HI@uUVV¾3+û§¨ÈÒñn^^<8d>FI9ÿÇîA§Ûéî


2015-02-19 06:17:27,238 - INFO - PUT request to /pulp/api/v2/content/uploads/be80a459-d38d-44aa-bd73-8771359b7c4b/0/ with parameters ^_<8b>^H^@^R;<8b>T^@^Cì\{SÛÊ<92>Ïßú^T³<84>ZÈ<89>ßÏ<84>
çÆ^X^C^F^[<8c>mÌ#<9b>J<8d>å±-,i^T=lL.ûÙ·{F~<80>,arrr6u£ª^D[êî<99>éîùMwÏÈ<8e>K<87>¦AãÌbz<<9d>H%2ÉW?úJÁUÌçño&]Ìâßt:<9b>^R÷ýëU:<93>Ëe
Ù|.<93>y<95>Jç<80>þ^UÉÿð<9e>¬¸<Ç¥6!¯^\©<88>P:ªë@Òïÿ<8c>>ýÄË     Úß`.íQ<97>&n^]nþ<88>6ÐÀ<85>\.Ìþ`ù§ö/äó©W$õ#^Z^?îú^

Comment 1 Brian Bouterse 2015-02-19 12:18:11 UTC
The URL being submitted to is content/uploads/<uuid>/ so it makes sense that the payload would be binary data. That is the binary data of the uploaded puppet module. I would expect that same from rpm binary data or any kind of binary content being uploaded.

It looks a little strange, but it seems correct to me. Besides that the data looks strange is there some other correctness problem?

Comment 2 Irina Gulina 2015-02-19 12:30:38 UTC
Can't a log file contain a binary code? and if a pulp-server would dump a binary logs to jounralctl?

Comment 3 Brian Bouterse 2015-02-19 12:42:32 UTC
I think this example is really about binary 'data' not binary 'code'. Those are different terms.

Yes log files can contain binary or non-binary data.

What does pulp-server have to do with this? The log name is server_calls.log, but its really written by the client as it makes calls to the server. The server isn't involved with that log at all.

Also I don't think a proper logger (ie: journalctl or syslog) is involved in writing the server_calls.log file. It's literally just written out to a file as calls occur to the server. This is my understanding. It doesn't matter much anyway since the logger just logs the data as it receives it regardless of if it is binary data or not.

I don't think the question I asked was answered. I'll re-ask:

It looks a little strange, but it seems correct to me. Besides that the data looks strange because its binary data, is there some other correctness problem?

Comment 4 Irina Gulina 2015-02-19 13:07:09 UTC
If you consider this behaviour correct, close this bug, please. Thank you.

Comment 5 Irina Gulina 2015-02-19 14:00:43 UTC
If not possible to filter the binary data, please consider logging it in a terminal- and user-friendly way e.g. in form of base64-encoded chunks

Comment 7 Randy Barlow 2015-02-19 15:16:48 UTC
I agree that logging binary data is not particularly helpful or useful. In fact, the syslog RFCs specify that log data must be presented in UTF-8 or ASCII, so there is a precedent for log data being for human parsing and not for machine parsing.

On an implementation note, it wouldn't be hard to just remove the spots that log binary data (should only be uploads, right?) and replace that data with <binary data>, or something along those lines.

Comment 8 Randy Barlow 2015-02-19 15:17:34 UTC
Also, I don't view this as an RFE. I think it's a defect.

Comment 9 Brian Bouterse 2015-02-19 15:44:52 UTC
Story 145 [0] has a lot to do with this request. Once that is completed then the only way you would experience this is if you did an upload with pulp-admin and had -vv set. I expect the server_calls.log to log only with -vv.

I think of the -vv logger as logging the URL, headers, and request body. Is having the binary data present such a problem that we should deviate from this consistency? Also perhaps seeing the binary data could be useful in some situation.

Ensuring that log data adheres to UTF-8 or ASCII would be good for us to check, but that is a separate issue.

[0]:  https://pulp.plan.io/issues/145

Comment 10 Randy Barlow 2015-02-19 15:51:14 UTC
Hi Brian,

I think Story #145 will make this BZ even more important to fix. If you send binary data into a console it becomes corrupted. You can try this by catting /dev/urandom or a large binary file.

I think the server calls are reasonable to log when they are textual data (and 99% of our calls are).

Comment 11 Brian Bouterse 2015-02-19 16:23:22 UTC
After some reading, I agree that logging binary data is not useful. It's encoded as UTF-8 and can't usefully be compared against the original data. Given that ISOs could be huge files it could be bad if we don't stop server_calls.log from logging binary data.

Comment 12 Brian Bouterse 2015-02-28 23:25:01 UTC
Moved to https://pulp.plan.io/issues/705