Bug 1305744

Summary: Package data imported with incorrect character encoding
Product: Red Hat Satellite 5 Reporter: Kamudini Gazdikova <kshirsal>
Component: API Assignee: Gennadii Altukhov <galtukho>
Status: CLOSED CURRENTRELEASE QA Contact: Pavel Studeník <pstudeni>
Severity: medium Docs Contact:
Priority: medium    
Version: 570 CC: ahumbe, dyordano, galtukho, ogajduse, pstudeni, sean, tlestach
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: spacewalk-utils-2.5.1-20-sat Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1437011 Environment:
Last Closed: 2017-06-21 12:15:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1385726    
Bug Blocks: 1358815    

Description Kamudini Gazdikova 2016-02-09 06:03:18 UTC
Description of problem:

Reproduced on Satellite 5.7: spacewalk-api returns the UTF-8 character ® in literal escaped form, \x{c2}\x{ae}, instead of the character itself.

$ spacewalk-api --user=admin --password=password --server=satellite.example.com packages.getDetails "%session%" 4016
$result = {
            'arch_label' => 'x86_64',
            'build_host' => 'x86-012.build.bos.redhat.com',
            'file' => 'checkpolicy-2.0.22-1.el6.x86_64.rpm',
            'build_date' => '2010-06-16 12:18:16.0',
            'size' => '194260',
            'checksum' => 'fe3f25f8afee914c1cf7b191799238a082d4a0cc2ff2d095e528d315efdafc3d',
            'summary' => 'SELinux policy compiler
',
            'id' => '4016',
            'providing_channels' => [
                                    'rhel-x86_64-server-6'
                                  ],
            'epoch' => '',
            'checksum_type' => 'sha256',
            'version' => '2.0.22',
            'cookie' => 'None',
            'name' => 'checkpolicy',
            'path' => 'redhat/NULL/fe3/checkpolicy/2.0.22-1.el6/x86_64/fe3f25f8afee914c1cf7b191799238a082d4a0cc2ff2d095e528d315efdafc3d/checkpolicy-2.0.22-1.el6.x86_64.rpm',
            'release' => '1.el6',
            'description' => "Security-enhanced Linux is a feature of the Linux\x{c2}\x{ae} kernel and a number
of utilities with enhanced security functionality designed to add
mandatory access controls to Linux.  The Security-enhanced Linux
kernel contains new architectural components originally developed to
improve the security of the Flask operating system. These
architectural components provide general support for the enforcement
of many kinds of mandatory access control policies, including those
based on the concepts of Type Enforcement\x{c2}\x{ae}, Role-based Access
Control, and Multi-level Security.

This package contains checkpolicy, the SELinux policy compiler.
Only required for building policies.
",
            'license' => 'GPLv2',
            'payload_size' => '871188',
            'last_modified_date' => '2010-08-25 01:11:03.0',
            'vendor' => 'Red Hat, Inc.'
          };
$ locale
LANG=en_AU.utf8
LC_CTYPE="en_AU.utf8"
LC_NUMERIC="en_AU.utf8"
LC_TIME="en_AU.utf8"
LC_COLLATE="en_AU.utf8"
LC_MONETARY="en_AU.utf8"
LC_MESSAGES="en_AU.utf8"
LC_PAPER="en_AU.utf8"
LC_NAME="en_AU.utf8"
LC_ADDRESS="en_AU.utf8"
LC_TELEPHONE="en_AU.utf8"
LC_MEASUREMENT="en_AU.utf8"
LC_IDENTIFICATION="en_AU.utf8"
LC_ALL=
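
The pair \x{c2}\x{ae} in the output above is the two-byte UTF-8 encoding of the registered sign ® (U+00AE). A minimal Python sketch of that relationship follows. It also shows what the same bytes look like when misread as Latin-1, which is only an assumption about one way such text can get garbled, not a confirmed diagnosis of this bug. The sample string is a shortened piece of the description.

```python
# The two bytes that appear as \x{c2}\x{ae} in the dump.
raw = b"Linux\xc2\xae kernel"

# Decoded as UTF-8, the byte pair collapses into one character, U+00AE.
as_utf8 = raw.decode("utf-8")

# Decoded as Latin-1, each byte becomes its own character (mojibake).
as_latin1 = raw.decode("latin-1")

# Print lengths and code points only, so the output is ASCII-safe.
print(len(as_utf8), hex(ord(as_utf8[5])))                     # 13 0xae
print(len(as_latin1), [hex(ord(c)) for c in as_latin1[5:7]])  # 14 ['0xc2', '0xae']
```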

Version-Release number of selected component (if applicable):
spacewalk-utils-2.3.2-23.el6sat.noarch 
spacewalk-java-2.3.8-123.el6sat.noarch

How reproducible:
Every time

Actual results:
To correct the discrepancy, one has to identify and fix every package summary, description, and copyright that is stored with a corrupted encoding.
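
Identifying affected text could be scripted. The sketch below assumes the corruption is the common "UTF-8 bytes decoded as Latin-1" kind, which this report does not confirm. The function name, the sample rows, and the package id 4017 are hypothetical. Legitimate text that happens to contain such character pairs would be flagged as a false positive.

```python
def repair_mojibake(text):
    """Return the repaired text, or None if `text` does not look double-encoded.

    Assumption: corruption means UTF-8 bytes were decoded as Latin-1.
    """
    try:
        candidate = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable as Latin-1, or the bytes are not valid UTF-8.
        return None
    return candidate if candidate != text else None


# Hypothetical stored descriptions keyed by package id.
descriptions = {
    4016: "a feature of the Linux\u00c2\u00ae kernel",  # corrupted sample
    4017: "a feature of the Linux\u00ae kernel",        # clean sample
}

for pkg_id, text in descriptions.items():
    fixed = repair_mojibake(text)
    print(pkg_id, "corrupted" if fixed is not None else "ok")
```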

Expected results:
Package data should be imported and stored in the PostgreSQL database with the correct encoding.

Comment 2 Ondrej Gajdusek 2016-09-26 16:17:51 UTC
This happens only if the content is synchronized via satellite-sync from RHN.

Steps I took:
1)
Created a channel and imported into it packages with a corrupted description from a channel synced from RHN. Exported the channel via rhn-satellite-exporter (dump). Reimported it via satellite-sync; the description is still corrupted.
2)
Another way: downloaded packages from RHN, checked that the description is OK (rpm -qip), and imported them into the server via rhnpush. Checked again; the description is OK. Then I repeated the process from 1): export and import using rhn-satellite-exporter and satsync. After that, the description is still OK.

checksum

I found that the description is already corrupted at the moment satsync imports the package metadata. Based on this and step 2), I reckon that satellite-sync already receives corrupted data on its input. I assume that this is not a Satellite bug.

Comment 3 Ondrej Gajdusek 2016-09-26 16:20:37 UTC
I forgot to add info about checksums. 
The checksum is the same in both cases, for the corrupted and the non-corrupted description.
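
One plausible reading of the identical checksums is that the checksum covers the bytes of the RPM file, while the description is stored separately as text metadata, so a bad decode of the description would leave the checksum unchanged. The sketch below illustrates that idea under this assumption. It uses SHA-256 (the checksum_type shown in the description above) and a stand-in byte string rather than a real package.

```python
import hashlib

# Stand-in for the bytes of an RPM file; a real check would read the file.
description_bytes = b"Linux\xc2\xae kernel"
package_bytes = b"\xed\xab\xee\xdb" + description_bytes

# The digest depends only on the file's bytes.
digest = hashlib.sha256(package_bytes).hexdigest()

# Two ways the same description bytes might end up stored as text.
stored_ok = description_bytes.decode("utf-8")
stored_bad = description_bytes.decode("latin-1")

# The stored text differs, but the digest is computed from the untouched
# file bytes, so it is identical in both cases.
print(stored_ok != stored_bad)                              # True
print(digest == hashlib.sha256(package_bytes).hexdigest())  # True
```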

Comment 4 Gennadii Altukhov 2017-03-29 09:27:10 UTC
This bug covers only the problem with UTF-8 encoding in the spacewalk-api tool. For the wrong encoding of RPM data imported by satellite-sync, I created BZ 1437011.

Comment 5 Gennadii Altukhov 2017-03-29 10:45:20 UTC
The bug is fixed upstream. spacewalk.git:
e18d4f3c8bf744b46eb84a13634df869721ef1ed

Comment 10 Pavel Studeník 2017-05-22 15:55:08 UTC
Reproducer:

*) find the package checkpolicy
 - open the package in the web UI and get its id (from the URL)
 - query the package by that id from the command line
>> spacewalk-api --user=admin --password=secure --server=localhost packages.getDetails %session% <id>
...
 'path' => 'redhat/NULL/fe3/checkpolicy/2.0.22-1.el6/x86_64/fe3f25f8afee914c1cf7b191799238a082d4a0cc2ff2d095e528d315efdafc3d/checkpolicy-2.0.22-1.el6.x86_64.rpm',
            'description' => 'Security-enhanced Linux is a feature of the Linux® kernel and a number
of utilities with enhanced security functionality designed to add
...

Verified with spacewalk-utils-2.5.1-23.el6sat.noarch
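
The same check could be scripted against the XML-RPC API instead of the spacewalk-api wrapper. This is a rough sketch only: it assumes the API is reachable at /rpc/api on the server and uses the auth.login, packages.getDetails, and auth.logout calls. The helper name is hypothetical and not part of any Satellite tooling.

```python
import xmlrpc.client


def description_is_decoded(client, user, password, package_id):
    """Fetch the package details and report whether the description holds a
    properly decoded U+00AE rather than the two-character mojibake form."""
    session = client.auth.login(user, password)
    try:
        details = client.packages.getDetails(session, package_id)
    finally:
        # Always release the session, even if the lookup fails.
        client.auth.logout(session)
    text = details["description"]
    return "\u00ae" in text and "\u00c2\u00ae" not in text


# Usage (needs a reachable server, so it is left commented out):
# client = xmlrpc.client.ServerProxy("https://satellite.example.com/rpc/api")
# print(description_is_decoded(client, "admin", "password", 4016))
```

The helper takes the client as a parameter so that it can be exercised with a stub object when no server is available.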