Bug 1974658
Summary: Python zlib test fails on z15
Product: Red Hat Enterprise Linux 8
Component: zlib
Status: CLOSED NOTABUG
Severity: unspecified
Priority: unspecified
Version: ---
Target Milestone: beta
Target Release: ---
Hardware: Unspecified
OS: Linux
Reporter: Honza Horak <hhorak>
Assignee: Ondrej Dubaj <odubaj>
QA Contact: RHEL CS Apps Subsystem QE <rhel-cs-apps-subsystem-qe>
CC: bugproxy, databases-maint, dhorak, fweimer, jomiller, mhroncok, odubaj, panovotn, pkubat, praiskup, tstaudt, vstinner
Doc Type: If docs needed, set a value
Last Closed: 2021-07-01 14:40:39 UTC
Type: Bug
Bug Blocks: 1916117
Description (Honza Horak, 2021-06-22 09:11:35 UTC)
Created attachment 1792946 [details]
Minimal reproducer (taken from the python unit test)
Another reproducer (a more minimal one, extracted from the Python test suite):
On z13:
[root@ibm-z-117 ~]# /usr/libexec/platform-python 210622_zlib_reproducer
b'x\x9c\xed\xd5\xc1\x92\xdb\xc6\x11\x00'
b'x\x9c\xed\xd5\xc1\x92\xdb\xc6\x11\x00'
b'x\x9c\xed\xd5\xc1\x92\xdb\xc6\x11\x00'
b'x\x9c\xed\xd5\xc1\x92\xdb\xc6\x11\x00'
[root@ibm-z-117 ~]# echo $?
0
On z15:
[root@s390x-kvm-030 ~]# /usr/libexec/platform-python 210622_zlib_reproducer
b'x\x9c\xe2\xf2qt\r\nq\r'
b'x\x9c\xe3\xf2qt\r\nq\r'
Traceback (most recent call last):
File "test.sh", line 77, in <module>
assert (x1 + x2) == datazip
AssertionError
[root@s390x-kvm-030 ~]# echo $?
1
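For readers without the attachment, the reproducer boils down to comparing one-shot and streamed compression of the same payload. This is a hedged sketch: the payload below is a stand-in, while the real attachment uses data extracted from Python's test suite.

```python
import zlib

# Stand-in payload; the real reproducer uses data taken from
# Python's test suite.
datasrc = b"LAExample payload line\n" * 10000

# One-shot compression.
datazip = zlib.compress(datasrc)
print(datazip[:10])

# Streamed compression of the same data with the same default settings.
co = zlib.compressobj()
x1 = co.compress(datasrc)
x2 = co.flush()
print((x1 + x2)[:10])

# With the software zlib, both paths produce identical bytes; on z15
# with the hardware deflate accelerator, this assertion fails because
# the streamed output is split into differently typed DEFLATE blocks.
assert (x1 + x2) == datazip
```

Either way, both outputs decompress back to the original data; only the layout of the compressed bytes differs.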
Thomas, as you're mentioned as a contact in BZ#1847438, can you please confirm whether this is expected behavior?

---

(In reply to Honza Horak from comment #2)
> Thomas, as you're mentioned as a contact in BZ#1847438, can you, please,
> confirm, whether it is expected behavior?

Hi Honza, sure, I'll mirror it to IBM and we'll take a look.

---

Using the attached script, I wrote the output of datazip into a file "A" and x1+x2 into a file "B", and analyzed the binary content with my tool Hachoir (a binary parser written in Python).

(A) Python zlib.compress(datasrc) creates a single "compressed_block" of 319.9 Kbit with final=True. File size: 40000 bytes.

(B) Python "co = zlib.compressobj(); x1 = co.compress(data); x2 = co.flush()" creates 3 "compressed_blocks":

* block 0: 262.2 Kbit (final=False)
* block 1: 42.9 Kbit (final=False)
* block 2: 10 bits (final=True)

File size: 38140 bytes, smaller than the (A) file!

I can understand that case (B) produces multiple compressed blocks: since it's a stream, zlib doesn't know in advance whether the write is done, so it emits a trailing "final" block of length 0 (total block size: 10 bits). But I don't understand why case (B) produces two compressed blocks (262.2 Kbit and 42.9 Kbit) to compress datasrc (245 888 bytes). Does the IBM hardware accelerator implementation have a limit on the maximum compressed block size?

---

hachoir-urwid output for datazip (file "A"):

0) file:a: ZLIB Data (39.1 KB)
  0.0) compression_method= deflate (4 bits)
  0.4) compression_info= 7: base-2 log of the window size (4 bits)
  1.0) flag_check_bits= 28 (5 bits)
  1.5) flag_dictionary_present= False (1 bits)
  1.6) flag_compression_level= Default (2 bits)
  2) data: Compressed Data (39.1 KB)
    0.0) compressed_block[0] (319.9 Kbit)
      0.0) final= True: Is this the final block? (1 bits)
      0.1) compression_type= Fixed Huffman (2 bits)
      0.3) length_code[0]= 58: Literal Code '\n' (Huffman Code 58) (1 bytes)
      1.3) length_code[1]= 124: Literal Code 'L' (Huffman Code 124) (1 bytes)
      (...)
      39993.2) padding[0]= 0: Padding (6 bits)
  39996) data_checksum= 0x9519fed7: ADLER32 checksum of compressed data (4 bytes)

hachoir-urwid output for x1+x2 (file "B"):

0) file:b: ZLIB Data (37.2 KB)
  0.0) compression_method= deflate (4 bits)
  0.4) compression_info= 7: base-2 log of the window size (4 bits)
  1.0) flag_check_bits= 28 (5 bits)
  1.5) flag_dictionary_present= False (1 bits)
  1.6) flag_compression_level= Default (2 bits)
  2) data: Compressed Data (37.2 KB)
    0.0) compressed_block[0] (262.2 Kbit)
      0.0) final= False: Is this the final block? (1 bits)
      0.1) compression_type= Fixed Huffman (2 bits)
      0.3) length_code[0]= 58: Literal Code '\n' (Huffman Code 58) (1 bytes)
      1.3) length_code[1]= 124: Literal Code 'L' (Huffman Code 124) (1 bytes)
      2.3) length_code[2]= 113: Literal Code 'A' (Huffman Code 113) (1 bytes)
      3.3) length_code[3]= 117: Literal Code 'E' (Huffman Code 117) (1 bytes)
      (...)
    32769.5) compressed_block[1] (42.9 Kbit)
      0.0) final= False: Is this the final block? (1 bits)
      0.1) compression_type= Dynamic Huffman (2 bits)
      0.3) huff_num_length_codes= 29: Number of Literal/Length Codes, minus 257 (5 bits)
      1.0) huff_num_distance_codes= 29: Number of Distance Codes, minus 1 (5 bits)
      1.5) huff_num_code_length_codes= 15: Number of Code Length Codes, minus 4 (4 bits)
      2.1) huff_code_length_code[16]= 6: Code lengths for the code length alphabet (3 bits)
      2.4) huff_code_length_code[17]= 5: Code lengths for the code length alphabet (3 bits)
      2.7) huff_code_length_code[18]= 6: Code lengths for the code length alphabet (3 bits)
      3.2) huff_code_length_code[0]= 5: Code lengths for the code length alphabet (3 bits)
      (...)
    38132.1) compressed_block[2] (10 bits)
      0.0) final= True: Is this the final block? (1 bits)
      0.1) compression_type= Fixed Huffman (2 bits)
      0.3) length_code[0]= 0: Block Terminator Code (256) (Huffman Code 0) (7 bits)
  38133.3) padding[0]= 0: Padding (5 bits)
  38136) data_checksum= 0x9519fed7: ADLER32 checksum of compressed data (4 bytes)

---

Commands to create the two binary files using the attached bug.py script:

# /usr/libexec/platform-python -i bug.py
(...)
>>> open("a", "wb").write(datazip)
40000
>>> open("bb", "wb").write(x1 + x2)
33572

Commands to install the tool:

dnf install -y git
git clone https://github.com/vstinner/hachoir
/usr/libexec/platform-python -m venv env
env/bin/python -m pip install urwid

Open file A:

env/bin/python /root/hachoir/hachoir-urwid --parser=zlib A

Open file B:

env/bin/python /root/hachoir/hachoir-urwid --parser=zlib B

---

It is *not a bug* in zlib: decompression gives back the original content as expected (see below). The issue is that Python's test_zlib makes assumptions about how "streamed" data is compressed. The test expected that compressing "at once" (zlib.compress) gives the exact same binary output as "stream compression" ("co = zlib.compressobj(); x1 = co.compress(data); x2 = co.flush()"). Maybe zlib could be modified to produce a single compressed block for this specific code, but I don't think that it *has to*. Again, decompression works as expected.

I proposed to only skip the two tests (test_pair, test_speech128) on IBM z15 running Linux if the hardware accelerator is available.

Decompression works as expected (same MD5 checksum):

$ /usr/libexec/platform-python -i bug.py
(...)
>>> open("datasrc", "wb").write(datasrc)
245888
>>> import zlib
>>> C=zlib.decompress(datazip)
>>> open("C", "wb").write(C)
245888
>>> D=zlib.decompress(x1+x2)
>>> open("D", "wb").write(D)
245888
$ md5sum C D datasrc
aa70f8965a19d5ec1ff5e83fad80cc24  C
aa70f8965a19d5ec1ff5e83fad80cc24  D
aa70f8965a19d5ec1ff5e83fad80cc24  datasrc

------- Comment From iii.com 2021-06-29 06:09 EDT -------

Another reason the output differs is the FHT/DHT heuristic.
The zlib deflate algorithm can analyze the data distribution and decide whether it wants to use FHT or DHT for the next block, but the accelerator can't, and looking at the data in software would kill the accelerator's performance. Therefore the following heuristic is used: the first 4k are compressed with FHT and the rest of the data with DHT.

There are two quirks in the current implementation:

- It doesn't kick in if the user passes a huge buffer with Z_FINISH.
- It isn't particularly strict about 4k in general: if the user passes a large buffer, it all goes into an FHT block.

So compress() creates a single FHT block, while compressobj() creates an FHT block, a DHT block, and a trailing block.

---

As mentioned in comment 8, this is not a bug, since zlib does not guarantee a particular compressed data structure (it is also subject to change between versions). So I'm closing this.

---

As I understand it, this issue is not a bug and the behaviour is expected, so in my opinion it is valid to close this issue as NOTABUG. If there is a reason to keep this tracker open, please reopen it.

---

Honza Horak: if the test_zlib failure blocks someone, I can work on a fix in Python upstream and then backport it to RHEL. It's unclear to me who is affected by the test_zlib failure, and whether it blocks any Red Hat team. In case of doubt, I leave the issue closed ;-)
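The proposed skip of test_pair and test_speech128 can be sketched as a unittest decorator along these lines. This is a hypothetical sketch, not the actual upstream change, which may differ (for example, by also probing whether the accelerator is really in use rather than only checking the machine type):

```python
import platform
import unittest
import zlib

# Hypothetical sketch: skip tests that compare exact compressed bytes on
# s390x, where a hardware deflate accelerator may legally produce a
# different (but still valid) DEFLATE block structure.
skip_on_s390x = unittest.skipIf(
    platform.machine() == "s390x",
    "on s390x, the hardware accelerator changes the compressed data layout",
)

class CompressObjTests(unittest.TestCase):
    @skip_on_s390x
    def test_pair(self):
        # Assumes one-shot and streamed compression produce identical
        # bytes, which only holds for the software zlib implementation.
        data = b"some test payload\n" * 200
        co = zlib.compressobj()
        self.assertEqual(co.compress(data) + co.flush(), zlib.compress(data))
```

Note that a plain platform.machine() check also skips s390x machines without the accelerator; probing zlib's behaviour at runtime would be a more precise condition.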