Bug 2106979 - Python ctypes regression causing rawhide CI failures
Summary: Python ctypes regression causing rawhide CI failures
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: python3.11
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Python Maintainers
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: PYTHON3.11 2098773
 
Reported: 2022-07-14 03:57 UTC by Nathan Scott
Modified: 2022-07-18 00:02 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-07-17 23:17:38 UTC
Type: Bug
Embargoed:


Attachments
Reproducible test case (2.64 KB, text/plain)
2022-07-14 03:57 UTC, Nathan Scott


Links
System ID Private Priority Status Summary Last Updated
Github performancecopilot pcp pull 1629 0 None open python api: correct refcounting on pmParseMetricSpec source buffer 2022-07-18 00:00:00 UTC

Description Nathan Scott 2022-07-14 03:57:08 UTC
Created attachment 1896936 [details]
Reproducible test case

Recent updates to python in rawhide have resulted in CI failures in the PCP package, in the libpcp ctypes-based python wrapper module.

I've reduced it down to a minimal reproducer (attached).  This fails every time on rawhide, but this code has worked for the past ~8 years or so (i.e. through many iterations of python2 and all of python3) - having stared at it for several hours now, I'm fairly confident the PCP side of things is correct and this is an issue in python itself.

The core of the problem involves the 'source' field of a ctypes-defined structure mapping to a libpcp C structure (pmMetricSpec).  It's a relatively simple structure of c_int and c_char_p fields.  Note the value of the source vs metric fields in the output below.  The reproducer requires libpcp and it might also help to install pcp-doc which contains the pmParseMetricSpec(3) man page with further details of the API (it's all fairly simple).

Please let me know if any further clarification is needed.  Thanks!

[pcpqa@rawhide ctypesbug]$ python --version
Python 3.11.0b3

[pcpqa@rawhide ctypesbug]$ python metricspec.py
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f37c7975910>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55ed37cbc5d0 arch=0 source=b'\xa0Y\x97\xc77\x7f' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'\xa0Y\x97\xc77\x7f'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load
[pcpqa@rawhide ctypesbug]$ 


That final line ("Expected result...") should not be printed.  The encoded value for source0 5 lines prior looks very suspect.

cheers.

Comment 1 Miro Hrončok 2022-07-14 09:30:19 UTC
On Fedora 35 I installed pcp-libs and ran the reproducer with various Pythons. I got "Expected result source localhost not kernel.all.load" with python3.6, python3.7, python3.8, python3.9, python3.10, python3.11.


$ for python in python2.7 python3.6 python3.7 python3.8 python3.9 python3.10 python3.11; do $python --version; $python metricspec.py; echo; echo; done
Python 2.7.18
localhost
<type 'str'>
kernel.all.load
<type 'str'>
<__main__.LP_pmMetricSpec object at 0x7fbccc5f80e0>
<class '__main__.LP_pmMetricSpec'>
('pmParseMetricSpec raw:', <__main__.pmMetricSpec object at 0x7fbcd9bc1dd0>)
('pmParseMetricSpec rawsrctype:', <type 'str'>)
('pmParseMetricSpec source type0:', <type 'str'>)
('pmParseMetricSpec source0:', 'localhost')
('pmParseMetricSpec source type:', <type 'unicode'>)
('pmParseMetricSpec source:', u'localhost')
('pmParseMetricSpec metric type:', <type 'unicode'>)
('pmParseMetricSpec metric:', u'kernel.all.load')


Python 3.6.15
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7ff7ba47cd40>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x556e0e58e6a0 arch=0 source=b' ' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b' '
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load


Python 3.7.13
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7fc48a2a93b0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55669cf44140 arch=0 source=b' ' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b' '
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load


Python 3.8.13
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f62085cc8c0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x5563a99e3d50 arch=0 source=b'@\xd3H\x08b\x7f' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'@\xd3H\x08b\x7f'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load


Python 3.9.13
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7fbf49f700c0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x5586215256c0 arch=0 source=b'\xc0\x9b\xd5I\xbf\x7f' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'\xc0\x9b\xd5I\xbf\x7f'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load


Python 3.10.5
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f06adbf6f40>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55c3ef0f1d80 arch=0 source=b'@=\xab\xad\x06\x7f' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'@=\xab\xad\x06\x7f'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load


Python 3.11.0b4
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f0ea3521400>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55fb06cdcac0 arch=0 source=b' \x15R\xa3\x0e\x7f' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b' \x15R\xa3\x0e\x7f'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load




In rawhide mock, I still get it with both 3.10 and 3.11:

<mock-chroot> sh-5.1$ python3.10 metricspec.py 
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f3be3bc84c0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55fb4bce6d80 arch=0 source=b'\xc0p\xa1\xe3;\x7f' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'\xc0p\xa1\xe3;\x7f'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load

<mock-chroot> sh-5.1$ python3.11 metricspec.py 
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f55cf461250>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x561b75b7f270 arch=0 source=b'p\x13F\xcfU\x7f' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'p\x13F\xcfU\x7f'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: kernel.all.load
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not kernel.all.load



I don't really understand what I am looking at, but I don't see any regression myself. You say that final line ("Expected result...") should not be printed, but I clearly get it on every Python version except Python 2.7. Am I running it incorrectly?

Comment 2 Miro Hrončok 2022-07-14 09:35:24 UTC
Looking at this code:

    source = 'localhost'
    metric = 'kernel.all.load'
    resultp = pmMetricSpec.fromString(metric, 0, source)


I am pretty sure you should be feeding bytes, not unicode strings, to a C API like this:

    source = b'localhost'
    metric = b'kernel.all.load'
    resultp = pmMetricSpec.fromString(metric, 0, source)

If I also remove the .decode() calls from the expected results, like this:

        resultm = resultp.contents.metric
        results = resultp.contents.source

I get:


$ for python in python2.7 python3.6 python3.7 python3.8 python3.9 python3.10 python3.11; do $python --version; $python metricspec.py; echo; echo; done
Python 2.7.18
localhost
<type 'str'>
kernel.all.load
<type 'str'>
<__main__.LP_pmMetricSpec object at 0x7fcf3b3aa0e0>
<class '__main__.LP_pmMetricSpec'>
('pmParseMetricSpec raw:', <__main__.pmMetricSpec object at 0x7fcf48973dd0>)
('pmParseMetricSpec rawsrctype:', <type 'str'>)
('pmParseMetricSpec source type0:', <type 'str'>)
('pmParseMetricSpec source0:', 'localhost')
('pmParseMetricSpec source type:', <type 'str'>)
('pmParseMetricSpec source:', 'localhost')
('pmParseMetricSpec metric type:', <type 'str'>)
('pmParseMetricSpec metric:', 'kernel.all.load')


Python 3.6.15
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f4afcb9cd40>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x5576d462dbd0 arch=0 source=b'localhost' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'localhost'
pmParseMetricSpec source type: <class 'bytes'>
pmParseMetricSpec source: b'localhost'
pmParseMetricSpec metric type: <class 'bytes'>
pmParseMetricSpec metric: b'kernel.all.load'


Python 3.7.13
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f9dd0bc33b0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55780f9c88c0 arch=0 source=b'localhost' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'localhost'
pmParseMetricSpec source type: <class 'bytes'>
pmParseMetricSpec source: b'localhost'
pmParseMetricSpec metric type: <class 'bytes'>
pmParseMetricSpec metric: b'kernel.all.load'


Python 3.8.13
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f99c2bee8c0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x56390e66b360 arch=0 source=b'localhost' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'localhost'
pmParseMetricSpec source type: <class 'bytes'>
pmParseMetricSpec source: b'localhost'
pmParseMetricSpec metric type: <class 'bytes'>
pmParseMetricSpec metric: b'kernel.all.load'


Python 3.9.13
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f71be3b70c0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x556fec705df0 arch=0 source=b'localhost' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'localhost'
pmParseMetricSpec source type: <class 'bytes'>
pmParseMetricSpec source: b'localhost'
pmParseMetricSpec metric type: <class 'bytes'>
pmParseMetricSpec metric: b'kernel.all.load'


Python 3.10.5
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7fc30346af40>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55e44e8c2cc0 arch=0 source=b'localhost' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'localhost'
pmParseMetricSpec source type: <class 'bytes'>
pmParseMetricSpec source: b'localhost'
pmParseMetricSpec metric type: <class 'bytes'>
pmParseMetricSpec metric: b'kernel.all.load'


Python 3.11.0b4
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f6607205400>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x55c245897650 arch=0 source=b'localhost' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'localhost'
pmParseMetricSpec source type: <class 'bytes'>
pmParseMetricSpec source: b'localhost'
pmParseMetricSpec metric type: <class 'bytes'>
pmParseMetricSpec metric: b'kernel.all.load'

Comment 3 Miro Hrončok 2022-07-14 10:18:16 UTC
Disregard my previous comment. I see that the unicode strings are .encode('utf-8')'d in the .fromString() classmethod.

Looking at the output, we see:

b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>

That is what we get whether we pass unicode strings or bytes to the .fromString() classmethod.

Comment 4 Miro Hrončok 2022-07-14 10:33:53 UTC
Ok, so interestingly, it works with:

            source = b'localhost'

But fails with:

            source = 'localhost'.encode('utf-8')





Explicitly adding a terminating NULL makes it work:

        if not isinstance(source, bytes):
            source = source.encode('utf-8') + b'\x00'
        if not isinstance(string, bytes):
            string = string.encode('utf-8') + b'\x00'

Hence, I think the problem is hidden in the way the byte-string is created in memory. When using literals, I guess it's NULL-terminated either by chance or by design. When using encode, it apparently does not have to be.




Reading the ctypes documentation, I believe the proper way of doing this is using https://docs.python.org/3/library/ctypes.html#ctypes.create_string_buffer when passing the byte-strings to the C function like this:


        status = LIBPCP.pmParseMetricSpec(ctypes.create_string_buffer(string),
                                          isarch,
                                          ctypes.create_string_buffer(source),
                                          byref(result),
                                          byref(errmsg))



However, this still does not explain why you assume this is a regression in 3.11.
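
For reference, a minimal sketch of what ctypes.create_string_buffer provides, independent of libpcp (the variable names are illustrative only). It copies the bytes into a mutable, NUL-terminated ctypes array. It does not by itself keep that array alive: a temporary created inline in the call, as in the snippet above, can be released as soon as the call returns, so a reference has to be held for as long as C code may read the pointer.

```python
import ctypes

# Bytes produced at runtime, as in the .fromString() classmethod.
raw = "localhost".encode("utf-8")

# create_string_buffer copies the bytes and appends a terminating NUL.
buf = ctypes.create_string_buffer(raw)

size = ctypes.sizeof(buf)   # length of the string plus one NUL byte
value = buf.value           # contents up to (not including) the first NUL
tail = buf.raw[-1:]         # the last byte of the underlying array

print(size, value, tail)
```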

Comment 5 Petr Viktorin (pviktori) 2022-07-14 12:27:20 UTC
Since you have the reproducer set up, could you also run it with python3-debug?

Comment 6 Miro Hrončok 2022-07-14 14:59:38 UTC
As attached:

$ python3.11d metricspec.py 
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f77418ed780>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec error:  'utf-8' codec can't decode byte 0xdd in position 0: invalid continuation byte


With ctypes.create_string_buffer:

$ python3.11d metricspec.py 
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7f7ee03ad6e0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec error:  'utf-8' codec can't decode byte 0xdd in position 0: invalid continuation byte


Letting it raise:

$ python3.11d metricspec.py 
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7feb4ab616e0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec error:  'utf-8' codec can't decode byte 0xdd in position 0: invalid continuation byte
Traceback (most recent call last):
  File "/home/churchyard/Stažené/metricspec.py", line 56, in <module>
    results = resultp.contents.source.decode()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position 0: invalid continuation byte


Using decode(errors='ignore'):

With ctypes.create_string_buffer:

$ python3.11d metricspec.py 
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7fdee6b556e0>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x557f327a7070 arch=0 source=b'\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b'\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd'
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: 
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not

Without:

$ python3.11d metricspec.py 
b'localhost'
<class 'bytes'>
b'kernel.all.load'
<class 'bytes'>
<__main__.LP_pmMetricSpec object at 0x7ff443ddd780>
<class '__main__.LP_pmMetricSpec'>
pmParseMetricSpec raw: pmMetricSpec@0x556605887630 arch=0 source=b'' metric=b'kernel.all.load' insts=[]
pmParseMetricSpec rawsrctype: <class 'bytes'>
pmParseMetricSpec source type0: <class 'bytes'>
pmParseMetricSpec source0: b''
pmParseMetricSpec source type: <class 'str'>
pmParseMetricSpec source: 
pmParseMetricSpec metric type: <class 'str'>
pmParseMetricSpec metric: kernel.all.load
Expected result source localhost not

Comment 7 Nathan Scott 2022-07-15 01:37:04 UTC
Hi folks,

Thanks for looking into this so quickly!

Hmm, there's something not quite right about my reproducer if it's also failing on older versions.  I was trying to extract just the failing part, but somehow that's now showing a different behavior.  Not sure what's happened there - sorry about that.

The reason I'm pointing toward the latest python release as problematic is because that's the only place our CI is failing when running this (similar) code.

If you could try this recipe instead of my earlier attempt to minimize it, it will more directly show the failure:

- dnf install pcp-testsuite python3-pcp
- python /var/lib/pcp/testsuite/src/test_pcp.py

if this latter fails with:

  File "/var/lib/pcp/testsuite/src/test_pcp.py", line 78, in test_pcp
    self.assertTrue(result == source)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

you've reproduced the problem locally.  What that assert is saying is that the returned string value (result) in the newly created structure differs from the original string value (source) - both should have the value 'localhost'.  This comparison only fails on rawhide, passes everywhere else.

I tried adding use of ctypes.create_string_buffer for the two strings passed in, but it didn't have any effect.

Comment 8 Petr Viktorin (pviktori) 2022-07-15 08:02:30 UTC
> pmParseMetricSpec source0: b'\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd'
> pmParseMetricSpec error:  'utf-8' codec can't decode byte 0xdd in position 0: invalid continuation byte

Ah, I had a hunch you'd see some \xdd! python3-debug fills freed memory with \xdd (D for “Deleted”).

Without looking at the code, here's what I think is the issue:
- if the object b'localhost' is part of the source, it'll most likely not be garbage-collected until Python exits. The memory that contains the 'localhost' bytes stays valid
- On the other hand, when you do 'localhost'.encode(), the encoded object is garbage-collected when that bytes object is no longer needed in Python. If you assign a pointer to the underlying memory to a C structure, it'll point to deleted memory.

You need to ensure that whatever object you pass to ctypes (be it bytes or a buffer from create_string_buffer) is kept around until the C structure is discarded. ctypes doesn't do reference counting for you.
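
One way to do that can be sketched as below. This is only an illustration under stated assumptions, not the PCP fix: MetricSpec and OwnedSpec are hypothetical stand-ins, and the assignment to spec.source merely simulates a C function that stores the caller's pointer in its result. The point is that the Python object owning the memory lives exactly as long as the structure that points into it.

```python
import ctypes


class MetricSpec(ctypes.Structure):
    """Hypothetical stand-in for a C struct filled in by a C library."""
    # A raw address: ctypes attaches no ownership to it.
    _fields_ = [("source", ctypes.c_void_p)]


class OwnedSpec:
    """Pairs the C struct with the buffer its pointer refers to."""

    def __init__(self, source):
        if not isinstance(source, bytes):
            source = source.encode("utf-8")
        # Keep the buffer as an attribute so it is not freed while
        # the struct still points at it.
        self._source_buf = ctypes.create_string_buffer(source)
        self.spec = MetricSpec()
        # Simulates C code that stores the caller's pointer as-is.
        self.spec.source = ctypes.addressof(self._source_buf)

    @property
    def source(self):
        # Valid only because self._source_buf is still referenced.
        return ctypes.string_at(self.spec.source)


owned = OwnedSpec("localhost")
result = owned.source
print(result)
```

Without the _source_buf attribute, the buffer could be freed right after __init__ returns and spec.source would point to deleted memory.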

Comment 9 Miro Hrončok 2022-07-15 09:25:21 UTC
I'm sorry, but even the new reproducer does not reproduce:


[root@e1044b735520 /]# python3 /var/lib/pcp/testsuite/src/test_pcp.py
python3: can't open file '/var/lib/pcp/testsuite/src/test_pcp.py': [Errno 2] No such file or directory

[root@e1044b735520 /]# python3 /var/lib/pcp/testsuite/src/test_pcp.python 
Running as local:
E
======================================================================
ERROR: test_context (__main__.TestSequenceFunctions.test_context)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 522, in test_context
    test_pcp(self)
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 60, in test_pcp
    ctx = pmapi.pmContext(api.PM_CONTEXT_HOST, "local:")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/site-packages/pcp/pmapi.py", line 1301, in __init__
    raise pmErr(self._ctx, [target])
pcp.pmapi.pmErr: Connection refused ['local:']

----------------------------------------------------------------------
Ran 1 test in 0.004s

FAILED (errors=1)

Comment 10 Miro Hrončok 2022-07-15 09:36:16 UTC
I got it:

[root@e1044b735520 /]# dnf install /usr/bin/ps
...

[root@e1044b735520 /]# /usr/libexec/pcp/lib/pmcd start
Starting pmcd ... 

[root@e1044b735520 /]# python3.11 /var/lib/pcp/testsuite/src/test_pcp.python 
Running as local:
pmGetContextHostName: e1044b735520
pmParseMetricSpec: 
F
======================================================================
FAIL: test_context (__main__.TestSequenceFunctions.test_context)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 522, in test_context
    test_pcp(self)
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 78, in test_pcp
    self.assertTrue(result == source)
AssertionError: False is not true

----------------------------------------------------------------------
Ran 1 test in 0.003s

FAILED (failures=1)


# python3.11d -u /var/lib/pcp/testsuite/src/test_pcp.python 
Running as local:
pmGetContextHostName: e1044b735520
E
======================================================================
ERROR: test_context (__main__.TestSequenceFunctions.test_context)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 523, in test_context
    test_pcp(self)
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 76, in test_pcp
    result = rsltp.contents.source.decode()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position 0: invalid continuation byte

----------------------------------------------------------------------
Ran 1 test in 0.003s




[root@e1044b735520 /]# dnf --releasever=36 --repo=updates,fedora install 'python3-libs < 3.11'
...downgrades the entire Python stack...


[root@e1044b735520 /]# python3.10d -u /var/lib/pcp/testsuite/src/test_pcp.python 
Running as local:
pmGetContextHostName: e1044b735520
E
======================================================================
ERROR: test_context (__main__.TestSequenceFunctions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 522, in test_context
    test_pcp(self)
  File "/var/lib/pcp/testsuite/src/test_pcp.python", line 76, in test_pcp
    result = rsltp.contents.source.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position 0: invalid continuation byte

----------------------------------------------------------------------
Ran 1 test in 0.002s

FAILED (errors=1)


Makes me think that this problem was pre-existing, and caused by what Petr describes in comment #8. The fact that it works with the optimized Python 3.10 build is probably just luck.

Comment 11 Nathan Scott 2022-07-17 23:17:38 UTC
Aha - excellent, thanks folks - with the insights Petr provided in #c8 I've been able to adjust the PCP code to ensure the source buffer is not freed too early, and our test case passes once again.  And I think you're right Miro in #c10 - it was passing by good fortune all these years and is a subtle bug that's been there from the day this code was written (circa 2009!)

Thanks again for all the assistance, really appreciate it!

Comment 12 Miro Hrončok 2022-07-17 23:58:48 UTC
I'm glad that you've been able to find out where the issue is in the pcp code!

You might want to consider running your CI with python3-debug as well and maybe even -X dev: https://docs.python.org/3/library/devmode.html
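
A small sketch of how a CI script might confirm that development mode is really active in the interpreter it launches (the child command here is illustrative; a real job would run the test suite instead). Development mode installs debug hooks on the memory allocators, which is what makes freed memory show up as \xdd bytes even on a non-debug build.

```python
import subprocess
import sys

# Launch a child interpreter with -X dev and ask it whether
# development mode is enabled (sys.flags.dev_mode, Python 3.7+).
child = subprocess.run(
    [sys.executable, "-X", "dev", "-c",
     "import sys; print(sys.flags.dev_mode)"],
    capture_output=True,
    text=True,
    check=True,
)
dev_mode_reported = child.stdout.strip()
print(dev_mode_reported)
```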

Comment 13 Nathan Scott 2022-07-18 00:02:06 UTC
Thanks for the tips Miro - will look into those options.

