We are observing a significant number of new failures in the PCP test suite on rawhide with the recent python update there. It's unfortunately taken us this long to report due to problems with python3.12 making python3-pyodbc uninstallable, which caused dependent PCP sub-packages to become uninstallable, on rawhide. We've had to (temporarily?) disable PCP metrics based on that module (pcp-pmda-mssql) in order to observe the remaining fallout.

Anyway, the ctypes issues seem to have two similar but slightly different signatures:

> desc.contents.type, PM_TYPE_U64)
> ^^^^^^^^^^^^^^^^^^
> AttributeError: 'dict' object has no attribute 'type'

and

> if desc.contents.indom == pmapi.c_api.PM_INDOM_NULL:
> ^^^^^^^^^^^^^
> ValueError: Unexpected NULL pointer in _objects

The ctypes structure for variable 'desc' (pmDesc) is here:
https://github.com/performancecopilot/pcp/blob/83cf926e507ab1302c18663daa4ce7d2129b99a1/src/python/pcp/pmapi.py#L600

These failure modes can be observed from daily PCP QA:
https://github.com/performancecopilot/pcp/actions
(click on a current "QA" action, then "fedora-rawhide-container", then the "QA" step therein - there are many examples of the above two failure signatures if you need a reproducible test case)

The test cases are also available in rawhide via the pcp-testsuite package, which puts failing cases (such as qa/991) below /var/lib/pcp/testsuite

Please let me know if any further information is needed. FWIW there's been no change to the PCP python wrapper library recently, and these tests pass on every other version of python (incl. python2) so we're 100% certain this is directly related to the python3.12 upgrade.

Reproducible: Always

Steps to Reproduce:
Run PCP python tools e.g. via pcp-testsuite

Actual Results:
Numerous tests fail due to crashes in python ctypes code.

Expected Results:
Tests pass, python tools using ctypes do not fail.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle. Changing version to 39.
> We are observing a significant number of new failures in the PCP test suite on rawhide with the recent python update there.

Do you still have the issue? The latest GitHub Action QA job is a success, even on fedora-rawhide-container: https://github.com/performancecopilot/pcp/actions/runs/6015971693

I successfully built the package with Python 3.12 in the Python 3.12 COPR. But then I'm not sure how to run the PCP test suite.

On my Fedora 38, I successfully ran the QA tests using the commands given in .github/workflows/qa.yml for the "fedora-rawhide-container" platform:

---
set -e -x
PLATFORM=fedora-rawhide-container
source $PWD/VERSION.pcp
PACKAGE_BUILD="0.$(date +'%Y%m%d').$(git rev-parse --short HEAD)"
PCP_VERSION=${PACKAGE_MAJOR}.${PACKAGE_MINOR}.${PACKAGE_REVISION}
PCP_BUILD_VERSION=${PCP_VERSION}-${PACKAGE_BUILD}
sed -i "s/PACKAGE_BUILD=.*/PACKAGE_BUILD=${PACKAGE_BUILD}/" VERSION.pcp
sed -i "1 s/(.*)/(${PCP_BUILD_VERSION})/" debian/changelog
python3 -c 'import yaml' || python3 -m pip install pyyaml
mkdir -p artifacts/build artifacts/test
touch artifacts/build/.keep
build/ci/ci-run.py $PLATFORM setup
build/ci/ci-run.py $PLATFORM task build
build/ci/ci-run.py $PLATFORM artifacts build --path artifacts/build
build/ci/ci-run.py $PLATFORM task install
build/ci/ci-run.py $PLATFORM task init_qa
build/ci/ci-run.py $PLATFORM task qa
---

...

Well, right now, the QA tests are still running: "[14%] 243". But so far, so good: all QA tests have succeeded.

Can you please explain to me how to reproduce the issue on Python 3.12? Do you have a reproducer which doesn't require PCP, but only Python stdlib modules (ctypes)?
This sounds a lot like https://github.com/python/cpython/issues/107940, a regression that appeared between 3.12.0b4 and 3.12.0rc1, and should be fixed in 3.12.0rc2. I’m waiting to see if rc2 fixes https://github.com/Toblerity/rtree/issues/277, which also looks similar.
Oh wait, I kept the QA tests running in the background and, many minutes later, I started to get errors!

---
...
[51%] 855
[51%] 856
[51%] 857
[51%] 858
[51%] 859 [failed, exit status 1] - output mismatch (see 859.out.bad)
2c2,15
< pmfg - OK
---
> pmfg - File "/var/lib/pcp/testsuite/src/test_pmfg.py", line 132, in test_pmfg
> test_pmfg_live(self, c_api.PM_CONTEXT_HOST, "local:")
> File "/var/lib/pcp/testsuite/src/test_pmfg.py", line 28, in test_pmfg_live
> v1 = pmfg.extend_item("sample.ulong.one") # infer type
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> File "/usr/lib64/python3.12/site-packages/pcp/pmapi.py", line 3201, in extend_item
> mtype = descs[0].type
> ^^^^^^^^^^^^^
> File "/usr/lib64/python3.12/site-packages/pcp/pmapi.py", line 614, in <lambda>
> pmDescPtr.type = property(lambda x: x.contents.type, None, None, None)
> ^^^^^^^^^^
> ValueError: Unexpected NULL pointer in _objects
>
> ----------------------------------------------------------------------
Check local PMCD is still alive ...
PMDA probe: pminfo -h b620c05c31c0 -f sample.milliseconds
PMDA probe: pminfo -h b620c05c31c0 -f sampledso.milliseconds
PMDA probe: pminfo -h b620c05c31c0 -f simple.numfetch
[51%] 860
[51%] 861
[51%] 862
...
[52%] 879
[52%] 880 - output mismatch (see 880.out.bad)
2a3,20
> Traceback (most recent call last):
> File "/usr/lib64/python3.12/site-packages/pcp/pmconfig.py", line 568, in check_metric
> if desc.contents.indom == pmapi.c_api.PM_INDOM_NULL:
> ^^^^^^^^^^^^^
> ValueError: Unexpected NULL pointer in _objects
> No matching instances found.
> Traceback (most recent call last):
> File "/usr/lib64/python3.12/site-packages/pcp/pmconfig.py", line 568, in check_metric
> if desc.contents.indom == pmapi.c_api.PM_INDOM_NULL:
> ^^^^^^^^^^^^^
> ValueError: Unexpected NULL pointer in _objects
> No matching instances found.
> Traceback (most recent call last):
> File "/usr/lib64/python3.12/site-packages/pcp/pmconfig.py", line 568, in check_metric
> if desc.contents.indom == pmapi.c_api.PM_INDOM_NULL:
> ^^^^^^^^^^^^^
> ValueError: Unexpected NULL pointer in _objects
> No matching instances found.
Check local PMCD is still alive ...
PMDA probe: pminfo -h b620c05c31c0 -f sample.milliseconds
PMDA probe: pminfo -h b620c05c31c0 -f sampledso.milliseconds
PMDA probe: pminfo -h b620c05c31c0 -f simple.numfetch
[52%] 881
[52%] 882
...
[59%] 991 - output mismatch (see 991.out.bad)
...
[59%] 991 - output mismatch (see 991.out.bad)
4,6c4,13
< total used free shared buff/cache available
< Mem: 16010088 7817828 2956804 651076 5235456 7208668
< Swap 8093692 1785600 6308092
---
> Traceback (most recent call last):
> File "/usr/libexec/pcp/bin/pcp-free", line 239, in <module>
> FREE.execute()
> File "/usr/libexec/pcp/bin/pcp-free", line 156, in execute
> values = self.extract(descs, result)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> File "/usr/libexec/pcp/bin/pcp-free", line 126, in extract
> desc.contents.type, PM_TYPE_U64)
> ^^^^^^^^^^^^^^^^^^
> AttributeError: 'dict' object has no attribute 'type'
9,11c16,25
< total used free shared buff/cache available
< Mem: 16010088 16010088 0 0 0 0
< Swap 0 0 0
---
> Traceback (most recent call last):
> File "/usr/libexec/pcp/bin/pcp-free", line 239, in <module>
> FREE.execute()
> File "/usr/libexec/pcp/bin/pcp-free", line 156, in execute
> values = self.extract(descs, result)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> File "/usr/libexec/pcp/bin/pcp-free", line 126, in extract
> desc.contents.type, PM_TYPE_U64)
> ^^^^^^^^^^^^^^^^^^
> AttributeError: 'dict' object has no attribute 'type'
14,16c28,37
< total used free shared buff/cache available
< Mem: 16394330112 8005455872 3027767296 666701824 5361106944 7381676032
< Swap 8287940608 1828454400 6459486208
---
...
[59%] 992
[60%] 993
...
[60%] 999 - output mismatch (see 999.out.bad)
...
> Traceback (most recent call last):
> File "/usr/lib64/python3.12/site-packages/pcp/pmconfig.py", line 568, in check_metric
> if desc.contents.indom == pmapi.c_api.PM_INDOM_NULL:
> ^^^^^^^^^^^^^
> ValueError: Unexpected NULL pointer in _objects
...
[62%] 1062 - output mismatch (see 1062.out.bad)
...
> Traceback (most recent call last):
> File "/usr/libxxx/pythonxxx.xxx/site-packages/pcp/pmconfig.py", line xxx, in check_metric
> if desc.contents.indom == pmapi.c_api.PM_INDOM_NULL:
> ^^^^^^^^^^^^^
> ValueError: Unexpected NULL pointer in _objects
> Traceback (most recent call last):
> File "/usr/libxxx/pythonxxx.xxx/site-packages/pcp/pmconfig.py", line xxx, in check_metric
> if desc.contents.indom == pmapi.c_api.PM_INDOM_NULL:
> ^^^^^^^^^^^^^
...
---
Thanks for looking into it Victor. Looks like you've reproduced it now (I don't have any simpler test case at hand, sorry) and also found the numpy project reference where they've possibly encountered the same issue. https://github.com/numpy/numpy/issues/24399 https://github.com/python/cpython/issues/107940
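For anyone who wants to poke at this without PCP installed, here is a minimal sketch that exercises the same access pattern as the tracebacks (a structure pointer, `.contents`, and a property attached to the pointer type) using only ctypes. It is not a confirmed reproducer of the regression. `Desc`, `DescPtr`, `INDOM_NULL`, `TYPE_U64` and `check` are hypothetical stand-ins, and the field list is a simplified guess rather than the real pmDesc layout from pmapi.py.

```python
import ctypes


# Hypothetical, simplified stand-in for PCP's pmDesc structure.
# The real definition lives in pcp/pmapi.py and has more fields.
class Desc(ctypes.Structure):
    _fields_ = [
        ("pmid", ctypes.c_uint),
        ("type", ctypes.c_int),
        ("indom", ctypes.c_uint),
    ]


DescPtr = ctypes.POINTER(Desc)
# Same idiom as the pmapi.py line shown in the traceback:
# a property on the pointer type that dereferences .contents.
DescPtr.type = property(lambda x: x.contents.type, None, None, None)

INDOM_NULL = 0xFFFFFFFF  # assumed stand-in for pmapi.c_api.PM_INDOM_NULL
TYPE_U64 = 3             # arbitrary illustrative value, not taken from PCP


def check(count=1000):
    """Create many descriptors and read them back through pointers."""
    descs = [Desc(pmid=i, type=TYPE_U64, indom=INDOM_NULL) for i in range(count)]
    ptrs = [ctypes.pointer(d) for d in descs]
    for i, ptr in enumerate(ptrs):
        # Each access goes through .contents, as in pcp-free and pmconfig.py.
        assert ptr.contents.pmid == i
        assert ptr.contents.type == TYPE_U64
        assert ptr.type == TYPE_U64
        assert ptr.contents.indom == INDOM_NULL
    return len(ptrs)


if __name__ == "__main__":
    print(check(), "descriptors checked")
```

If this passes on an affected interpreter, that only means the trigger needs more of what PCP does (for example descriptors filled in by the C library), not that the interpreter is fine.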
I reverted a recent ctypes change which introduced a regression in numpy. It's likely the same bug which affects PCP. The revert got merged in the Python 3.12 branch:
https://github.com/python/cpython/commit/e0f6080819f00d456215646e3117ae77b9af40d1

You can expect it as part of the upcoming Python 3.12.0rc2 release, which should be out today. I propose to leave this issue open until Python 3.12.0rc2 is shipped in Fedora Rawhide and someone can confirm that the bug is fixed. Or at least, test it in the PCP upstream CI.
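To tell quickly whether a given interpreter is the release discussed here, a small check of `sys.version_info` may help. This is a sketch based only on what this thread says (regression first seen in 3.12.0rc1, revert expected in 3.12.0rc2); the helper name `is_suspect_release` is made up for illustration, and downstream builds may carry patches that this check cannot see.

```python
import sys


def is_suspect_release(info=None):
    """Return True if the version tuple is exactly 3.12.0rc1.

    Assumption from this thread: the regression appeared in 3.12.0rc1
    and the revert is expected in 3.12.0rc2.
    """
    if info is None:
        info = sys.version_info
    # sys.version_info is (major, minor, micro, releaselevel, serial);
    # release candidates use the releaselevel string "candidate".
    return tuple(info)[:5] == (3, 12, 0, "candidate", 1)


if __name__ == "__main__":
    print(sys.version.split()[0], "suspect:", is_suspect_release())
```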
Thanks Victor! As soon as it's in rawhide we should see it in the PCP daily CI the next day - I'll report back then.
*** Bug 2237699 has been marked as a duplicate of this bug. ***
Fedora 40: https://bodhi.fedoraproject.org/updates/FEDORA-2023-d683d7a18e Fedora 39: https://bodhi.fedoraproject.org/updates/FEDORA-2023-623962bb38
I can confirm the issue is resolved now in rawhide - thanks!
FEDORA-2023-623962bb38 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-623962bb38
FEDORA-2023-9d033517d4 has been pushed to the Fedora 39 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-9d033517d4` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-9d033517d4 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-9d033517d4 has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.