Bug 2099102 - scipy makes python-scikit-learn crash during build
Summary: scipy makes python-scikit-learn crash during build
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: scipy
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nikola Forró
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: PYTHON3.11 2098998 2100925
Reported: 2022-06-20 09:41 UTC by Miro Hrončok
Modified: 2022-07-04 09:00 UTC (History)
CC List: 14 users

Fixed In Version: scipy-1.8.1-5.fc37
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-07-04 09:00:37 UTC
Type: ---
Embargoed:


Attachments
Reproducer (3.28 KB, text/x-matlab)
2022-06-24 17:18 UTC, Petr Viktorin (pviktori)
Full valgrind output (17.59 KB, text/plain)
2022-06-30 13:50 UTC, Nikola Forró

Description Miro Hrončok 2022-06-20 09:41:58 UTC
Hello,

Please note that this comment was generated automatically. If you feel that this output has mistakes, please contact me via email (mhroncok).

Your package (python-scikit-learn) Fails To Install in Fedora 37:

can't install python3-scikit-learn:
  - nothing provides python(abi) = 3.10 needed by python3-scikit-learn-1.0.2-2.fc36.x86_64
  - nothing provides python3.10dist(scipy) >= 1.1 needed by python3-scikit-learn-1.0.2-2.fc36.x86_64
  - nothing provides python3.10dist(numpy) >= 1.14.6 needed by python3-scikit-learn-1.0.2-2.fc36.x86_64
  
If you know about this problem and are planning on fixing it, please acknowledge so by setting the bug status to ASSIGNED. If you don't have time to maintain this package, consider orphaning it, so maintainers of dependent packages realize the problem.


If you don't react accordingly to the policy for FTBFS/FTI bugs (https://docs.fedoraproject.org/en-US/fesco/Fails_to_build_from_source_Fails_to_install/), your package may be orphaned in 8+ weeks.


P.S. The data was generated solely from koji buildroot, so it might be newer than the latest compose or the content on mirrors. To reproduce, use the koji/local repo only, e.g. in mock:

    $ mock -r fedora-37-x86_64 --disablerepo='*' --enablerepo=local install python3-scikit-learn


P.P.S. If this bug has been reported in the middle of upgrading multiple dependent packages, please consider using side tags: https://docs.fedoraproject.org/en-US/fesco/Updates_Policy/#updating-inter-dependent-packages

Thanks!

Comment 1 Miro Hrončok 2022-06-20 10:10:07 UTC
This bugzilla is likely a fallout from the Python 3.11 rebuild.

If your package (or some of the dependencies it has) failed to rebuild during the Python 3.11 rebuild, they now fail to install. To fix this, packages need to be rebuilt in Rawhide.

We will slowly triage the bugzillas, but we'd appreciate your help.

If you know this is blocked by an existing reported build failure or another package not yet rebuilt with Python 3.11, please mark it as such by using the "Depends On"/"Blocks" bugzilla fields. That will help us determine what failures to prioritize.

If this is not Python 3.11 related, please remove the PYTHON3.11 blocking tracker.

Thank you and sorry for the inconvenience. Let me know if you need any help.

Comment 2 Miro Hrončok 2022-06-20 21:50:50 UTC
The build failed at least on s390x with:

.                                                                        [  5%]
sklearn/decomposition/tests/test_kernel_pca.py ....realloc(): invalid next size
Fatal Python error: Aborted
Thread 0x000003fec37fe840 (most recent call first):
  File "/usr/lib64/python3.11/threading.py", line 320 in wait
  File "/usr/lib/python3.11/site-packages/joblib/externals/loky/backend/queues.py", line 141 in _feed
  File "/usr/lib64/python3.11/threading.py", line 975 in run
  File "/usr/lib64/python3.11/threading.py", line 1038 in _bootstrap_inner
  File "/usr/lib64/python3.11/threading.py", line 995 in _bootstrap
Thread 0x000003fed137e840 (most recent call first):
  File "/usr/lib64/python3.11/selectors.py", line 415 in select
  File "/usr/lib64/python3.11/multiprocessing/connection.py", line 935 in wait
  File "/usr/lib/python3.11/site-packages/joblib/externals/loky/process_executor.py", line 617 in wait_result_broken_or_wakeup
  File "/usr/lib/python3.11/site-packages/joblib/externals/loky/process_executor.py", line 563 in run
  File "/usr/lib64/python3.11/threading.py", line 1038 in _bootstrap_inner
  File "/usr/lib64/python3.11/threading.py", line 995 in _bootstrap
Current thread 0x000003ff8def2720 (most recent call first):
  File "/usr/lib/python3.11/site-packages/_pytest/logging.py", line 346 in reset
  File "/usr/lib/python3.11/site-packages/_pytest/logging.py", line 697 in _runtest_for
  File "/usr/lib/python3.11/site-packages/_pytest/logging.py", line 725 in pytest_runtest_teardown
  File "/usr/lib/python3.11/site-packages/pluggy/_callers.py", line 34 in _multicall
  File "/usr/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/usr/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/usr/lib/python3.11/site-packages/_pytest/runner.py", line 259 in <lambda>
  File "/usr/lib/python3.11/site-packages/_pytest/runner.py", line 338 in from_call
  File "/usr/lib/python3.11/site-packages/_pytest/runner.py", line 258 in call_runtest_hook
  File "/usr/lib/python3.11/site-packages/_pytest/runner.py", line 219 in call_and_report
  File "/usr/lib/python3.11/site-packages/_pytest/runner.py", line 131 in runtestprotocol
  File "/usr/lib/python3.11/site-packages/_pytest/runner.py", line 111 in pytest_runtest_protocol
  File "/usr/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/usr/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/usr/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/usr/lib/python3.11/site-packages/_pytest/main.py", line 347 in pytest_runtestloop
  File "/usr/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/usr/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/usr/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/usr/lib/python3.11/site-packages/_pytest/main.py", line 322 in _main
  File "/usr/lib/python3.11/site-packages/_pytest/main.py", line 268 in wrap_session
  File "/usr/lib/python3.11/site-packages/_pytest/main.py", line 315 in pytest_cmdline_main
  File "/usr/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/usr/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/usr/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/usr/lib/python3.11/site-packages/_pytest/config/__init__.py", line 164 in main
  File "/usr/lib/python3.11/site-packages/_pytest/config/__init__.py", line 187 in console_main
  File "/usr/bin/pytest", line 8 in <module>
Extension modules: sklearn.__check_build._check_build, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, scipy._lib._ccallback_c, scipy.sparse._sparsetools, scipy.sparse._csparsetools, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.utils.murmurhash, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg.cython_lapack, scipy.linalg._decomp_update, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.sparse.linalg._isolve._iterative, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, numpy.linalg.lapack_lite, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, 
scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap_module, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.special.cython_special, scipy.stats._stats, beta_ufunc, scipy.stats._boost.beta_ufunc, binom_ufunc, scipy.stats._boost.binom_ufunc, nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats._biasedurn, scipy.stats._hypotests_pythran, scipy.stats._statlib, scipy.stats._mvn, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._unuran.unuran_wrapper, sklearn.utils._openmp_helpers, sklearn.feature_extraction._hashing_fast, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.utils._random, sklearn.datasets._svmlight_format_fast, scipy.io.matlab._mio_utils, scipy.io.matlab._streams, scipy.io.matlab._mio5_utils, sklearn.utils._typedefs, sklearn.utils._readonly_array_wrapper, sklearn.metrics._dist_metrics, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.metrics._pairwise_fast, sklearn.neighbors._partition_nodes, sklearn.neighbors._ball_tree, sklearn.neighbors._kd_tree, sklearn.decomposition._cdnmf_fast, sklearn.utils._seq_dataset, sklearn.utils._cython_blas, sklearn.utils.arrayfuncs, sklearn.linear_model._cd_fast, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, sklearn.svm._libsvm, sklearn.svm._liblinear, sklearn.svm._libsvm_sparse, sklearn.decomposition._online_lda_fast, sklearn._isotonic, sklearn.manifold._utils, sklearn.tree._utils, sklearn.tree._tree, sklearn.tree._splitter, sklearn.tree._criterion, sklearn.neighbors._quad_tree, sklearn.manifold._barnes_hut_tsne, sklearn.cluster._k_means_common, 
sklearn.cluster._k_means_minibatch, sklearn.cluster._k_means_lloyd, sklearn.cluster._k_means_elkan, sklearn.utils._fast_dict, sklearn.cluster._hierarchical_fast, sklearn.cluster._dbscan_inner, scipy.cluster._vq, scipy.cluster._hierarchy, scipy.cluster._optimal_leaf_ordering, sklearn.ensemble._gradient_boosting, sklearn.ensemble._hist_gradient_boosting.common, sklearn.ensemble._hist_gradient_boosting._gradient_boosting, sklearn.ensemble._hist_gradient_boosting._binning, sklearn.ensemble._hist_gradient_boosting._bitset, sklearn.ensemble._hist_gradient_boosting.splitting, sklearn.ensemble._hist_gradient_boosting.histogram, sklearn.ensemble._hist_gradient_boosting._predictor, sklearn.ensemble._hist_gradient_boosting.utils, sklearn.ensemble._hist_gradient_boosting._loss, scipy._lib._uarray._uarray, sklearn.svm._newrand (total: 160)
/var/tmp/rpm-tmp.8Z4jjN: line 56: 655654 Aborted                 (core dumped) pytest --deselect "metrics/tests/test_common.py::test_not_symmetric_metric[precision_recall_curve]" --deselect "metrics/tests/test_common.py::test_binary_sample_weight_invariance[precision_recall_curve]" --deselect "datasets/tests/test_openml.py::test_fetch_openml_verify_checksum[True]" --deselect "datasets/tests/test_openml.py::test_fetch_openml_verify_checksum[False]" --deselect "cross_decomposition/tests/test_pls.py::test_loadings_converges" --deselect "covariance/tests/test_graphical_lasso.py::test_graphical_lasso" --deselect "gaussian_process/tests/test_gpr.py::test_lml_precomputed[kernel3]" --deselect "gaussian_process/tests/test_gpr.py::test_lml_precomputed[kernel4]" sklearn

Comment 3 Mamoru TASAKA 2022-06-22 12:51:16 UTC
This breakage now appears to be causing a long chain of dependency failures.

Note that the build also fails on F-36, similarly on multiple architectures:
https://koji.fedoraproject.org/koji/taskinfo?taskID=88585049

so this is seemingly not Python 3.11 specific.

Comment 4 Miro Hrončok 2022-06-22 13:03:46 UTC
Koschei says this has started failing with the scipy 1.7 -> 1.8 update.

https://koschei.fedoraproject.org/build/12496668

But the logs from that time have already been garbage collected, so I cannot verify whether the failure has changed since.

Comment 5 Petr Viktorin (pviktori) 2022-06-24 17:17:49 UTC
I have reduced the reproducer to use Scipy only -- in particular, memory is corrupted in a call to the LAPACK function `syevr`.
Either half of the attached script will crash on exit in Rawhide: 

$ python3 syevr_corruption.py
double free or corruption (out)
Aborted (core dumped)

Reassigning to Scipy to CC the maintainer. (I have no idea what this script is actually doing, it might not be a valid call.)



This is reduced from scikit-learn's test_kernel_pca_consistent_transform, relevant "traceback":
- test_kernel_pca.py (sklearn): kpca = KernelPCA(random_state=state).fit(X)
- _kernel_pca.py (sklearn): linalg.eigh(K, eigvals=(K.shape[0] - n_components, K.shape[0] - 1))
- linalg/_decomp.py (scipy): w, v, *other_args, info = drv(a=a1, **drv_args, **lwork_args)
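
A minimal sketch of the call path listed above, assuming a SciPy release that has the `subset_by_index` keyword (the newer spelling of the `eigvals=` argument seen in the traceback). The helper name `top_eigenpairs` and the random test matrix are illustrative assumptions and are not taken from the attached reproducer. `driver="evr"` is passed explicitly to request the `syevr` LAPACK routine. On an affected build this call may corrupt memory; on a fixed build it should simply return the requested eigenpairs.

```python
import numpy as np
from scipy import linalg


def top_eigenpairs(K, n_components):
    """Return the n_components largest eigenpairs of the symmetric matrix K.

    Hypothetical helper mirroring the index range that KernelPCA passes to
    scipy.linalg.eigh in the traceback above.
    """
    n = K.shape[0]
    # Ask for eigenvalue indices [n - n_components, n - 1] (ascending order),
    # which takes the subset (range='I') code path of the syevr driver.
    return linalg.eigh(K, subset_by_index=(n - n_components, n - 1), driver="evr")


# Illustrative input: a symmetric positive semi-definite "linear kernel" matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 4))
K = X @ X.T

w, v = top_eigenpairs(K, 3)
```

The result can be cross-checked against `numpy.linalg.eigvalsh(K)[-3:]`, which computes all eigenvalues through a different code path.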

Comment 6 Petr Viktorin (pviktori) 2022-06-24 17:18:42 UTC
Created attachment 1892563 [details]
Reproducer

Comment 7 Miro Hrončok 2022-06-24 17:29:14 UTC
The Fails To Install bug script has now created bz2100925, so I've played with the Blocks and Depends On fields a bit. Sorry about the noise.

Comment 8 Nikola Forró 2022-06-28 11:31:16 UTC
Here is an even more minimal reproducer:

import scipy.linalg.lapack
f = scipy.linalg.lapack.get_lapack_funcs('syevr')
f(a=[[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [4, 5, 2, 1, 3], [5, 2, 1, 3, 4], [3, 1, 4, 2, 5]], range='I')

It seems the input matrix doesn't actually matter at all, but the larger it is the sooner it crashes.
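
Because the reproducer aborts the interpreter, it can be convenient to run it in a child process and inspect the exit status instead. The following is a minimal sketch using only the standard library; the names `REPRODUCER`, `run_isolated` and `describe` are hypothetical, and the signal decoding assumes a POSIX system.

```python
import signal
import subprocess
import sys

# The reproducer from the comment above, kept as a string so that it only
# ever runs inside a child interpreter.
REPRODUCER = """
import scipy.linalg.lapack
f = scipy.linalg.lapack.get_lapack_funcs('syevr')
f(a=[[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [4, 5, 2, 1, 3],
     [5, 2, 1, 3, 4], [3, 1, 4, 2, 5]], range='I')
"""


def run_isolated(snippet):
    """Run `snippet` in a fresh Python process and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )
    return result.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as returncode -N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

Calling `describe(run_isolated(REPRODUCER))` on an affected build would be expected to report a signal such as SIGABRT, although heap corruption is not guaranteed to surface on every run.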

Comment 9 Nikola Forró 2022-06-30 13:49:39 UTC
I believe the issue is in LAPACK; valgrind shows multiple invalid writes/reads in dlarrv_/dlar1v_ (pasting just the first one):

==1956== Invalid write of size 4
==1956==    at 0x8C463E0: dlar1v_ (dlar1v.f:405)
==1956==    by 0x8C4D2B0: dlarrv_ (dlarrv.f:875)
==1956==    by 0x8CCB80A: dstemr_ (dstemr.f:669)
==1956==    by 0x8C896B2: dsyevr_ (dsyevr.f:580)
==1956==    by 0x4CB7028A: f2py_rout__flapack_dsyevr.lto_priv.0 (_flapackmodule.c:49839)
==1956==    by 0x4A0CDC2: _PyObject_MakeTpCall (call.c:214)
==1956==    by 0x4A16B99: _PyEval_EvalFrameDefault (ceval.c:4773)
==1956==    by 0x4A15389: UnknownInlinedFun (pycore_ceval.h:72)
==1956==    by 0x4A15389: _PyEval_Vector (ceval.c:6421)
==1956==    by 0x4AA03DB: PyEval_EvalCode (ceval.c:1155)
==1956==    by 0x4AD0F82: run_eval_code_obj (pythonrun.c:1714)
==1956==    by 0x4ACE489: run_mod (pythonrun.c:1735)
==1956==    by 0x4ACCC01: pyrun_file (pythonrun.c:1630)
==1956==  Address 0x5b60764 is 0 bytes after a block of size 4 alloc'd
==1956==    at 0x484286F: malloc (vg_replace_malloc.c:381)
==1956==    by 0x5D3D621: PyDataMem_UserNEW (alloc.c:402)
==1956==    by 0x5D7D74B: PyArray_NewFromDescr_int (ctors.c:840)
==1956==    by 0x5FAB7C6: UnknownInlinedFun (ctors.c:954)
==1956==    by 0x5FAB7C6: UnknownInlinedFun (ctors.c:939)
==1956==    by 0x5FAB7C6: UnknownInlinedFun (ufunc_object.c:1284)
==1956==    by 0x5FAB7C6: UnknownInlinedFun (ufunc_object.c:2657)
==1956==    by 0x5FAB7C6: ufunc_generic_fastcall (ufunc_object.c:4872)
==1956==    by 0x4A2BEC5: UnknownInlinedFun (pycore_call.h:92)
==1956==    by 0x4A2BEC5: object_vacall (call.c:819)
==1956==    by 0x4AA1AB0: PyObject_CallFunctionObjArgs (call.c:925)
==1956==    by 0x5E3A725: UnknownInlinedFun (number.c:270)
==1956==    by 0x5E3A725: array_power.lto_priv.0 (number.c:534)
==1956==    by 0x4A6CFEB: ternary_op.constprop.0 (abstract.c:1002)
==1956==    by 0x4A17984: _PyEval_EvalFrameDefault (ceval.c:5550)
==1956==    by 0x4A15389: UnknownInlinedFun (pycore_ceval.h:72)
==1956==    by 0x4A15389: _PyEval_Vector (ceval.c:6421)
==1956==    by 0x4A0FB40: _PyObject_FastCallDictTstate (call.c:152)
==1956==    by 0x4A398EB: UnknownInlinedFun (call.c:482)
==1956==    by 0x4A398EB: slot_tp_init (typeobject.c:7829)

Comment 10 Nikola Forró 2022-06-30 13:50:44 UTC
Created attachment 1893673 [details]
Full valgrind output

Comment 11 Mamoru TASAKA 2022-07-03 07:15:10 UTC
https://github.com/scipy/scipy/issues/16527
https://github.com/scipy/scipy/pull/16528

Once switching scipy (I don't know if this should be fixed on the numpy f2py side)

Comment 12 Nikola Forró 2022-07-03 08:02:55 UTC
Thanks! No, I don't think there is anything to be fixed in f2py.

Comment 13 Nikola Forró 2022-07-03 08:08:30 UTC
I can confirm that scipy-1.8.1-5.fc37 fixes the crash.

Downstream commit by Mamoru: https://src.fedoraproject.org/rpms/scipy/c/f824e735d7c8027a34334b04362b485703a6b642?branch=rawhide

