Description of problem:
Package python-pingouin fails to build from source in Fedora rawhide.

Version-Release number of selected component (if applicable):
0.3.8-1.fc34

Steps to Reproduce:
koji build --scratch f34 python-pingouin-0.3.8-1.fc34.src.rpm

Additional info:
This package is tracked by Koschei. See:
https://koschei.fedoraproject.org/package/python-pingouin

=================================== FAILURES ===================================
____________________ TestRegression.test_linear_regression _____________________

self = <pingouin.tests.test_regression.TestRegression testMethod=test_linear_regression>

    def test_linear_regression(self):
        """Test function linear_regression.

        Compare against JASP and R lm() function.
        """
        # Simple regression (compare to R lm())
        lm = linear_regression(df['X'], df['Y'])  # Pingouin
        sc = linregress(df['X'], df['Y'])  # SciPy
        # When using assert_equal, we need to use .to_numpy()
        assert_equal(lm['names'].to_numpy(), ['Intercept', 'X'])
        assert_almost_equal(lm['coef'][1], sc.slope)
        assert_almost_equal(lm['coef'][0], sc.intercept)
        assert_almost_equal(lm['se'][1], sc.stderr)
        assert_almost_equal(lm['pval'][1], sc.pvalue)
        assert_almost_equal(np.sqrt(lm['r2'][0]), sc.rvalue)
        assert lm.residuals_.size == df['Y'].size
        assert_equal(lm['CI[2.5%]'].round(5).to_numpy(), [1.48155, 0.17553])
        assert_equal(lm['CI[97.5%]'].round(5).to_numpy(), [4.23286, 0.61672])
        assert round(lm['r2'].iloc[0], 4) == 0.1147
        assert round(lm['adj_r2'].iloc[0], 4) == 0.1057
        assert lm.df_model_ == 1
        assert lm.df_resid_ == 98
        # Multiple regression with intercept (compare to JASP)
        X = df[['X', 'M']].to_numpy()
        y = df['Y'].to_numpy()
        lm = linear_regression(X, y, as_dataframe=False)  # Pingouin
        sk = LinearRegression(fit_intercept=True).fit(X, y)  # SkLearn
        assert_equal(lm['names'], ['Intercept', 'x1', 'x2'])
        assert_almost_equal(lm['coef'][1:], sk.coef_)
        assert_almost_equal(lm['coef'][0], sk.intercept_)
        assert_almost_equal(sk.score(X, y), lm['r2'])
        assert lm['residuals'].size == y.size
        # No need for .to_numpy here because we're using a dict and not pandas
        assert_equal([.605, .110, .101], np.round(lm['se'], 3))
        assert_equal([3.145, 0.361, 6.321], np.round(lm['T'], 3))
        assert_equal([0.002, 0.719, 0.000], np.round(lm['pval'], 3))
        assert_equal([.703, -.178, .436], np.round(lm['CI[2.5%]'], 3))
        assert_equal([3.106, .257, .835], np.round(lm['CI[97.5%]'], 3))
        # No intercept
        lm = linear_regression(X, y, add_intercept=False, as_dataframe=False)
        sk = LinearRegression(fit_intercept=False).fit(X, y)
        assert_almost_equal(lm['coef'], sk.coef_)
        # Scikit-learn gives wrong R^2 score when no intercept present because
        # sklearn.metrics.r2_score always assumes that an intercept is present
        # https://stackoverflow.com/questions/54614157/scikit-learn-statsmodels-which-r-squared-is-correct
        # assert_almost_equal(sk.score(X, y), lm['r2'])
        # Instead, we compare to R lm() function:
        assert round(lm['r2'], 4) == 0.9096
        assert round(lm['adj_r2'], 4) == 0.9078
        assert lm['df_model'] == 2
        assert lm['df_resid'] == 98
        # Test other arguments
        linear_regression(df[['X', 'M']], df['Y'], coef_only=True)
        linear_regression(df[['X', 'M']], df['Y'], alpha=0.01)
        linear_regression(df[['X', 'M']], df['Y'], alpha=0.10)
        # With missing values
        linear_regression(df_nan[['X', 'M']], df_nan['Y'], remove_na=True)
        # With columns with only one unique value
        lm1 = linear_regression(df[['X', 'M', 'One']], df['Y'])
        lm2 = linear_regression(df[['X', 'M', 'One']], df['Y'], add_intercept=False)
        assert lm1.shape[0] == 3
        assert lm2.shape[0] == 3
        assert np.isclose(lm1.at[0, 'r2'], lm2.at[0, 'r2'])
        # With zero-only column
        lm1 = linear_regression(df[['X', 'M', 'Zero', 'One']], df['Y'])
        lm2 = linear_regression(df[['X', 'M', 'Zero', 'One']], df['Y'].to_numpy(), add_intercept=False)
        lm3 = linear_regression(df[['X', 'Zero', 'M', 'Zero']].to_numpy(), df['Y'], add_intercept=False)
        assert_equal(lm1.loc[:, 'names'].to_numpy(), ['Intercept', 'X', 'M'])
        assert_equal(lm2.loc[:, 'names'].to_numpy(), ['X', 'M', 'One'])
        assert_equal(lm3.loc[:, 'names'].to_numpy(), ['x1', 'x3'])
        # With duplicate columns
        lm1 = linear_regression(df[['X', 'One', 'Zero', 'M', 'M', 'X']], df['Y'])
        lm2 = linear_regression(
            df[['X', 'One', 'Zero', 'M', 'M', 'X']].to_numpy(), df['Y'],
            add_intercept=False
        )
        assert_equal(lm1.loc[:, 'names'].to_numpy(), ['Intercept', 'X', 'M'])
        assert_equal(lm2.loc[:, 'names'].to_numpy(), ['x1', 'x2', 'x4'])
        # Relative importance
        # Compare to R package relaimpo
        # >>> data <- read.csv('mediation.csv')
        # >>> lm1 <- lm(Y ~ X + M, data = data)
        # >>> calc.relimp(lm1, type=c("lmg"))
>       lm = linear_regression(df[['X', 'M']], df['Y'], relimp=True)

X          = array([[ 6, 5], [ 7, 5], [ 7, 7], [ 8, 4], [ 4, 3], [ 4, 4], [ 9, 7],... [ 7, 3], [ 6, 7], [ 5, 2], [ 8, 4], [ 7, 4], [ 2, 2], [ 5, 4]])
lm         = {'CI[2.5%]': array([0.1017897 , 0.51242501]), 'CI[97.5%]': array([0.43932124, 0.91602289]), 'T': array([3.18138289, 7.... [ 7, 3], [ 6, 7], [ 5, 2], [ 8, 4], [ 7, 4], [ 2, 2], [ 5, 4]]), ...}
lm1        = names coef se ... adj_r2 CI[2.5%] CI[97.5%]
             0 Intercept 1.904269 0.605458 ... 0.360071 ....360071 -0.178018 0.257226
             2 M 0.635495 0.100534 ... 0.360071 0.435963 0.835027
             [3 rows x 9 columns]
lm2        = names coef se T ... r2 adj_r2 CI[2.5%] CI[97.5%]
             0 x1 0.039604 0.109648 0.361...01 3.105936
             2 x4 0.635495 0.100534 6.321194 ... 0.372999 0.360071 0.435963 0.835027
             [3 rows x 9 columns]
lm3        = names coef se T ... r2 adj_r2 CI[2.5%] CI[97.5%]
             0 x1 0.270555 0.085043 3.181...90 0.439321
             1 x3 0.714224 0.101689 7.023596 ... 0.909607 0.907762 0.512425 0.916023
             [2 rows x 9 columns]
sc         = LinregressResult(slope=0.3961261171467491, intercept=2.8572045582909733, rvalue=0.33869891283150894, pvalue=0.0005671128490823392, stderr=0.11115978809240681, intercept_stderr=0.6932129732105715)
self       = <pingouin.tests.test_regression.TestRegression testMethod=test_linear_regression>
sk         = LinearRegression(fit_intercept=False)
y          = array([ 6, 5, 4, 8, 5, 7, 8, 4, 7, 4, 4, 3, 10, 6, 4, 3, 5, 4, 4, 6, 7, 6, 4, 4, 7, 8, ... 3, 4, 10, 4, 6, 4, 7, 7, 4, 1, 8, 5, 6, 3, 5, 9, 8, 8, 6, 5, 4, 6, 5, 2, 1, 5, 1, 5])

pingouin/tests/test_regression.py:130:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pingouin/regression.py:454: in linear_regression
    reli = _relimp(data.drop(columns=['Intercept']).cov())

T          = array([3.1451684 , 0.36119001, 6.32119439])
X          = array([[ 1., 6., 5.], [ 1., 7., 5.], [ 1., 7., 7.], [ 1., 8., 4.], [ 1., 4., 3.]... [ 1., 5., 2.], [ 1., 8., 4.], [ 1., 7., 4.], [ 1., 2., 2.], [ 1., 5., 4.]])
X_gd       = True
Xw         = array([[ 1., 6., 5.], [ 1., 7., 5.], [ 1., 7., 7.], [ 1., 8., 4.], [ 1., 4., 3.]... [ 1., 5., 2.], [ 1., 8., 4.], [ 1., 7., 4.], [ 1., 2., 2.], [ 1., 5., 4.]])
_          = array([80.82058916, 13.03421192, 2.67239351])
add_intercept = True
adj_r2     = 0.36007139062480464
alpha      = 0.05
as_dataframe = True
beta_se    = array([0.60545846, 0.10964844, 0.10053394])
beta_var   = array([0.36657995, 0.01202278, 0.01010707])
coef       = array([1.90426882, 0.03960392, 0.63549459])
coef_only  = False
constant   = 1
crit       = 1.984723185927883
data       = y Intercept X M
             0   6  1.0  6.0  5.0
             1   5  1.0  7.0  5.0
             2   4  1.0  7.0  7.0
             3   8 ... 1.0  8.0  4.0
             97  5  1.0  7.0  4.0
             98  1  1.0  2.0  2.0
             99  5  1.0  5.0  4.0
             [100 rows x 4 columns]
df_model   = 2
df_resid   = 97
idx_duplicate = []
idx_unique = array([0])
idx_zero   = array([], dtype=int64)
ll         = array([ 0.70260138, -0.17801788, 0.43596254])
ll_name    = 'CI[2.5%]'
marg_error = array([1.20166745, 0.2176218 , 0.19953204])
mse        = 2.661262704705674
n          = 100
n_nonzero  = array([100, 99, 99])
names      = ['Intercept', 'X', 'M']
p          = 3
pair       = (1, 2)
pred       = array([5.31936528, 5.3589692 , 6.62995838, 4.76307854, 3.96916827, 4.60466285, 6.70916622, 2.10228843, 6.669562...52, 5.99446379, 7.30505688, 4.08798003, 6.59035446, 3.3732776 , 4.76307854, 4.72347461, 3.25446584, 4.64426677])
pval       = array([2.20383086e-03, 7.18742902e-01, 7.92264206e-09])
r2         = 0.372999241319253
rank       = 3
relimp     = True
remove_na  = False
resid      = array([ 0.68063472, -0.3589692 , -2.62995838, 3.23692146, 1.03083173, 2.39533715, 1.29083378, 1.89771157, ...446379, -3.30505688, 1.91201997, -1.59035446, -1.3732776 , -3.76307854, 0.27652539, -2.25446584, 0.35573323])
ss_res     = 258.1424823564504
ss_tot     = 3147
ss_wtot    = 411.71000000000004
stats      = {'CI[2.5%]': array([ 0.70260138, -0.17801788, 0.43596254]), 'CI[97.5%]': array([3.10593627, 0.25722572, 0.83502663]), 'T': array([3.1451684 , 0.36119001, 6.32119439]), 'adj_r2': 0.36007139062480464, ...}
ul         = array([3.10593627, 0.25722572, 0.83502663])
ul_name    = 'CI[97.5%]'
w          = array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., ...1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
weights    = None
y          = array([ 6, 5, 4, 8, 5, 7, 8, 4, 7, 4, 4, 3, 10, 6, 4, 3, 5, 4, 4, 6, 7, 6, 4, 4, 7, 8, ... 3, 4, 10, 4, 6, 4, 7, 7, 4, 1, 8, 5, 6, 3, 5, 9, 8, 8, 6, 5, 4, 6, 5, 2, 1, 5, 1, 5])
y_gd       = True
yw         = array([ 6, 5, 4, 8, 5, 7, 8, 4, 7, 4, 4, 3, 10, 6, 4, 3, 5, 4, 4, 6, 7, 6, 4, 4, 7, 8, ... 3, 4, 10, 4, 6, 4, 7, 7, 4, 1, 8, 5, 6, 3, 5, 9, 8, 8, 6, 5, 4, 6, 5, 2, 1, 5, 1, 5])

pingouin/regression.py:535: in _relimp
    ss_reg_without = pinv(S.iloc[p, p]) @ S_without @ S_without

S          = y         X         M
             y  4.158687  1.204343  2.365859
             X  1.204343  3.040303  1.705657
             M  2.365859  1.705657  3.616566
S_without  = Series([], Name: y, dtype: float64)
all_preds  = []
betas      = array([0.03960392, 0.63549459])
cols       = ['y', 'X', 'M']
k          = 0
loo        = array([2])
npred      = 2
p          = []
p_with     = [1]
pred       = 1
predictors = ['X', 'M']
predictors_int = array([1, 2])
r2_full    = 0.3729992413192531
r2_seq     = []
r2_seq_mean = []
ss_reg_precomp = {}
ss_tot     = 4.158686868686868
target_int = 0
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a = array([], shape=(0, 0), dtype=float64), cond = None, rcond = None
return_rank = False, check_finite = True

    def pinv(a, cond=None, rcond=None, return_rank=False, check_finite=True):
        """
        Compute the (Moore-Penrose) pseudo-inverse of a matrix.

        Calculate a generalized inverse of a matrix using a least-squares
        solver.

        Parameters
        ----------
        a : (M, N) array_like
            Matrix to be pseudo-inverted.
        cond, rcond : float, optional
            Cutoff factor for 'small' singular values. In `lstsq`,
            singular values less than ``cond*largest_singular_value`` will be
            considered as zero. If both are omitted, the default value
            ``max(M, N) * eps`` is passed to `lstsq` where ``eps`` is the
            corresponding machine precision value of the datatype of ``a``.

            .. versionchanged:: 1.3.0
                Previously the default cutoff value was just `eps` without the
                factor ``max(M, N)``.

        return_rank : bool, optional
            if True, return the effective rank of the matrix
        check_finite : bool, optional
            Whether to check that the input matrix contains only finite numbers.
            Disabling may give a performance gain, but may result in problems
            (crashes, non-termination) if the inputs do contain infinities or NaNs.

        Returns
        -------
        B : (N, M) ndarray
            The pseudo-inverse of matrix `a`.
        rank : int
            The effective rank of the matrix. Returned if return_rank == True

        Raises
        ------
        LinAlgError
            If computation does not converge.

        Examples
        --------
        >>> from scipy import linalg
        >>> a = np.random.randn(9, 6)
        >>> B = linalg.pinv(a)
        >>> np.allclose(a, np.dot(a, np.dot(B, a)))
        True
        >>> np.allclose(B, np.dot(B, np.dot(a, B)))
        True

        """
        a = _asarray_validated(a, check_finite=check_finite)
        # If a is sufficiently tall it is cheaper to compute using the transpose
>       trans = a.shape[0] / a.shape[1] >= 1.1
E       ZeroDivisionError: division by zero

a            = array([], shape=(0, 0), dtype=float64)
check_finite = True
cond         = None
rcond        = None
return_rank  = False

/usr/lib64/python3.10/site-packages/scipy/linalg/basic.py:1290: ZeroDivisionError
=========================== short test summary info ============================
FAILED pingouin/tests/test_regression.py::TestRegression::test_linear_regression
============ 1 failed, 85 passed, 3006 warnings in 66.17s (0:01:06) ============
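The locals above point at the underlying problem: with relimp=True, _relimp calls scipy.linalg.pinv on S.iloc[p, p] with p = [], i.e. an empty 0x0 matrix, and the scipy build in rawhide divides by a.shape[1] before doing any work. A minimal sketch of the crash, independent of pingouin (this assumes the scipy behavior shown in the traceback; the variable name below is illustrative):

    import numpy as np
    from scipy.linalg import pinv

    # _relimp effectively does pinv(S.iloc[p, p]) with p == [],
    # which hands scipy an empty 0x0 matrix:
    empty = np.empty((0, 0))
    pinv(empty)  # ZeroDivisionError: division by zero in a.shape[0] / a.shape[1]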
Dear Maintainer,

your package has an open Fails To Build From Source bug for Fedora 34. Action is required from you.

If you can fix your package to build, perform a build in koji and either create an update in bodhi or, if updating is not appropriate [1], close this bug without creating an update. If you are working on a fix, set the status to ASSIGNED to acknowledge this. If you have already fixed this issue, please close this Bugzilla report.

Following the policy for such packages [2], your package will be orphaned if this bug remains in NEW state for more than 8 weeks (not sooner than 2021-03-15).

A week before the mass branching of Fedora 35, according to the schedule [3], any package not successfully rebuilt at least on Fedora 33 will be retired regardless of the status of this bug.

[1] https://docs.fedoraproject.org/en-US/fesco/Updates_Policy/
[2] https://docs.fedoraproject.org/en-US/fesco/Fails_to_build_from_source_Fails_to_install/
[3] https://fedorapeople.org/groups/schedule/f-35/f-35-key-tasks.html
This bug appears to have been reported against 'rawhide' during the Fedora 34 development cycle. Changing version to 34.
FEDORA-2021-f24a52bc68 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-f24a52bc68
FEDORA-2021-f24a52bc68 has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command:

`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-f24a52bc68`

You can provide feedback for this update here:
https://bodhi.fedoraproject.org/updates/FEDORA-2021-f24a52bc68

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2021-f24a52bc68 has been pushed to the Fedora 34 stable repository. If the problem still persists, please make note of it in this bug report.