REGR: fix rank algo for read-only data #37439

Merged 2 commits on Oct 27, 2020

Conversation

jorisvandenbossche
Member

Closes #37290
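
For context, here is a minimal sketch of the kind of call the linked issue title describes (`rank` on read-only data). The array values and variable names are illustrative assumptions, not taken from the issue. On a pandas version that includes this fix, the call should complete normally.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not from the issue report).
arr = np.array([3.0, 1.0, 2.0])
arr.setflags(write=False)  # mark the underlying buffer as read-only

ser = pd.Series(arr)

# With the fix in place, ranking should not need a writable buffer.
ranks = ser.rank()
print(ranks.tolist())
```

Read-only buffers like this can show up in practice when data comes from memory-mapped files or other shared, non-writable sources.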

@jorisvandenbossche added the Regression (functionality that used to work in a prior pandas version) and Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff) labels Oct 27, 2020
@jorisvandenbossche added this to the 1.1.4 milestone Oct 27, 2020
@jreback
Contributor

jreback commented Oct 27, 2020

Hmm, I think this is actually failing; see the Travis CI job: https://travis-ci.org/github/pandas-dev/pandas/jobs/739214922

>   ranked_mat[:, i] = rank_1d(mat[:, i])
E   TypeError: Argument 'in_arr' has incorrect type (expected numpy.ndarray, got pandas._libs.algos._memoryviewslice)
pandas/_libs/algos.pyx:347: TypeError
_ TestDataFrameCorr.test_corr_nullable_integer[spearman-other_column2-nullable_column1] _
[gw0] linux -- Python 3.7.9 /home/travis/miniconda3/envs/pandas-dev/bin/python
self = <pandas.tests.frame.methods.test_cov_corr.TestDataFrameCorr object at 0x7fcbc4f5a4d0>
nullable_column = <IntegerArray>
[1, 2, <NA>]
Length: 3, dtype: Int64
other_column = array([ 1.,  2., nan]), method = 'spearman'
    @td.skip_if_no_scipy
    @pytest.mark.parametrize(
        "nullable_column", [pd.array([1, 2, 3]), pd.array([1, 2, None])]
    )
    @pytest.mark.parametrize(
        "other_column",
        [pd.array([1, 2, 3]), np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, np.nan])],
    )
    @pytest.mark.parametrize("method", ["pearson", "spearman", "kendall"])
    def test_corr_nullable_integer(self, nullable_column, other_column, method):
        # https://github.com/pandas-dev/pandas/issues/33803
        data = DataFrame({"a": nullable_column, "b": other_column})
>       result = data.corr(method=method)
pandas/tests/frame/methods/test_cov_corr.py:190: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/core/frame.py:8300: in corr
    correl = libalgos.nancorr_spearman(mat, minp=min_periods)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   ranked_mat[:, i] = rank_1d(mat[:, i])
E   TypeError: Argument 'in_arr' has incorrect type (expected numpy.ndarray, got pandas._libs.algos._memoryviewslice)

@jbrockmendel
Member

looks like you need to update the types in nancorr_spearman and rank_2d (which call rank_1d)
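
As background on why the Spearman path is affected: Spearman correlation is the Pearson correlation of the per-column ranks, which matches the traceback above, where `nancorr_spearman` calls `rank_1d` on each column. The sketch below shows that relationship using only the public pandas API. The data values are made up for illustration, and this is not the Cython implementation.

```python
import numpy as np
import pandas as pd

# Made-up example data (an assumption for illustration only).
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 40.0, 20.0, 30.0]}
)

# Rank each column, then take the ordinary Pearson correlation of the ranks.
ranked = df.rank()
manual = ranked.corr(method="pearson")

# The built-in Spearman method should agree with the manual computation.
spearman = df.corr(method="spearman")
print(np.allclose(spearman.values, manual.values))
```

Because each column is ranked separately, any type restriction on the 1-D rank routine propagates to every caller that passes it a column slice.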

@jorisvandenbossche
Member Author

Thanks for the note, I indeed needed to update some other places where rank_1d is called.

@jreback merged commit 9c5500e into pandas-dev:master Oct 27, 2020
@jreback
Contributor

jreback commented Oct 27, 2020

thanks @jorisvandenbossche

@jreback
Contributor

jreback commented Oct 28, 2020

@meeseeksdev backport 1.1.x

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Oct 28, 2020
@jorisvandenbossche deleted the gh-37290-rank-readonly branch October 28, 2020 07:23
jorisvandenbossche added a commit that referenced this pull request Oct 28, 2020
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020
ukarroum pushed a commit to ukarroum/pandas that referenced this pull request Nov 2, 2020
Successfully merging this pull request may close these issues.

BUG: rank raises error with read-only data
3 participants