BUG: Operations with SparseArray return SA with wrong indices #45125

Merged 1 commit on Jan 8, 2022

bdrum
Contributor

@bdrum bdrum commented Dec 30, 2021

Two tests failed, but it looks like my changes are not the cause.
(Perhaps this won't show up in the checks.)

====================================================================== FAILURES =======================================================================
__________________________________________ TestChaining.test_detect_chained_assignment_warning_stacklevel[3] __________________________________________

self = <pandas.tests.indexing.test_chaining_and_caching.TestChaining object at 0x0000022FC3FBFB80>, rhs = 3

@pytest.mark.parametrize("rhs", [3, DataFrame({0: [1, 2, 3, 4]})])
def test_detect_chained_assignment_warning_stacklevel(self, rhs):
    # GH#42570
    df = DataFrame(np.arange(25).reshape(5, 5))
    chained = df.loc[:3]
    with option_context("chained_assignment", "warn"):
        with tm.assert_produces_warning(com.SettingWithCopyWarning) as t:
            chained[2] = rhs
>           assert t[0].filename == __file__

E AssertionError: assert 'c:\Users\b...nd_caching.py' == 'C:\Users\b...nd_caching.py'
E - C:\Users\bdrum\Development\python\pandas\pandas\tests\indexing\test_chaining_and_caching.py
E ? ^
E + c:\Users\bdrum\Development\python\pandas\pandas\tests\indexing\test_chaining_and_caching.py
E ? ^

pandas\tests\indexing\test_chaining_and_caching.py:444: AssertionError
------------------------------------- generated xml file: C:\Users\bdrum\Development\python\pandas\test-data.xml --------------------------------------
================================================================ slowest 30 durations =================================================================
0.29s call pandas/tests/indexing/test_chaining_and_caching.py::TestChaining::test_detect_chained_assignment_warning_stacklevel[3]

(2 durations < 0.005s hidden. Use -vv to show these durations.)
=============================================================== short test summary info ===============================================================
FAILED pandas/tests/indexing/test_chaining_and_caching.py::TestChaining::test_detect_chained_assignment_warning_stacklevel[3] - AssertionError: asser...
========================================================== 1 failed, 31 deselected in 0.84s ===========================================================
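
The assertion above fails only because the two paths differ in the case of the drive letter (`c:` versus `C:`), which Windows treats as the same location. A minimal sketch of a case-insensitive comparison follows. It uses the standard-library `ntpath.normcase` so that it behaves the same on any platform; the helper name `same_windows_path` is hypothetical and not part of the pandas test suite.

```python
import ntpath


def same_windows_path(left: str, right: str) -> bool:
    """Compare two Windows paths, ignoring case and slash direction."""
    # ntpath.normcase lowercases the path and converts "/" to "\\".
    return ntpath.normcase(left) == ntpath.normcase(right)


reported = r"c:\Users\bdrum\Development\python\pandas\pandas\tests\indexing\test_chaining_and_caching.py"
expected = r"C:\Users\bdrum\Development\python\pandas\pandas\tests\indexing\test_chaining_and_caching.py"

print(reported == expected)                   # plain string comparison
print(same_windows_path(reported, expected))  # case-insensitive comparison
```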

  • Ensure all linting tests pass, see here for how to run them
  • whatsnew entry already there

This is only part of the solution, intended to close the regression. I will create a separate issue that describes the broader SparseArray indices problem.

Current behavior, as expected in #45110:

import numpy as np
import pandas as pd

s = pd.arrays.SparseArray([1, 2, 3, 4, np.nan, np.nan], fill_value=np.nan)
s[s > 2]

# [3.0, 4.0]
# Fill: nan
# IntIndex
# Indices: array([0, 1])
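
To see why the indices matter, it can help to inspect the sparse index directly. The sketch below assumes a pandas version that includes this fix and uses the `sp_values` and `sp_index` attributes of `SparseArray`; the variable names `mask` and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.arrays.SparseArray([1, 2, 3, 4, np.nan, np.nan], fill_value=np.nan)

# Comparison with a scalar yields a boolean SparseArray.
mask = s > 2
# Masking keeps two elements, so the stored positions must be
# renumbered relative to the new, shorter array.
result = s[mask]

print(result.sp_values)         # stored (non-fill) values
print(result.sp_index.indices)  # positions of those values within `result`
print(len(result))
```

If the indices were carried over from the original array (2 and 3) instead of being recalculated, they would point past the end of the two-element result.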

@bdrum bdrum changed the title BUG: Operation with SparseArray returns SA with wrong indices BUG: Operations with SparseArray return SA with wrong indices Dec 30, 2021
@phofl
Member

phofl commented Dec 30, 2021

Have you added a test for #45110?

@bdrum
Contributor Author

bdrum commented Dec 30, 2021

I've added 12 tests for comparison with scalars. The bug you caught consists of two operations: masking the array and comparison with a scalar. In the previous version, SparseArray operators were tested via Series; now they are not. I decided that a combined test of exactly that kind is redundant.

@phofl
Member

phofl commented Dec 30, 2021

When you close issues, you should always add tests ensuring that they do not pop up again.

@bdrum
Contributor Author

bdrum commented Dec 30, 2021

Ok, let me add it
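
A regression test for the combined case could look roughly like the following. This is a hedged sketch, not the test that was added in this PR: the function name `check_mask_after_scalar_comparison` and the chosen thresholds are assumptions made for illustration, and it assumes a pandas version that includes this fix.

```python
import numpy as np
import pandas as pd


def check_mask_after_scalar_comparison(threshold: float) -> None:
    """Mask a SparseArray with a scalar comparison and validate the result."""
    dense = np.array([1, 2, 3, 4, np.nan, np.nan])
    sparse = pd.arrays.SparseArray(dense, fill_value=np.nan)

    result = sparse[sparse > threshold]
    expected = dense[dense > threshold]

    # The dense view must match what plain NumPy masking gives.
    np.testing.assert_array_equal(result.to_dense(), expected)
    # NaN never passes the comparison, so every kept element is a stored
    # value and the sparse index must be 0..len(result)-1.
    assert result.sp_index.indices.tolist() == list(range(len(expected)))


for threshold in (0, 2, 3):
    check_mask_after_scalar_comparison(threshold)
```

Checking the sparse index explicitly, and not only the dense values, is what ties the test to the reported issue.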

@jreback
Contributor

jreback commented Jan 3, 2022

@phofl if you can have a look

Member

@phofl phofl left a comment

Not really familiar with SparseArrays, but some small comments

@bdrum bdrum marked this pull request as draft January 4, 2022 08:39
@bdrum bdrum marked this pull request as ready for review January 4, 2022 15:57
@jreback
Contributor

jreback commented Jan 5, 2022

@jbrockmendel if you can have a look

@@ -26,6 +26,8 @@ def mix(request):
    return request.param


# FIXME: There are not SparseArray tests. There are numpy array tests.
Member

IIUC these are SparseArray tests, just not useful ones because they don't have non-trivial indexes? If that's correct, can you clarify this for future readers?
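
One way to picture the distinction: a SparseArray in which no element equals the fill value stores every position, so its index is simply 0..n-1 and index-handling bugs cannot show up. A small sketch, with illustrative variable names that do not come from the test file:

```python
import numpy as np
import pandas as pd

# No element equals the fill value: every position is stored.
trivial = pd.arrays.SparseArray([1.0, 2.0, 3.0], fill_value=np.nan)
# Gaps at positions 0 and 2: only positions 1 and 3 are stored.
gappy = pd.arrays.SparseArray([np.nan, 1.0, np.nan, 2.0], fill_value=np.nan)

print(trivial.sp_index.indices)
print(gappy.sp_index.indices)
```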

@jreback jreback added this to the 1.4 milestone Jan 8, 2022
@jreback
Contributor

jreback commented Jan 8, 2022

@jbrockmendel ok here?

@jbrockmendel
Member

LGTM.

@bdrum is there a gameplan for the two xfailed tests?

@jreback
Contributor

jreback commented Jan 8, 2022

ok merging this and can followup with the failed ones, @bdrum if you can create an issue for those, PR would be great as well :->

@jreback jreback merged commit ec79b2b into pandas-dev:master Jan 8, 2022
@jreback
Contributor

jreback commented Jan 8, 2022

@meeseeksdev backport 1.4.x

@lumberbot-app

lumberbot-app bot commented Jan 8, 2022

Something went wrong ... Please have a look at my logs.

@bdrum
Contributor Author

bdrum commented Jan 8, 2022

> LGTM.
>
> @bdrum is there a gameplan for the two xfailed tests?

@jbrockmendel I think so. Comparisons with arrays still work incorrectly; I'll try to fix that.

Labels
Bug Sparse Sparse Data Type
Development

Successfully merging this pull request may close these issues.

BUG: SparseArray doesn't recalculate indices after comparison with scalar
4 participants