-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Multiindexed series .at fix #32520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment has been minimized.
This comment has been minimized.
pandas/tests/indexing/test_scalar.py
Outdated
def test_multiindex_series_loc_set(): | ||
series = pd.Series([1, 2], index=[[1, 2], [3, 4]]) | ||
series.loc[1, 3] = 3 | ||
assert series.loc[1, 3] == 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont think the loc tests are needed, unless that behavior is being fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe @jreback requested loc to be added because the rest of the tests found in test_scalar test both at and loc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests comments, ping on green.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add a whatsnew note in 1.1, bugfixes for indexing
@jreback ping |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@rhshadrach can you rebase and ill take another look |
689f52d
to
26fbf7c
Compare
Hello @rhshadrach! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-05-18 20:00:11 UTC |
@jbrockmendel I've rebased and all checks pass. |
lgtm. can you merge master. cc @jbrockmendel merge when happy. |
aeea52d
to
e3bc8b9
Compare
lgtm. cc @jbrockmendel merge when ready. |
A few comments, generally looks very good |
e3bc8b9
to
bee54e7
Compare
@jbrockmendel Thanks for the comments. I've addressed all of them. I believe failing checks are unrelated, but I'm having a hard time understanding the error messages. |
LGTM. Yes, the azure failure is unrelated. To be on the safe side we usually try to sort these out before merging anything new. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just opened #33153 for a general API question that also impacts this PR.
So the fix to allow .at
indexing for a Series with MultiIndex is certainly a welcome fix, but I only question that it should also work if not all levels of the MultiIndex are specified (a behaviour that this PR also enables).
assert series.index.nlevels == 2 | ||
assert series.at[1, 3] == 1 | ||
assert series.loc[1, 3] == 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for not providing all MultiIndex levels? (so eg series.at[1]
).
And also a test with a duplicated label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And maybe also an equivalent test for a DataFrame (it seems to work, but would be good to ensure we also have an explicit test for that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure - I have it mostly done, but ran into a question. For the series
series = pd.Series([1, 2, 3], index=[[1, 2, 2], [3, 4, 4]])
i.e.
1 3 1
2 4 2
4 3
dtype: int64
what should be the output of series.at[1, 3]? I was expecting to just get the scalar 1, but currently I'm getting
1 3 1
dtype: int64
After thinking about this - perhaps what we have currently is the desired output. My reason for this is that the shape doesn't change depending on the input - (1, 3) vs (2, 4).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, that should be a scalar (but that is also the reason that I asked to test it, to be sure this is covered ànd to discuss the semantics we want)
It was not described in that issue, but this PR is still enabling it:
This fails on 0.25.3 and 1.0.3 |
Ah, but it seems this also already works on master, so it has been "accidentally" enabled due to some other refactoring, I suppose. So OK, indeed not directly related to your PR. |
bee54e7
to
67b8150
Compare
With this, I think we should leave as-is in regards to non-scalar results. These can be fixed or documented and then tested for after discussion in #33153. I added the tests for DataFrame though. Note that DataFrame.loc fails with an Index of tuples, so I did not add this test. |
@jorisvandenbossche changes made. Let me know if you agree with my assessment above. |
@rhshadrach can you merge master. @jorisvandenbossche ok here? |
@jorisvandenbossche Let me know if there are any more changes you'd like. |
@rhshadrach can you rebase I think this may make #32855 unnecessary |
@jbrockmendel - done. Seems to me the only intersection with the PR you mentioned is the one test dealing with Series there. This PR only deals with Series, where as most of that is dealing with DataFrames. |
@jorisvandenbossche @jreback I think this PR is all set. |
76f196a
to
f325e09
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. a doc issue, ping when green.
@jreback Note added back, green |
series = df["a"] | ||
assert series.index.nlevels == 2 | ||
series.at[1, 3] = 5 | ||
assert series.at[1, 3] == 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the behavior now on master? IOW is this an api change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current behavior on master is to raise ValueError for any of the at lines, behavior with the loc lines is the same as master. I would describe this as a bugfix and not an api change.
thanks @rhshadrach |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff