-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Redefine IndexOpsMixin.size, fix #25580. #25584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: HE, Tao <[email protected]>
Signed-off-by: HE, Tao <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #25584 +/- ##
==========================================
+ Coverage 91.26% 91.26% +<.01%
==========================================
Files 173 173
Lines 52966 52968 +2
==========================================
+ Hits 48337 48340 +3
+ Misses 4629 4628 -1
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #25584 +/- ##
===========================================
- Coverage 91.26% 41.71% -49.55%
===========================================
Files 173 173
Lines 52966 52968 +2
===========================================
- Hits 48337 22094 -26243
- Misses 4629 30874 +26245
Continue to review full report at Codecov.
|
Signed-off-by: HE, Tao <[email protected]>
@sighingnow can you avoid force pushing? Makes it a bit easier to review incrementally. It'd be good to include a release note for the bug this is fixing too. @jorisvandenbossche do you remember if we discussed a |
Noted, thanks @TomAugspurger. The |
Whoops, yes it should return an integer (I was mixing it up with shape). That said, in the name of keeping a smaller API footprint, I'm leaning towards redefining IndexOpsMixin.size to use I believe that this will work for MultiIndex (have to be careful about unused levels) def size(self):
return np.prod(self.remove_unused_levels().levshape) |
I agree to move redefine >>> a = pd.MultiIndex.from_arrays(np.array([[1,2,3], [4,5,6], [7,8,9]]))
>>> a.size
3
>>> np.prod(a.remove_unused_levels().levshape)
27 I just want to confirm whether current behavior of Edit: if redefine |
Oh, hmm that's surprising. But I don't think we can easily change it at this point... I guess that means that MultiIndex.size can continue to use IndexOpsMixin.size. |
Signed-off-by: HE, Tao <[email protected]>
+1 on for now not adding The
so a MultiIndex is regarded as a 1D object, not 2D (for sure can be confusing :-)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a test for the actual bug in size
? (because it was surfaced first through resample, but size
itself is also public API which currently fails)
pandas/core/base.py
Outdated
@@ -745,7 +745,7 @@ def nbytes(self): | |||
""" | |||
Return the number of bytes in the underlying data. | |||
""" | |||
return self._values.nbytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the wrong property that is being changed? (nbytes
, not size
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, will fix that soon.
doc/source/whatsnew/v0.24.2.rst
Outdated
@@ -31,6 +31,7 @@ Fixed Regressions | |||
- Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`) | |||
- Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`) | |||
- Fixed pip installing from source into an environment without NumPy (:issue:`25193`) | |||
- Fixed bug where :meth:`core.base.IndexOpsMixin.size` could not return correct result for :class:`api.extensions.ExtensionArray` (:issue:`25580`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's best to mention public API, so instead of core.base.IndexOpsMixin.size
, I would use Series.size
or Index.size
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
I think we should add .size to ea generally this has come a number of times (and to be honest it’s a trivial addition to the base class ) |
Signed-off-by: HE, Tao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this lgtm. small whatsnew change; ping on green.
doc/source/whatsnew/v0.24.2.rst
Outdated
@@ -31,6 +31,7 @@ Fixed Regressions | |||
- Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`) | |||
- Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`) | |||
- Fixed pip installing from source into an environment without NumPy (:issue:`25193`) | |||
- Fixed bug where :meth:`Index.size` could not return correct result for :class:`api.extensions.ExtensionArray` (:issue:`25580`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not correct, it raised previously ot 'returned an incorrect value'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is also a bug fix and not a regression, can you move to the appopriate section (bug fixes numeric is ok)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually move to resampling bug fixes is good (you could mention that in the whatsnew note)
lgtm. though something failing. |
It's the same sparse failure that should be fixed on master. |
@sighingnow thanks! |
MeeseeksDev doesn't seem to be picking this one up (and I also assume there will be a conflict in the whatsnew), so will do a manual backport |
…5584) Signed-off-by: HE, Tao <[email protected]> (cherry picked from commit 6375549)
* upstream/master: (110 commits) DOC: hardcode contributors for 0.24.x releases (pandas-dev#25662) DOC: restore toctree maxdepth (pandas-dev#25134) BUG: Redefine IndexOpsMixin.size, fix pandas-dev#25580. (pandas-dev#25584) BUG: to_csv line endings with compression (pandas-dev#25625) DOC: file obj for to_csv must be newline='' (pandas-dev#25624) Suppress incorrect warning in nargsort for timezone-aware DatetimeIndex (pandas-dev#25629) TST: fix incorrect sparse test (now failing on scipy master) (pandas-dev#25653) CLN: Removed debugging code (pandas-dev#25647) DOC: require Return section only if return is not None nor commentary (pandas-dev#25008) DOC:Remove hard-coded examples from _flex_doc_SERIES (pandas-dev#24589) (pandas-dev#25524) TST: xref pandas-dev#25630 (pandas-dev#25643) BUG: Fix pandas-dev#25481 by fixing the error message in TypeError (pandas-dev#25540) Fixturize tests/frame/test_mutate_columns.py (pandas-dev#25642) Fixturize tests/frame/test_join.py (pandas-dev#25639) Fixturize tests/frame/test_combine_concat.py (pandas-dev#25634) Fixturize tests/frame/test_asof.py (pandas-dev#25628) BUG: Fix user-facing AssertionError with to_html (pandas-dev#25608) (pandas-dev#25620) DOC: resolve all GL03 docstring validation errors (pandas-dev#25525) TST: failing wheel building on PY2 and old numpy (pandas-dev#25631) DOC: Remove makePanel from docs (pandas-dev#25609) (pandas-dev#25612) ...
@jorisvandenbossche many thanks for helping me get this PR merged into master! Sorry for the late response. BTW, I'm wondering if there were conflicts between pull request and master, how should I resolve that? Should I merge the master into the pull request, or rebase the pull request on top of master? |
@sighingnow you should merge master into the branch |
Noted, thanks! @WillAyd |
git diff upstream/master -u -- "*.py" | flake8 --diff