Implement take for EA mixins #23159

jbrockmendel · 2018-10-15T03:44:55Z

Also _concat_same_type and copy (both based on implementations in #22862)

closes #xxxx
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

pep8speaks · 2018-10-15T03:45:00Z

Hello @jbrockmendel! Thanks for submitting the PR.

There are no PEP8 issues in the file pandas/core/arrays/datetimelike.py !
There are no PEP8 issues in the file pandas/core/arrays/datetimes.py !
There are no PEP8 issues in the file pandas/core/arrays/period.py !
There are no PEP8 issues in the file pandas/core/arrays/timedeltas.py !
There are no PEP8 issues in the file pandas/tests/arrays/test_datetimelike.py !

jorisvandenbossche

Looks good to me in general (didn't look at the tests in detail), added some comments.

jorisvandenbossche · 2018-10-15T09:31:34Z

pandas/tests/arrays/test_datetimelike.py

+class SharedTests(object):
+    index_cls = None
+
+    def test_take(self):


I think such basic tests are already covered in the base extension tests?

No idea. I'm pretty sure the dtype-specific tests must be non-redundant.

jorisvandenbossche · 2018-10-15T09:33:51Z

pandas/core/arrays/datetimelike.py

+        return self._shallow_copy(new_values)
+
+    @classmethod
+    def _concat_same_type(cls, to_concat):


This is not really tested for now, is that correct?

Sorry, but I am -1 on moving things out of the PeriodArray PR that cannot stand on their own

Assuming you're implicitly saying you'd be OK with this if tests were implemented as part of this PR, I think this is an excellent comment.

No, that is not really what I meant :-)
Because now you added tests for something that would already be tested in the PeriodArray PR. We don't need explicit tests for this method, as it is tested through all existing methods for concat, and it already has explicit tests in the base extension tests.

So I just think this simply belongs in a PR where we actually make PeriodArray an ExtensionArray.
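For readers unfamiliar with the method under discussion, here is a minimal sketch of what a `_concat_same_type` classmethod does, and of the "test the end result" approach argued for above. `ToyArray` and `concat_arrays` are hypothetical stand-ins invented for illustration; this is not the code from this PR or from #22862, and real implementations operate on NumPy arrays rather than lists.

```python
# Toy model only: ToyArray and concat_arrays are hypothetical names,
# not pandas APIs.
class ToyArray:
    def __init__(self, values, freq):
        self.values = list(values)
        self.freq = freq

    @classmethod
    def _concat_same_type(cls, to_concat):
        # All inputs are assumed to share one freq; refuse anything else.
        freqs = {arr.freq for arr in to_concat}
        if len(freqs) != 1:
            raise ValueError("to_concat must contain arrays with one shared freq")
        (freq,) = freqs
        values = [v for arr in to_concat for v in arr.values]
        return cls(values, freq)


def concat_arrays(arrays):
    # Stand-in for a public concat entry point that dispatches to the
    # private classmethod; testing this exercises _concat_same_type too.
    return type(arrays[0])._concat_same_type(arrays)


left = ToyArray([1, 2], "D")
right = ToyArray([3], "D")
result = concat_arrays([left, right])
print(result.values, result.freq)
```

A test written against `concat_arrays` covers the classmethod indirectly, which is the sense in which a direct test of `_concat_same_type` would duplicate coverage.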

> that would already be tested in the PeriodArray PR.

The benefit of doing things in smaller chunks is that each chunk gets more focused. e.g. I don't want to throw shade on Tom's PR, but the take implementation missed fill_value that is a Period with non-matching freq. Doing a couple methods at a time lets both authors and reviewers focus more precisely.

And while "too many tests" does have a downside, it really isn't in the top 5 things I'm going to worry about today.
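To make the `fill_value` case concrete, here is a rough sketch of a `take` that validates a period-like `fill_value` against the array's freq before using it. `ToyPeriod`, `ToyPeriodArray`, `NAT`, and `_validate_fill_value` are all hypothetical names for illustration; the actual pandas implementation differs and works on int64 ordinals in NumPy arrays.

```python
# Toy model only: none of these names are pandas APIs.
from dataclasses import dataclass

NAT = -(2 ** 63)  # assumed sentinel ordinal meaning "missing"


@dataclass(frozen=True)
class ToyPeriod:
    ordinal: int
    freq: str


class ToyPeriodArray:
    def __init__(self, ordinals, freq):
        self.ordinals = list(ordinals)
        self.freq = freq

    def _validate_fill_value(self, fill_value):
        # None means "fill with missing"; a period must share our freq.
        if fill_value is None:
            return NAT
        if isinstance(fill_value, ToyPeriod):
            if fill_value.freq != self.freq:
                raise ValueError(
                    f"fill_value freq {fill_value.freq!r} does not match "
                    f"array freq {self.freq!r}"
                )
            return fill_value.ordinal
        raise TypeError(f"unsupported fill_value: {fill_value!r}")

    def take(self, indices, allow_fill=False, fill_value=None):
        if not allow_fill:
            # Plain positional take; negative indices count from the end.
            return ToyPeriodArray([self.ordinals[i] for i in indices], self.freq)
        fill = self._validate_fill_value(fill_value)
        n = len(self.ordinals)
        out = []
        for i in indices:
            if i == -1:
                out.append(fill)  # -1 marks a slot to be filled
            elif i < -1:
                raise ValueError("indices below -1 are invalid with allow_fill")
            elif i >= n:
                raise IndexError("index out of bounds")
            else:
                out.append(self.ordinals[i])
        return ToyPeriodArray(out, self.freq)


arr = ToyPeriodArray([10, 11, 12], "D")
filled = arr.take([0, -1], allow_fill=True, fill_value=ToyPeriod(99, "D"))
print(filled.ordinals)
```

Passing `ToyPeriod(99, "M")` as `fill_value` to the same call raises `ValueError` in this sketch, which is the mismatch case described above.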

> but the take implementation missed fill_value that is a Period with non-matching freq

And you can review the PR in detail (which of course takes time) and point that out there.

Yes, for sure smaller PRs are better, and to the extent possible, we should strive for that. I fully agree the huge PR makes it very difficult to do a proper review.

But pulling out chunks that cannot live on their own can also be hard to review. In this case it also means completely duplicating tests, and testing the implementation detail rather than the end result.

> it really isn't in the top 5 things I'm going to worry about today.

Agreeing on what the end result of this refactor will look like, and having a good view of that, is a worry for me. So what I am saying is that many small PRs also have the clear downside of making this harder.

jorisvandenbossche · 2018-10-15T09:45:28Z

pandas/core/arrays/datetimelike.py

@@ -211,6 +271,10 @@ def astype(self, dtype, copy=True):
    # ------------------------------------------------------------------
    # Null Handling

+    def isna(self):
+        # EA Interface
+        return self._isnan


Same comment as in the other PRs: I think we should keep this concept of _isnan on the Index (basically my question about it is: for the array classes, what are the advantages of having an _isnan instead of simply using isna() everywhere?)

(but since there are a lot of places here where it is used, I am fine with keeping _isnan itself in here until we do the actual split)

This is a reasonable view, but one that I'd like to put off dealing with for as long as possible.

The eventual move to composition is going to cause a mismatch between what is and is not cacheable (like _isnan is now on the Index subclasses). Since the EA subclasses actually use _isnan in arithmetic/comparison ops, there is a performance hit if we're not careful. Since I haven't thought of a good solution to this, putting it off is the next best thing.
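The caching mismatch can be illustrated with a small sketch. It assumes an immutable index-like wrapper that may cache its NA mask and an array class that recomputes the mask on every call. `ToyArray`, `ToyIndex`, `INAT`, and the `isna_calls` counter are hypothetical names for illustration, not pandas internals.

```python
# Toy model only: shows why a cached _isnan is cheaper than calling
# isna() repeatedly. None of these names are pandas APIs.
from functools import cached_property

INAT = -(2 ** 63)  # assumed sentinel for a missing datetime-like value


class ToyArray:
    def __init__(self, i8values):
        self.i8values = tuple(i8values)
        self.isna_calls = 0  # counts how often the mask is recomputed

    def isna(self):
        # Recomputed on every call: O(n) work each time.
        self.isna_calls += 1
        return [v == INAT for v in self.i8values]


class ToyIndex:
    """Immutable wrapper, so caching the mask is safe here."""

    def __init__(self, array):
        self._array = array

    @cached_property
    def _isnan(self):
        # Computed once, then stored on the instance.
        return self._array.isna()


arr = ToyArray([1, INAT, 3])
idx = ToyIndex(arr)
idx._isnan
idx._isnan
print(arr.isna_calls)  # the index triggered a single computation
```

If arithmetic and comparison ops on the array call `isna()` directly, each op pays for a fresh mask, which is the performance hit mentioned above.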

codecov · 2018-10-15T16:39:03Z

Codecov Report

Merging #23159 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #23159      +/-   ##
==========================================
+ Coverage   92.19%    92.2%   +<.01%     
==========================================
  Files         169      169              
  Lines       50959    51008      +49     
==========================================
+ Hits        46980    47030      +50     
+ Misses       3979     3978       -1

Flag	Coverage Δ
#multiple	`90.62% <100%> (+0.01%)`	⬆️
#single	`42.27% <24.07%> (-0.03%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/arrays/timedeltas.py	`94.2% <100%> (+0.23%)`	⬆️
pandas/core/arrays/datetimelike.py	`95.07% <100%> (+0.17%)`	⬆️
pandas/core/arrays/period.py	`95.7% <100%> (+0.13%)`	⬆️
pandas/core/arrays/datetimes.py	`97.48% <100%> (+0.1%)`	⬆️
pandas/tseries/offsets.py	`97.36% <0%> (+0.08%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 913f71f...23d2724.

jorisvandenbossche

To be clear / put it a bit more strongly: let's not merge any PR related to periods until we agree on the way forward (smaller like this one or Tom's PR), unless there is an explicit blessing of Tom.

jbrockmendel · 2018-10-18T02:00:08Z

Closing this. @TomAugspurger I think some of the "take" code/tests may be worth salvaging for the PeriodArray PR.

jorisvandenbossche · 2018-10-18T15:10:12Z

We could also leave this open, so you can rebase once the PeriodArray PR is merged?

jbrockmendel · 2018-10-18T15:26:50Z

> We could also leave this open, so you can rebase once the PeriodArray PR is merged?

I'd wait to re-open until it becomes actionable.

jorisvandenbossche · 2018-10-18T20:34:30Z

> but the take implementation missed fill_value that is a Period with non-matching freq

Is that fixed on Tom's PR, or is there an issue for it? Or is that already tested through existing tests?

TomAugspurger · 2018-10-18T21:36:20Z

Fixed locally. Haven't pushed yet.

jbrockmendel added 2 commits October 14, 2018 19:34

implement take for EA mixins

084cb34

remove unused import

0971615

jorisvandenbossche reviewed Oct 15, 2018

jbrockmendel added 2 commits October 15, 2018 07:59

un-implement copy

a83adad

Merge branch 'master' of https://github.com/pandas-dev/pandas into dl…

a16c900

…ike_ea

jbrockmendel mentioned this pull request Oct 15, 2018

REF: Make PeriodArray an ExtensionArray #22862

Merged

Avoid timezone-loss warning

755bc5c

jbrockmendel added 4 commits October 15, 2018 14:30

Tests for concat_same_type

50234e9

Merge branch 'master' of https://github.com/pandas-dev/pandas into dl…

16c361b

…ike_ea

use Index.take for Index subclasses

dcc985e

troubleshoot

a1fea06

jbrockmendel force-pushed the dlike_ea branch from fb4a6af to a1fea06 October 16, 2018 01:14

TomAugspurger mentioned this pull request Oct 16, 2018

Datetimelike Array Refactor #23185

Closed

jorisvandenbossche requested changes Oct 16, 2018

Merge branch 'master' of https://github.com/pandas-dev/pandas into dl…

23d2724

…ike_ea

jreback added the Indexing and ExtensionArray labels Oct 17, 2018

jbrockmendel closed this Oct 18, 2018
