BUG: Cannot create third-party ExtensionArrays for datetime types (xfail) #34987

xhochy · 2020-06-25T11:05:58Z

closes BUG: Cannot create third-party ExtensionArrays for datetime types #34986
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

xhochy · 2020-06-25T11:06:47Z

This is just the failing test for now. I'm happy to implement a fix if someone can tell me where it should go.

WillAyd · 2020-06-25T17:41:00Z

@jbrockmendel

jbrockmendel · 2020-06-26T01:46:16Z

pandas/tests/extension/arrow/test_timestamp.py

+from .arrays import ArrowTimestampUSArray  # isort:skip
+
+
+def test_constructor_extensionblock():


should be xfailed?

xhochy · 2020-06-29

I can xfail this, so this can be merged. I would prefer to fix this myself though.

xhochy · 2020-06-29

I just need a pointer at which code section I should apply a fix. Should I change the order in pandas/pandas/core/internals/blocks.py so that we only create a DatetimeTZBlock for pandas-provided datetime-based ExtensionArrays or shouldn't is_datetime64tz_dtype return True for my ExtensionDtype?

jbrockmendel · 2020-06-29

> I would prefer to fix this myself though.

Sounds good. Is this a use case you have a need to get working near-term, or more of a Principle Of The Thing? I ask because...

> I just need a pointer at which code section I should apply a fix.

This is pretty daunting, as I expect this is scattered across the code. There are lots of places where we either a) implicitly assume nanoseconds or b) check dtype.kind in ["M", "m"] (much more performant than the is_foo_dtype checks).
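To illustrate point (a), here is a minimal sketch of what goes wrong when code hard-codes nanoseconds. It uses only the Python standard library rather than pandas, and the helper `interpret_epoch` and the sample value are made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def interpret_epoch(value: int, unit: str) -> datetime:
    """Interpret an integer offset from the Unix epoch in the given unit."""
    if unit == "us":
        micros = value
    elif unit == "ns":
        # datetime only has microsecond resolution, so truncate.
        micros = value // 1000
    else:
        raise ValueError(f"unsupported unit: {unit!r}")
    return EPOCH + timedelta(microseconds=micros)


# A timestamp stored as *microseconds* since the epoch.
raw = 1_593_000_000_000_000

correct = interpret_epoch(raw, "us")
# What code that silently assumes nanoseconds would compute instead.
misread = interpret_epoch(raw, "ns")

print(correct.isoformat())  # 2020-06-24T12:00:00+00:00
print(misread.isoformat())  # 1970-01-19T10:30:00+00:00
```

The same integer lands fifty years apart depending on the assumed unit. A check on dtype.kind alone cannot catch this, because kind says "datetime-like" without saying which unit.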

> Should I change the order in pandas/pandas/core/internals/blocks.py so that we only create a DatetimeTZBlock for pandas-provided datetime-based ExtensionArrays

That will probably be part of a solution.

> or shouldn't is_datetime64tz_dtype return True for my ExtensionDtype?

I'd be very reticent to make that change, since I think a lot of code expects that to imply it's getting our DatetimeTZDtype. Maybe an is_3rd_party_ea_dtype that we would check for before checking for any 1st-party dtypes? That runs into the "ideally we should treat 3rd party EAs symmetrically with 1st-party" problems.
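The ordering question can be sketched with a toy model. This is not pandas' actual get_block_type: the two dtype classes and both `pick_block_*` functions are hypothetical stand-ins, and only the block names are taken from the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstPartyTZDtype:
    """Stand-in for pandas' own tz-aware datetime dtype."""

    kind: str = "M"


@dataclass(frozen=True)
class ThirdPartyTimestampDtype:
    """Stand-in for an external ExtensionDtype that also reports kind "M"."""

    kind: str = "M"


def pick_block_kind_first(dtype) -> str:
    # Problematic order: anything datetime-like is claimed by the
    # first-party block, including third-party dtypes.
    if dtype.kind == "M":
        return "DatetimeTZBlock"
    return "ExtensionBlock"


def pick_block_first_party_only(dtype) -> str:
    # Alternative order: only the first-party dtype gets the specialised
    # block; everything else falls through to the generic one.
    if isinstance(dtype, FirstPartyTZDtype):
        return "DatetimeTZBlock"
    return "ExtensionBlock"


for dtype in (FirstPartyTZDtype(), ThirdPartyTimestampDtype()):
    name = type(dtype).__name__
    print(name, pick_block_kind_first(dtype), pick_block_first_party_only(dtype))
```

In this sketch the kind-based check routes both dtypes to the same block, while the isinstance-based check separates them. The cost is the asymmetry between 1st-party and 3rd-party dtypes mentioned above.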

So getting back to the motivation: how high a priority is this?

One thing I can unambiguously encourage is more tests, even if xfailed:

what happens if you pass one of these to the DatetimeIndex constructor? vice-versa?

what happens if i do DatetimeIndex.astype(this_new_ea_dtype)?

addition/subtraction with the gamut of datetime/timedelta scalars/arrays we already support?

How does this behave if you stuff it inside a Categorical/CategoricalIndex?

xhochy · 2020-07-01

> > I would prefer to fix this myself though.
>
> Sounds good. Is this a use case you have a need to get working near-term, or more of a Principle Of The Thing? I ask because...

More in the next-6-months range, so I'm definitely going to add an xfail here, as the points below indicate that we should think this through rather than "fix quick".

I would love to have a nullable, non-nanosecond timestamp (actually I desperately need it, but e.g. having a performant string is more important to me), but there are several other places that either assume that all timestamps are nanoseconds or are backed by a numpy array, so this is going to be a major effort.

> > or shouldn't is_datetime64tz_dtype return True for my ExtensionDtype?
>
> I'd be very reticent to make that change, since I think a lot of code expects that to imply it's getting our DatetimeTZDtype. Maybe an is_3rd_party_ea_dtype that we would check for before checking for any 1st-party dtypes? That runs into the "ideally we should treat 3rd party EAs symmetrically with 1st-party" problems.
>
> So getting back to the motivation: how high a priority is this?

As already pointed out: less than other things I want to contribute to pandas, so xfailing and adding more (possibly) xfailing tests is the way to go.

jbrockmendel · 2020-07-02

> so xfailing and adding more (possibly) xfailing tests is the way to go.

Sounds good.

> actually I desperately need [...] that either assume that all timestamps are nanoseconds or backed by a numpy-array

Would your need be solved if we get numpy-backed non-nano in place? There's a reasonable chance of that happening in the next 6 months.

xhochy · 2020-07-02

> Would your need be solved if we get numpy-backed non-nano in place? There's a reasonable chance of that happening in the next 6 months.

For now: Yes.

jbrockmendel · 2020-07-08

> For now: Yes.

I'm slowly tackling this from the cython side of the code. The parallelizable step is to comb through the rest of the code to find all the places where we implicitly/explicitly assume nanos. I'd start with pandas/plotting and pandas/io.
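A rough way to start that combing step is a small text scan. The sketch below is only an illustration: the pattern list is an assumption about what a hard-coded nanosecond unit tends to look like in source code, it will miss implicit assumptions, and `find_nano_assumptions` is a hypothetical helper, not part of pandas.

```python
import re
from typing import List, Tuple

# Assumed spellings of an explicit nanosecond unit; extend as needed.
NANO_PATTERNS = [
    re.compile(r"datetime64\[ns\]"),
    re.compile(r"timedelta64\[ns\]"),
    re.compile(r"\b[Mm]8\[ns\]"),
]


def find_nano_assumptions(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_line) for lines matching any pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in NANO_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits


sample = '''\
values = arr.astype("datetime64[ns]")
delta = np.dtype("m8[ns]")
label = "nanoseconds are great"
'''

for lineno, line in find_nano_assumptions(sample):
    print(f"{lineno}: {line}")
```

To scan real files, the text of each module (for example, read with `pathlib.Path.read_text`) can be passed to the same function. Each hit still needs manual review to decide whether it is a real assumption or just a default.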

jbrockmendel · 2020-10-09

let's see if we can at least get this one working.

i think we'll need to edit the dtype.kind check in is_datetime64tz_dtype, and possibly the issubclass(vtype, np.datetime64) check in internals.blocks.get_block_type

xhochy · 2020-07-01T07:05:04Z

xfail added, CI is now happy.

jreback · 2020-07-02T17:02:02Z

pandas/tests/extension/arrow/arrays.py

@@ -67,6 +68,26 @@ def construct_array_type(cls) -> Type["ArrowStringArray"]:
        return ArrowStringArray


+@register_extension_dtype


can you put these in the test file for now as I am not sure we agree on these names (and they are just used for testing ATM).

xhochy · 2020-07-03

I can move them but I wanted to keep the dtype here as done for the other test-Arrow-dtypes.

xhochy · 2020-07-06

Done. CI passed except for the Docs job, but the warnings about missing sparse methods are unrelated to this PR.

jbrockmendel · 2020-10-09T15:25:22Z

can you merge master and we'll see if we can get this in

xhochy · 2020-10-14T12:56:38Z

@jbrockmendel Rebased and all green except one Windows job that timed out.

jbrockmendel · 2020-10-16T15:50:12Z

I think the edit to get_block_type in #34683 might fix the test that fails here. can you confirm? if that is fixed, presumably the rest of the EA test suite still needs to be enabled for this EA?

xhochy · 2020-11-11T20:56:27Z

> I think the edit to get_block_type in #34683 might fix the test that fails here. can you confirm? if that is fixed, presumably the rest of the EA test suite still needs to be enabled for this EA?

Yes, merging in #34683 fixes the test.

I'm not sure whether it would really be worth getting the full suite running for this test EA. It is basically here to check for the regression, but getting the whole suite to pass would be a lot more work that I don't see as worthwhile currently.

jbrockmendel · 2020-11-11T21:00:11Z

> I'm not sure whether it would really be worth getting the full suite running for this test EA. It is basically here to check for the regression, but getting the whole suite to pass would be a lot more work that I don't see as worthwhile currently.

totally reasonable. i guess we can merge this now and then if/when #34683 makes this pass we can revisit getting other bits working.

cc @jreback

jbrockmendel · 2020-12-09T16:24:54Z

@xhochy can you merge master, hopefully we'll get the CI green and can get this in

pep8speaks · 2020-12-09T16:29:07Z

Hello @xhochy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-01-14 13:50:47 UTC

jbrockmendel

LGTM

github-actions · 2021-01-14T00:48:17Z

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

xhochy · 2021-01-14T14:37:52Z

@jbrockmendel @jreback Rebased and removed xfail as it is working now.

jreback · 2021-01-14T15:40:28Z

pandas/tests/extension/arrow/test_timestamp.py

+import pandas as pd
+from pandas.api.extensions import ExtensionDtype, register_extension_dtype
+
+pytest.importorskip("pyarrow", minversion="0.13.0")


this could technically be later but ok for now

jreback · 2021-01-14T15:41:35Z

thanks @xhochy

xhochy mentioned this pull request Jun 25, 2020

BUG: Cannot create third-party ExtensionArrays for datetime types #34986

Closed

3 tasks

jbrockmendel reviewed Jun 26, 2020

jreback changed the title ~~BUG: Add failing unit test for GH#34986~~ BUG: Cannot create third-party ExtensionArrays for datetime types (xfail) Jul 2, 2020

jreback added the ExtensionArray label Jul 2, 2020

jreback requested changes Jul 2, 2020

xhochy force-pushed the issue-34986 branch from 8027bfc to c028455 Compare July 5, 2020 18:41

simonjayhawkins mentioned this pull request Sep 15, 2020

CI: Add stale PR action #36336

Merged

simonjayhawkins added the Needs Review label Sep 15, 2020

xhochy force-pushed the issue-34986 branch 3 times, most recently from fd562df to 6d92caa Compare October 14, 2020 11:26

jbrockmendel mentioned this pull request Oct 16, 2020

CLN: dont consolidate in reshape.concat #34683

Merged

xhochy force-pushed the issue-34986 branch from 6d92caa to 3262c2e Compare November 11, 2020 17:09

jbrockmendel reviewed Nov 11, 2020

xhochy force-pushed the issue-34986 branch from 3262c2e to 9a870d4 Compare December 9, 2020 16:29

xhochy force-pushed the issue-34986 branch 3 times, most recently from 877c401 to e393ca6 Compare December 10, 2020 08:21

jbrockmendel approved these changes Dec 10, 2020

simonjayhawkins removed the Needs Review label Dec 10, 2020

xhochy force-pushed the issue-34986 branch from e393ca6 to d8ba32c Compare December 14, 2020 08:43

github-actions bot added the Stale label Jan 14, 2021

BUG: Add failing unit test for GH#34986

caf6c68

xhochy force-pushed the issue-34986 branch from d8ba32c to caf6c68 Compare January 14, 2021 08:42

Remove xfail

64dd1d7

jreback added this to the 1.3 milestone Jan 14, 2021

jreback approved these changes Jan 14, 2021

jreback reviewed Jan 14, 2021

jreback merged commit de8fd00 into pandas-dev:master Jan 14, 2021

xhochy deleted the issue-34986 branch January 14, 2021 18:59

luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021

BUG: Cannot create third-party ExtensionArrays for datetime types (xf…

f74d2b6

…ail) (pandas-dev#34987)
