BUG: need better inference for path in Series construction (GH9456) #9924

patrickfournier · 2015-04-17T20:45:31Z

closes #9456.

- dict with datetime64 keys now working
- when isinstance(index, DatetimeIndex), use lib.fast_multiget correctly to avoid raising a TypeError exception

jreback · 2015-05-29T12:43:45Z

pandas/core/series.py

@@ -166,9 +166,9 @@ def __init__(self, data=None, index=None, dtype=None, name=None,
                    else:
                        index = Index(_try_sort(data))
                try:
-                    if isinstance(index, DatetimeIndex):
+                    if isinstance(index, DatetimeIndex) and lib.infer_dtype(data) != 'datetime64':


Find out why this path is hit in the first place; we can go from there.

Before I added .value at line 171, lib.fast_multiget was throwing a TypeError. Now the else clause must handle the conversion of datetime64 dicts, because a Timestamp and a datetime64 do not compare equal:

>>> d = {datetime.datetime(2015, 1, 7, 2, 0): 42544017.198965244}
>>> i = pandas.core.index.Index(d)
>>> i.astype('O').values
array([Timestamp('2015-01-07 02:00:00')], dtype=object)
>>> datetime.datetime(2015, 1, 7, 2, 0) == pandas.lib.Timestamp('2015-01-07 02:00:00')
True
>>> d = {numpy.datetime64('2015-01-06T19:00:00.000000000-0500'): 42544017.198965244}
>>> i = pandas.core.index.Index(d)
>>> i.astype('O').values
array([Timestamp('2015-01-07 00:00:00')], dtype=object)
>>> numpy.datetime64('2015-01-06T19:00:00.000000000-0500') == pandas.lib.Timestamp('2015-01-07 00:00:00')
False

That's not what I mean. Put a halt in there and see which test actually hits this. This is handling an edge case. Your fix just confuses things. I think there is a more general solution.

I ran the tests and reported the results in the issue: #9456.

Maybe a better fix would be to correctly handle the comparison between Timestamp and datetime64?

The idea is to have as few special cases as possible. You rarely actually want to convert a DatetimeIndex using .astype('O'); this is the least desirable result (though it may have been done for some reason). But it is tricky, because a user can pass Timestamp/datetime.datetime/datetime.date/np.datetime64 as elements, in a container that is not an Index but is index-like. Best bet is to use _ensure_index, which will convert list-like things to an Index (and possibly to a DatetimeIndex if it's compatible; if it's not, then all bets are off).
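As a rough illustration of the normalization idea above, the sketch below converts every dict key to pd.Timestamp before building a DatetimeIndex, so that lookups compare like with like. It uses only public pandas API; values_for_datetime_keys is a hypothetical helper written for this example, not _ensure_index and not the code in pandas/core/series.py.

```python
import datetime

import numpy as np
import pandas as pd


def values_for_datetime_keys(data):
    """Align dict values to a sorted DatetimeIndex built from the dict's keys.

    Hypothetical helper for illustration only.
    """
    # Convert every key to pd.Timestamp so all keys share one type.
    normalized = {pd.Timestamp(key): value for key, value in data.items()}
    index = pd.DatetimeIndex(sorted(normalized))
    # Iterating a DatetimeIndex yields pd.Timestamp objects, which now
    # match the normalized keys exactly.
    values = [normalized.get(ts, np.nan) for ts in index]
    return index, values


# Keys of three different datetime-like types.
data = {
    np.datetime64("2015-01-07T00:00:00"): 1.5,
    datetime.datetime(2015, 1, 7, 2, 0): 2.5,
    pd.Timestamp("2015-01-08"): 3.5,
}
index, values = values_for_datetime_keys(data)
series = pd.Series(values, index=index)
```

The trade-off is an extra pass over the keys, in exchange for not needing a special case per key type.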

- dict with datetime64 keys now working
- use lib.fast_multiget correctly to avoid unnecessary exceptions

patrickfournier · 2015-05-31T03:27:16Z

I rewrote the fix to be more generic. If the type of Index elements is the same as the type of data elements, we use the Index to access and copy the dict value. If not, we use the original index. See comments in diff.
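A minimal sketch of that type check, assuming the public pandas.api.types.infer_dtype behaves like the internal lib.infer_dtype used in the diff. keys_match_index is a hypothetical name, and passing list(data) rather than the dict itself is a choice made for this example.

```python
import datetime

import numpy as np
import pandas as pd
from pandas.api.types import infer_dtype


def keys_match_index(data, index):
    """Return True when the dict keys and the index values infer to the same dtype.

    Hypothetical helper; approximates the check with the public infer_dtype.
    """
    key_kind = infer_dtype(list(data), skipna=False)
    index_kind = infer_dtype(index.values, skipna=False)
    return key_kind == index_kind


index = pd.DatetimeIndex(["2015-01-07"])

# np.datetime64 keys infer as "datetime64", like the index's underlying array.
same = keys_match_index({np.datetime64("2015-01-07"): 1.0}, index)

# datetime.datetime keys infer as "datetime", so the two kinds differ.
different = keys_match_index({datetime.datetime(2015, 1, 7): 1.0}, index)
```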

patrickfournier · 2015-05-31T03:31:00Z

pandas/core/series.py

-                        data = [data.get(i, nan) for i in index]
+                    # lib.fast_multiget raises TypeError if type(data) != dict
+
+                    if lib.infer_dtype(data) == lib.infer_dtype(index.values):


Here I try to avoid a call to np.array() by checking if we can access data values using index.values. However I checked the code for infer_dtype and it may be quite complex. Maybe we should just go straight to the else?
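For reference, the two lookup paths in this hunk can be sketched in pure Python. multiget below is a hypothetical stand-in that only mimics the behavior noted in the diff comment (TypeError when the data is not a plain dict); it is not the real lib.fast_multiget.

```python
import math
from collections import OrderedDict


def multiget(mapping, keys, default=math.nan):
    """Bulk lookup that, like the diff comment describes, rejects non-dict input.

    Hypothetical stand-in for illustration only.
    """
    if type(mapping) is not dict:
        raise TypeError("mapping must be a plain dict")
    return [mapping.get(key, default) for key in keys]


def lookup_values(data, keys):
    """Try the bulk path first, then fall back to per-key .get() calls."""
    try:
        return multiget(data, keys)
    except TypeError:
        # Same fallback as the removed line: data.get(i, nan) for each key.
        return [data.get(key, math.nan) for key in keys]


plain = lookup_values({"a": 1, "b": 2}, ["a", "b"])
# An OrderedDict is not exactly `dict`, so this takes the fallback path.
fallback = lookup_values(OrderedDict(a=1), ["a", "z"])
```

Missing keys come back as NaN on both paths, so the caller does not need to know which one ran.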

jreback · 2015-06-04T13:54:54Z

This overlaps with #10269 (your PR attempts to fix Series, the other DataFrame).

jreback · 2015-06-10T20:02:06Z

superseded by #10269

thanks!

patrickfournier added 2 commits April 17, 2015 16:41

TST: Adding test for bug GH 9456

f883b79

BUG: GH 9456 Fixed Series.__init__ to better handle dict data

a934792

- dict with datetime64 keys now working
- when isinstance(index, DatetimeIndex), use lib.fast_multiget correctly to avoid raising a TypeError exception

jreback changed the title ~~Issue 9456~~ BUG: need better inference for path in Series construction (GH94560 Apr 17, 2015

sinhrks added the Bug, Datetime, and Dtype Conversions labels May 1, 2015

jreback reviewed May 29, 2015

jreback added this to the Next Major Release milestone May 29, 2015

patrickfournier added 2 commits May 30, 2015 22:38

Merge branch 'master' into issue-9456

04eeabe

BUG: GH 9456 Fixed Series.__init__ to better handle dict data

c59c4fe

- dict with datetime64 keys now working
- use lib.fast_multiget correctly to avoid unnecessary exceptions

patrickfournier reviewed May 31, 2015

patrickfournier changed the title ~~BUG: need better inference for path in Series construction (GH94560~~ BUG: need better inference for path in Series construction (GH9456) Jun 3, 2015

jreback mentioned this pull request Jun 4, 2015

BUG: GH10160 in DataFrame construction from dict with datetime64 index #10269

Closed

jreback closed this Jun 10, 2015

jreback modified the milestones: 0.16.2, Next Major Release Jun 10, 2015
