BUG: DataFrame.from_records with empty rec array #20806

TomAugspurger · 2018-04-24T02:30:19Z

keep the dtypes

Closes #20805
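For context, a minimal sketch of the behaviour this PR targets. The field names `a` and `b` are made up for illustration, and the resulting dtypes depend on the pandas version, so no output is asserted here.

```python
import numpy as np
import pandas as pd

# An empty structured (record) array with two typed fields.
# The field names and dtypes are illustrative, not taken from the issue.
data = np.empty(0, dtype=[("a", "i8"), ("b", "f8")])

df = pd.DataFrame.from_records(data)

# On affected versions the columns reportedly lose their dtypes;
# the aim of the PR is for them to match the record array's fields.
print(df.dtypes)
print(data.dtype)
```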

WillAyd · 2018-04-24T02:50:33Z

pandas/core/frame.py

@@ -7387,7 +7387,10 @@ def _to_arrays(data, columns, coerce_float=False, dtype=None):
        if isinstance(data, np.ndarray):
            columns = data.dtype.names
            if columns is not None:
-                return [[]] * len(columns), columns
+                arrays = [np.array([], dtype=dtype)
+                          for _, dtype in data.dtype.descr]


What's the reasoning for using the descr here instead of just the dtype? ref #20734 and the docs for dtype.descr. I think we may run into some obscure issues using the str representation of the type instead of the type directly.
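A rough sketch of the distinction raised in this comment, not the code that ended up in the PR. `empty_arrays_from_fields` is a hypothetical helper that looks up each field's dtype object by name, so it never parses the strings in `descr`. As far as I understand the NumPy docs, `descr` is meant for the array-interface protocol and can carry extra entries, such as padding for aligned dtypes.

```python
import numpy as np

# A plain structured dtype and an aligned one (both illustrative).
plain = np.dtype([("a", "i8"), ("b", "f8")])
padded = np.dtype([("a", "i1"), ("b", "i8")], align=True)


def empty_arrays_from_fields(dt):
    """Hypothetical helper: one empty array per named field of ``dt``."""
    # dt[name] returns the field's dtype object, not a string.
    return [np.array([], dtype=dt[name]) for name in dt.names]


for dt in (plain, padded):
    arrays = empty_arrays_from_fields(dt)
    print(dt.descr)                       # tuples holding dtype strings
    print([arr.dtype for arr in arrays])  # dtype objects, one per field
```

Indexing the dtype by field name keeps the result at exactly one array per column, regardless of what `descr` contains.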

WillAyd · 2018-04-24T02:52:19Z

pandas/tests/frame/test_constructors.py

        assert df.index.name == 'id'

+    def test_from_records_empty_dtypes(self):


Think we should cover more dtypes? Perhaps a good use case for a shared fixture?

can you add some more here?
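One way the broader coverage could look, sketched with NumPy only. `empty_field_arrays` and `FIELD_DTYPES` are hypothetical names, and the plain loop stands in for what would presumably be a `pytest.mark.parametrize` decorator or a shared fixture in the pandas test suite.

```python
import numpy as np

# Illustrative selection of field dtypes; the real test could use another list.
FIELD_DTYPES = ["i4", "i8", "f4", "f8", "bool", "M8[ns]", "m8[ns]", "O"]


def empty_field_arrays(data):
    """Hypothetical helper: one array per field, keeping each field's dtype."""
    return [data[name] for name in data.dtype.names]


def check_empty_dtypes(field_dtype):
    # Empty record array whose first field uses the dtype under test.
    data = np.empty(0, dtype=[("a", field_dtype), ("b", "f8")])
    arrays = empty_field_arrays(data)
    expected = [np.dtype(field_dtype), np.dtype("f8")]
    assert [arr.dtype for arr in arrays] == expected
    assert all(len(arr) == 0 for arr in arrays)


# Stand-in for parametrization: run the check once per dtype.
for field_dtype in FIELD_DTYPES:
    check_empty_dtypes(field_dtype)
```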

jreback · 2018-04-24T10:15:16Z

@WillAyd comments, and a lint issue. otherwise lgtm.

codecov · 2018-04-24T11:55:31Z

Codecov Report

Merging #20806 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #20806      +/-   ##
==========================================
+ Coverage   91.81%   91.83%   +0.02%     
==========================================
  Files         153      153              
  Lines       49318    49319       +1     
==========================================
+ Hits        45279    45294      +15     
+ Misses       4039     4025      -14

Flag	Coverage Δ
#multiple	`90.23% <100%> (+0.02%)`	⬆️
#single	`41.87% <0%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/frame.py	`97.17% <100%> (ø)`	⬆️
pandas/util/testing.py	`84.79% <0%> (+0.2%)`	⬆️
pandas/plotting/_converter.py	`66.81% <0%> (+1.73%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 41db527...29af436.

TomAugspurger · 2018-04-24T14:19:02Z

Hmm I can't reproduce the failures locally with those versions of python / NumPy.

May end up pushing this to a later release. The Series([]) dtype change (which this is for) isn't critical for 0.23.

jreback · 2018-06-04T21:46:45Z

doc/source/whatsnew/v0.23.0.txt

@@ -1092,6 +1092,7 @@ Numeric
 - Bug in :class:`Series` constructor with an int or float list where specifying ``dtype=str``, ``dtype='str'`` or ``dtype='U'`` failed to convert the data elements to strings (:issue:`16605`)
 - Bug in :class:`Index` multiplication and division methods where operating with a ``Series`` would return an ``Index`` object instead of a ``Series`` object (:issue:`19042`)
 - Bug in the :class:`DataFrame` constructor in which data containing very large positive or very large negative numbers was causing ``OverflowError`` (:issue:`18584`)
+- Bug in the :meth:`DataFrame.from_records` constructor losing the dtypes of an empty NumPy record array (:issue:`20805`)


move to 0.23.1

jreback · 2018-06-04T21:47:16Z

pandas/tests/frame/test_constructors.py

        assert df.index.name == 'id'

+    def test_from_records_empty_dtypes(self):


can you add some more here?

jreback · 2018-09-25T16:43:17Z

can you rebase

jreback · 2018-11-23T03:31:57Z

closing as stale. if you'd like to continue, pls ping.

BUG: DataFrame.from_records with empty rec array

b46b3ef

keep the dtypes Closes pandas-dev#20805

TomAugspurger added the Dtype Conversions Unexpected or buggy dtype conversions label Apr 24, 2018

TomAugspurger added this to the 0.23.0 milestone Apr 24, 2018

WillAyd reviewed Apr 24, 2018

Fixups

29af436

- Use dtypes - lint

TomAugspurger modified the milestones: 0.23.0, 0.23.1 Apr 25, 2018

jreback requested changes Jun 4, 2018

jreback removed this from the 0.23.1 milestone Jun 4, 2018

jreback closed this Nov 23, 2018
