BUG: DataFrame.from_records with empty rec array #20806

Closed

TomAugspurger
Contributor

@TomAugspurger TomAugspurger commented Apr 24, 2018

keep the dtypes

Closes #20805

@TomAugspurger TomAugspurger added the Dtype Conversions Unexpected or buggy dtype conversions label Apr 24, 2018
@TomAugspurger TomAugspurger added this to the 0.23.0 milestone Apr 24, 2018
@@ -7387,7 +7387,10 @@ def _to_arrays(data, columns, coerce_float=False, dtype=None):
     if isinstance(data, np.ndarray):
         columns = data.dtype.names
         if columns is not None:
-            return [[]] * len(columns), columns
+            arrays = [np.array([], dtype=dtype)
+                      for _, dtype in data.dtype.descr]
Member

What's the reasoning for using the descr here instead of just the dtype? ref #20734 and the docs for dtype.descr. I think we may run into some obscure issues using the str representation of the type instead of the type directly.
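
To make the concern concrete, here is a minimal sketch that builds the empty per-field arrays from the field dtypes themselves rather than from `dtype.descr`. The helper name `empty_arrays_from_fields` and the sample dtypes are hypothetical and are not taken from the PR. The second half shows one case where `descr` stops lining up with `dtype.names`: an aligned struct, where `descr` contains extra unnamed padding entries.

```python
import numpy as np


def empty_arrays_from_fields(data):
    """Return one empty array per field, typed from the field's np.dtype.

    Hypothetical helper for illustration only; not the code in the PR.
    """
    names = data.dtype.names
    # Indexing a structured dtype by field name returns that field's np.dtype.
    arrays = [np.array([], dtype=data.dtype[name]) for name in names]
    return arrays, names


# A plain record dtype: each field keeps its own dtype.
plain = np.empty(0, dtype=[("a", "i8"), ("b", "f4")])
arrays, names = empty_arrays_from_fields(plain)
print([arr.dtype for arr in arrays])

# An aligned dtype: descr gains an unnamed padding entry, so it no longer
# matches dtype.names one-to-one.
aligned = np.dtype([("a", "i1"), ("b", "i8")], align=True)
print(aligned.names)
print(aligned.descr)
```

Iterating over `dtype.names` keeps the result the same length as `columns`, whatever the struct layout is.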

assert df.index.name == 'id'

def test_from_records_empty_dtypes(self):
Member

Think we should cover more dtypes? Perhaps a good use case for a shared fixture?

Contributor

can you add some more here?
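
One possible way to enumerate more cases is sketched below. It assumes a list of field dtypes chosen purely for illustration (the PR does not say which dtypes the reviewers had in mind), and both helper names are hypothetical. It uses NumPy only. In the real test suite the empty record array would presumably be passed to `DataFrame.from_records` and the resulting column dtypes compared against the expected mapping, for example through a parametrized test or a shared fixture.

```python
import numpy as np

# Hypothetical set of field dtypes to cover; not taken from the PR.
FIELD_DTYPES = ["i8", "u1", "f4", "f8", "bool", "M8[ns]", "m8[ns]", "O"]


def make_empty_records(field_dtypes):
    """Build an empty structured array with one field per requested dtype."""
    spec = [("col{}".format(i), dt) for i, dt in enumerate(field_dtypes)]
    return np.empty(0, dtype=spec)


def expected_dtypes(records):
    """Map each field name to the np.dtype a dtype-preserving constructor should keep."""
    return {name: records.dtype[name] for name in records.dtype.names}


records = make_empty_records(FIELD_DTYPES)
for name, dtype in expected_dtypes(records).items():
    print(name, dtype)
```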

@jreback
Contributor

jreback commented Apr 24, 2018

@WillAyd comments, and a lint issue. otherwise lgtm.

- Use dtypes
- lint
@codecov

codecov bot commented Apr 24, 2018

Codecov Report

Merging #20806 into master will increase coverage by 0.02%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master   #20806      +/-   ##
==========================================
+ Coverage   91.81%   91.83%   +0.02%     
==========================================
  Files         153      153              
  Lines       49318    49319       +1     
==========================================
+ Hits        45279    45294      +15     
+ Misses       4039     4025      -14

Flag         Coverage Δ
#multiple    90.23% <100%> (+0.02%) ⬆️
#single      41.87% <0%> (-0.01%) ⬇️

Impacted Files                   Coverage Δ
pandas/core/frame.py             97.17% <100%> (ø) ⬆️
pandas/util/testing.py           84.79% <0%> (+0.2%) ⬆️
pandas/plotting/_converter.py    66.81% <0%> (+1.73%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 41db527...29af436. Read the comment docs.

@TomAugspurger
Contributor Author

Hmm, I can't reproduce the failures locally with those versions of Python / NumPy.

May end up pushing this to a later release. The Series([]) dtype change (which this is for) isn't critical for 0.23.

@TomAugspurger TomAugspurger modified the milestones: 0.23.0, 0.23.1 Apr 25, 2018
@@ -1092,6 +1092,7 @@ Numeric
- Bug in :class:`Series` constructor with an int or float list where specifying ``dtype=str``, ``dtype='str'`` or ``dtype='U'`` failed to convert the data elements to strings (:issue:`16605`)
- Bug in :class:`Index` multiplication and division methods where operating with a ``Series`` would return an ``Index`` object instead of a ``Series`` object (:issue:`19042`)
- Bug in the :class:`DataFrame` constructor in which data containing very large positive or very large negative numbers was causing ``OverflowError`` (:issue:`18584`)
- Bug in the :meth:`DataFrame.from_records` constructor losing the dtypes of an empty NumPy record array (:issue:`20805`)
Contributor

move to 0.23.1

@jreback jreback removed this from the 0.23.1 milestone Jun 4, 2018
@jreback
Contributor

jreback commented Sep 25, 2018

can you rebase

@jreback
Contributor

jreback commented Nov 23, 2018

closing as stale. if you'd like to continue, pls ping.

@jreback jreback closed this Nov 23, 2018