BUG?: namedtuples fields not checked on DataFrame constructor #27329

jorisvandenbossche · 2019-07-10T21:36:47Z

I personally find it very surprising that, if the first element is a namedtuple, we assume all elements of the list are instances of exactly the same namedtuple type:

In [29]: Record1 = namedtuple('Record1', ('b','a'))

In [30]: Record2 = namedtuple('Record2', ('a','c'))

In [31]: records = [Record1(1, 2), Record2(3, 4)]

In [32]: pd.DataFrame(records)
Out[32]:
   b  a
0  1  2
1  3  4

So the field names of the subsequent namedtuples are simply ignored.

A list in which all tuples have the same type might be the most typical case, but silently skipping this check does not seem like a safe default behaviour.
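To make the missing check concrete, here is a minimal sketch of what validating the fields up front could look like. The helper name `check_same_fields` is hypothetical and is not part of pandas; it only compares the `_fields` attribute of each record against the first one.

```python
from collections import namedtuple


def check_same_fields(records):
    """Return the shared field names, or raise ValueError on a mismatch.

    Hypothetical helper for illustration only; not a pandas function.
    """
    if not records:
        return ()
    expected = records[0]._fields
    for i, rec in enumerate(records[1:], start=1):
        # Plain tuples have no _fields attribute, so they count as a mismatch.
        fields = getattr(rec, "_fields", None)
        if fields != expected:
            raise ValueError(
                f"record {i} has fields {fields!r}, expected {expected!r}"
            )
    return expected


Record1 = namedtuple("Record1", ("b", "a"))
Record2 = namedtuple("Record2", ("a", "c"))
```

With this in place, `check_same_fields([Record1(1, 2), Record2(3, 4)])` raises instead of silently reusing the column names `b` and `a` for the second row.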


ghost · 2019-07-11T01:20:25Z

#11416 (comment)

jorisvandenbossche · 2019-07-11T01:57:10Z

Thanks for digging that up. Now, we do this "expensive" check for Series and dicts, though, so I think we should also do it for namedtuples. Also, I suppose the cythonized fast_unique_multiple_list_gen that is used to union the dict keys will be faster than what was tested there.
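As a rough illustration of treating namedtuples the way dicts are treated, the sketch below unions the field names in first-seen order and aligns each record to that union. It is a pure-Python stand-in written for this discussion, not the cythonized pandas routine mentioned above, and the names `union_fields` and `to_aligned_rows` are hypothetical.

```python
from collections import namedtuple


def union_fields(records):
    """Collect field names across all records, in first-seen order."""
    seen = {}  # dicts preserve insertion order on Python 3.7+
    for rec in records:
        for name in rec._fields:
            seen.setdefault(name, None)
    return list(seen)


def to_aligned_rows(records, fill=None):
    """Align every record to the union of field names, filling the gaps."""
    columns = union_fields(records)
    rows = []
    for rec in records:
        as_dict = rec._asdict()  # maps field name -> value
        rows.append([as_dict.get(name, fill) for name in columns])
    return columns, rows


Record1 = namedtuple("Record1", ("b", "a"))
Record2 = namedtuple("Record2", ("a", "c"))
columns, rows = to_aligned_rows([Record1(1, 2), Record2(3, 4)])
```

The resulting `columns` and `rows` could then be passed on as `pd.DataFrame(rows, columns=columns)`, so that values end up under the field name they were declared with rather than by position.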

jorisvandenbossche mentioned this issue Jul 10, 2019

ENH: Preserve key order when passing list of dicts to DataFrame on py 3.6+ #27309

Merged


jorisvandenbossche added the Bug label Jul 10, 2019

jorisvandenbossche changed the title ~~BUG: namedtuples fields not checked on DataFrame constructor~~ BUG?: namedtuples fields not checked on DataFrame constructor Jul 10, 2019

This was referenced Jul 18, 2019

WIP: treat list of namedtuples like list of dict in DataFrame() #27447

Closed

ENH: treat list of namedtuples like list of dict in DataFrame() #27494

Closed

jbrockmendel added the Constructors Series/DataFrame/Index/pd.array Constructors label Jul 23, 2019

jorisvandenbossche mentioned this issue Aug 20, 2019

ENH: Add support for dataclasses in the DataFrame constructor #27999

Merged

