I personally find it very surprising that, if the first element of a list is a namedtuple, we assume every element of the list is an instance of exactly the same namedtuple type:
In [29]: Record1 = namedtuple('Record1', ('b','a'))
In [30]: Record2 = namedtuple('Record2', ('a','c'))
In [31]: records = [Record1(1, 2), Record2(3, 4)]
In [32]: pd.DataFrame(records)
Out[32]:
   b  a
0  1  2
1  3  4
So the field names of the subsequent namedtuples are simply ignored.
A list in which every element is the same namedtuple type is probably the most typical case, but skipping this check by default doesn't seem very safe.
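To make the missing check concrete, here is a minimal sketch of what such a validation could look like in plain Python. The helper name `check_same_fields` is hypothetical and is not part of pandas; it only compares each record's `_fields` against those of the first record.

```python
from collections import namedtuple


def check_same_fields(records):
    """Raise ValueError unless every record has the same _fields as the first.

    Hypothetical helper for illustration only; not a pandas API.
    """
    if not records:
        return
    expected = records[0]._fields
    for position, record in enumerate(records):
        # Plain tuples have no _fields attribute, so they also count as a mismatch.
        fields = getattr(record, "_fields", None)
        if fields != expected:
            raise ValueError(
                f"record {position} has fields {fields!r}, expected {expected!r}"
            )


Record1 = namedtuple("Record1", ("b", "a"))
Record2 = namedtuple("Record2", ("a", "c"))

# Homogeneous input passes silently.
check_same_fields([Record1(1, 2), Record1(3, 4)])

# Mixed input, as in the example above, is rejected instead of being misaligned.
try:
    check_same_fields([Record1(1, 2), Record2(3, 4)])
except ValueError as exc:
    mismatch_message = str(exc)
```

Raising is only one possible policy; whether the constructor should raise or align by field name instead is exactly the open question here.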
jorisvandenbossche changed the title from "BUG: namedtuples fields not checked on DataFrame constructor" to "BUG?: namedtuples fields not checked on DataFrame constructor" on Jul 10, 2019
Thanks for digging that up. Now, we do this "expensive" check for Series and dicts, though, so I think we should also do it for namedtuples. Also, I suppose the cythonized fast_unique_multiple_list_gen that is used to union the dict keys will be faster than what was tested there.
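As a rough illustration of aligning namedtuples by field name, the way dict keys are unioned, here is a pure-Python sketch. It is a stand-in under stated assumptions, not the pandas implementation and not `fast_unique_multiple_list_gen` itself; the names `union_of_fields` and `align_records` are hypothetical.

```python
from collections import namedtuple


def union_of_fields(records):
    """Return every field name found in records, in first-seen order."""
    seen = {}  # dicts preserve insertion order, so this works as an ordered set
    for record in records:
        for name in record._fields:
            seen.setdefault(name, None)
    return list(seen)


def align_records(records, missing=None):
    """Lay each record out against the union of field names.

    Fields a record does not define are filled with `missing`.
    """
    columns = union_of_fields(records)
    rows = [
        [record._asdict().get(name, missing) for name in columns]
        for record in records
    ]
    return columns, rows


Record1 = namedtuple("Record1", ("b", "a"))
Record2 = namedtuple("Record2", ("a", "c"))

columns, rows = align_records([Record1(1, 2), Record2(3, 4)])
# columns == ['b', 'a', 'c']
# rows == [[1, 2, None], [None, 3, 4]]
```

With this layout the value `3` lands under `a` rather than under `b`, at the cost of one extra pass over all records to collect the field names.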