BUG: don't lose dtypes when concatenating empty array-likes #5742
this is fine, pls add a release notes entry (use this PR number as the issue number); you can add it to bug_fixes at the end
@@ -11806,6 +11806,23 @@ def test_to_csv_date_format(self):

        assert_frame_equal(test, nat_frame)

    def test_concat_empty_accounts_dtypes(self):
can't use the name accounts for this. test_concat_empty_dataframe_dtypes is fine.
Done, and squashed all the commits into a single one.
@jreback here you go. UPD: doesn't work; I guess simply merging it was too easy to be the solution :) Will see to it later.
@immerrr thanks
The test fails because Is there a reason I don't see behind this decision?
concat single item is only called when you are appending different dtypes in a single column (which is generally odd).
This is quite tricky because you don't want to automatically cast to object (which is in general the result type for pretty much any object combined with anything else), since you can sometimes cast to a more appropriate dtype. For example, if you have bool and the other frame is empty, you would be ok no matter its dtype, so this may have to be handled a bit like I do datetime/timedelta.
Unlike datelikes, you cannot have a NaN in a non-empty bool column: you can only append bools to bools, or bools to an empty frame. You can make a case for allowing appending with uint8, but I would not allow it.
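The rule described in this comment (ignore empty pieces, keep the dtype when everything that actually holds data agrees, otherwise fall back to object) could be sketched roughly as below. This is only an illustration under stated assumptions: `pick_concat_dtype` is a hypothetical helper written for this thread, not the pandas internals, and it does not cover the datetime/timedelta special-casing.

```python
import numpy as np


def pick_concat_dtype(arrays):
    """Hypothetical helper (not pandas internals): choose a result dtype
    for concatenating `arrays` while ignoring empty pieces."""
    # Only arrays that actually hold data get a vote on the dtype.
    non_empty = [a.dtype for a in arrays if a.size > 0]
    # If every piece is empty, fall back to looking at all of them.
    candidates = non_empty or [a.dtype for a in arrays]
    unique = set(candidates)
    if len(unique) == 1:
        # Everything that matters agrees, so the dtype is preserved.
        return unique.pop()
    # Mixed dtypes (e.g. bool with uint8): no implicit numeric coercion.
    return np.dtype(object)


bools = np.array([True, False])
empty_floats = np.array([], dtype=np.float64)
small_ints = np.array([1, 2], dtype=np.uint8)

print(pick_concat_dtype([bools, empty_floats]))         # bool survives an empty piece
print(pick_concat_dtype([bools, small_ints]))           # bool + uint8 is not coerced
print(pick_concat_dtype([empty_floats, empty_floats]))  # all-empty keeps float64
```

Falling back to object for bool plus uint8 mirrors the "I would not allow it" stance above; a more permissive policy would swap that branch for a numeric promotion.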
I am not a big fan of coercing bools to numeric either. You could put this in, but again it requires some special logic (e.g. if all types can be cast to numeric and you have bools but no datelikes, then it is probably ok).
I'm -1 on having pandas internals coerce bool to unsigned right now. We haven't built up particularly good support for unsigned ints yet.
Ok, it feels like boolean coercion itself is worth a discussion that won't fit here, so let's skip that. Now, to the issue.
can you rebase and move the release notes to 0.13.1?
Sure
can you just squash this down to 1 commit, thanks. Otherwise looks fine.
…mpty-arraylikes BUG: don't lose dtypes when concatenating empty array-likes
thanks!
I develop an application that does quite a bit of data manipulation. Being aware of pandas being functional-but-not-really-heavily-optimized, I use it to maintain label consistency and for grouping/merging data; heavy-duty maths is usually done with numpy ufuncs. The application contains entities that have no data at the beginning and receive data over their lifetimes. Every once in a while an incoming data chunk will contain no data for a certain entity. Usually it's fine, but if the entity was just created, concatenating the empty pieces loses their dtype. After that, ufuncs like isnan cease to work on data.values since its dtype has changed to object. This PR fixes it.
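A minimal sketch of the scenario described above, assuming a single float64 column named value (the column name and the `make_empty_entity` helper are made up for illustration and are not taken from the PR). On a pandas version that includes this fix, the concatenated result should keep its float dtype, so isnan keeps working; the last part shows why an object dtype is a problem for the ufunc.

```python
import numpy as np
import pandas as pd


def make_empty_entity(columns=("value",), dtype="float64"):
    """Hypothetical helper: an entity with typed columns but no rows yet."""
    return pd.DataFrame({name: pd.Series(dtype=dtype) for name in columns})


entity = make_empty_entity()       # freshly created entity, no data yet
empty_chunk = make_empty_entity()  # incoming chunk with no rows for it

# Concatenating two empty frames should not change the column dtype.
data = pd.concat([entity, empty_chunk])
print(data.dtypes)

# With a float dtype the ufunc works, even on zero rows.
mask = np.isnan(data.values)

# With an object dtype the same ufunc raises TypeError, which is the
# failure mode described in the PR text.
object_values = np.array([1.0, np.nan], dtype=object)
try:
    np.isnan(object_values)
    isnan_failed_on_object = False
except TypeError:
    isnan_failed_on_object = True
print("isnan fails on object dtype:", isnan_failed_on_object)
```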