-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Bug in DataFrame construction with nulls and datetimes in a list like #15892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/_libs/src/inference.pyx
Outdated
object v | ||
|
||
for i in range(n): | ||
v = arr[i] | ||
if util.is_string_object(v): | ||
continue | ||
seen_string = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is seen_string
supposed to be used somewhere?
actually this is non-performant. The issue is that it ends up looking at all of the strings, when we need to just test a couple to see if they are convertible. Not to mention this fails a couple of tests that have to do with us preserving |
pandas/_libs/src/inference.pyx
Outdated
elif is_timedelta(v): | ||
seen_timedelta=1 | ||
# nan or None | ||
seen_null = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, is seen_null
supposed to be used somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no these were tries.......
I think this is fixed. boy user convenience is a PITA. |
ok I think its fixed now. |
As long as the test/benchmarks pass, then LGTM. This was a bigger effort than I thought it would be. |
@chrisaycock yeah a lot of this lower-level inference code is reasonable as we have quite a few integration tests, so refactoring stuff that I wrote years ago is pretty straightforward. But of course there are lots of cases and easy to break stuff (which the tests catch). Down the road we will will be able to simplify things. For example Further I have to think about whether we still want things like:
to work. It is reasonable & not too hard. this is actually a bigger issue:
as this is somewhat arbitrary. but this is for another time. |
Codecov Report
@@ Coverage Diff @@
## master #15892 +/- ##
==========================================
- Coverage 90.97% 90.97% -0.01%
==========================================
Files 145 145
Lines 49491 49495 +4
==========================================
+ Hits 45023 45026 +3
- Misses 4468 4469 +1
Continue to review full report at Codecov.
|
closes #15869