-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG/API: to_timedelta unit-argument ignored for string input #12136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
this is a bug. The unit argument only matters when you don't have units for a particular entry. separately, I think we might want an error if you have mixed with unit specifiers and unit is specified e.g.
the first has a default of want to do a pull-request? |
I don't promise anything, but I can take a look at the code base and see if I can get my head around what needs to be done. |
gr8. contributing docs are here post any questions on this issue. |
Add test for pandas-dev#12136
I finally got around to looking at this. I noticed @duncanwp made some commits targeting this issue. Is that enough to solve it? |
@belteshassar you can certainly start with that. No pull-request yet. |
No, sorry. I thought about passing the unit string into parse_timedelta_string deep down on tslib.pyx explicitly and made a bit of progress, but it's not finished. Feel free to pick it up or disregard as you like, I don't think I'll have time to finish it. |
While working on this, I've found another issue Edit: Was a bit more complicated than I thought. Posted it as #12691 instead. |
I think you can go ahead and add, I don't think there is a reason. Further, you can reorg the tests a bit: see here, and make several more test functions (as the current one is getting long). The real diff for month,years is they are fixed-length units (because they are not relative to things like offsets), so while accepted are not that useful. |
I've been thinking a bit about this and would like some more input. As I see it, we have at least five cases that could appear:
I lean against either raising |
I would accept 1), raise on the rest. |
Neat. I was unaware of |
ok, you could enhance |
Sounds good. That was my thought as well. I suppose the ...and tests, indeed. |
@belteshassar if you want to come back to this would be great! |
I might be able to get back to it during next week. Or I guess I could push what I have so far if someone else wants to take over. |
We should not raise if
So, only "if" is an integer (actually number, more accurately considering the implementation). And in fact the current behavior (we don't want to break backwards compatibility) is for the above timedelta to return 12 seconds, not raise. So for strings with pure numbers I would suggest just trying to convert the
If anyone else agrees I could make a PR for this. |
@Sup3rGeo no this should raise as it is a specification error (IOW raise a ValueError). We don't want to guess here, the user specified that the integer is in a certain unit. If its not a number this is a problem. |
@jreback to make things clearer, I think we might be talking about two different things here.
2 - How to make
Please let me know if I misunderstood anything we are discussing. |
Also related to #16791 |
@Sup3rGeo I think the docs are too literal about Changing |
@TomAugspurger, I share the same opinion but I understood that @jreback was against it. Also, when you say changing |
Is it needed that we treat strings parsed as integers? (in contrast to strings with a unit specification inside the string like |
However, if we do this consistently we should probably also disallow |
Copying here the examples of the duplicate issues I closed: In : pd.to_timedelta(123456789, unit='s')
Out: Timedelta('1428 days 21:33:09') # ok
In : pd.to_timedelta('123456789', unit='s')
Out: Timedelta('0 days 00:00:00.123456') # oops!
In : pd.to_timedelta('123.45', unit='s')
Out: ValueError('no units specified') # ??? If a string is passed that is a proper integer, it is interpreted as nanoseconds regardless the
|
So the fact that the keyword is ignored is for sure a bug we should fix. The question for me is then: should we actually allow strings to be interpreted as integers? The docs clearly say that If we do allow it, I would in any case propose to require a unit, and not silently assume 'ns' (similar proposal for |
Ah, there is the PR #23025 working on this that is actually doing this |
So, I don't know if this is by design, but it sure confused me.
to_timedelta()
operates differently on the number1000
compared to the string'1000'
:and
while
I've tried my best to come up with a reason for why this behaviour would be desirable, but I can't think of any.
The text was updated successfully, but these errors were encountered: