-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
TST7337: Fix test failures on windows. #7362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
passed 26-32/26-64 builds so far! looking good |
Fingers crossed! |
ok...good news and bad news... py 2 looks good all fixed
|
The tests with >>>from dateutil.tz import gettz
>>>gettz('Europe/London')._filename
# python 2
'Europe/London'
# python 3
'c:\\python33-32\\lib\\site-packages\\dateutil\\zoneinfo\\zoneinfo--latest.tar.gz' In python 3, you get the same filename for all timezones. We follow the dateutil package and rely on the filename for serializing timezones. Obviously if they all report the same filename then this isn't going to work. The change in behaviour is actually due to a change in the tarfile package from the standard library. When you extract a file from a tar archive in python 3, it's named for the archive, in python 2 for the filename you are extracting. This covers all but 3 of the test failures but I think this is a bug in tarfile for python 3. |
I hacked So I think we can say that all of these failures are related to the timezone/filename issue. @jreback I'm not sure how best to solve this. Ideally we'd fix the I can't think of a good workaround in pandas for this - there's always the possibility that a user constructs a 'bad' Some less-than-good suggestions:
I'll have a play with these but let me know what you think. |
what do you DO with the doesn't |
Use it to identify the timezone.
In [14]: from dateutil.tz import gettz
In [15]: tz = gettz('Europe/London')
In [18]: tz.tzname(datetime.datetime(2001, 1, 1))
Out[18]: 'GMT'
In [19]: tz.tzname(datetime.datetime(2001, 6, 1))
Out[19]: 'BST' so you can't use it as an identifier. [edit] and it doesn't match the timezone! Originally I used the tz object itself (since different instances of the same zone are equal) but that doesn't serialize (so |
ok...so its mainly test failures and serialization. e.g. you need to know a unique 'name' for the zone, rigth? yeah I simply would not allow them on construction, I know its hacky, but rejecting if and I would file a bug-report with dateutil (as this works on linux, so odd that it doesn't on windows) and works in py2. |
Exactly. We need to be able to identify the zone in a few other places as well:
Rejecting If we modify Agree re filing a bug - probably with the maintainer of the tarfile package as well. I'll try and sort this out this afternoon. I'll put the changes in this PR but won't squash them so we can see the two parts - let me know if that suits you. |
yep sounds fine |
also, their seems to be a |
Avoid constructing dateutil timezones explicitly because of filename issue. Use dateutil.tz.tzutc() to construct UTC timezone instead of dateutil.tz.gettz('UTC') Update docs to reflect this. Tidy up examples in the timezones section of the docs.
I think that class is for windows registry timezones - but it doesn't seem to be used when you call Anyway, I think I've got this working. I've tested on python 2.7 and 3.3 on windows and 2.7 on linux. The timezone related test failures are gone (with the exception of one failure which I can't replicate manually but which wasn't failing for you and doesn't seem to be tz-related). The changes are:
|
tz=dateutil.tz.tzutc()) | ||
rng_utc.tz | ||
|
||
Note that the ``UTC`` timezone is a special case in ``dateutil`` and should be constructed explicitly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you test interactions somewhere where you say are creating a DatetimeIndex with a Timestamp of 'UTC', but using a dateutil_tz_utc in another element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - although they would be separate objects and we would be converting between them with the same code as converting between other timezones.
I'll add something like this:
ts = Timestamp('2001-01-05 11:56', tz='dateutil/UTC')
ts.tz_convert(dateutil.tz.tzutc())
# check that the time hasn't changed.
assert (ts.hour, ts.minute) == (11, 56)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, maybe also need some kind of raising if you try to mix pytz/dateutil in a creation, e.g.
DatetimeIndex([Timestamp('20130101',tz='Europe/London'),Timestamp('20130102',tz='dateutil/Europe/London')])
should raise loudly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does already. In fact, pandas already raises for mixed timezones in pytz e.g.
DatetimeIndex([Timestamp('20130101', tz='Europe/London'), Timestamp('20130101', tz='Europe/Paris')])
I've just added the test for different versions of utc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I mean is a supplemental message that explicity compares pytz and dateutil mixed (not sure that this is easy to catch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. There doesn't seem to be a single place where we could put that check. At least, by the time you get to the exception that's raised, you can't tell if you have both dateutil and pytz zones.
In principle though, I think it's something we can check because all pytz
timezones except UTC have a common base class and you can assume anything else is dateutil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm.....well the check can be one when tz's are not all the same (e.g. at that point you can do a more detailed comparison to see if they are different and at least one is pytz and one dateutil)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this as a separate feature/PR so that we can get all of the tests passing on windows?
I think it's a good idea to have a better exception message and I'm happy to look into it but it feels like a separate issue from the test failures on windows.
Looking a bit deeper into the code, there are at least two places where an exception would currently be raised, tslib.array_to_datetime
and tslib.array_to_datetime
- both called via tseries.tools.to_datetime
. The test could go in to_datetime
- if there is an exception, check timezone types - but mismatched timezones wouldn't be the only reason for an exception there so we could mask a different problem.
@dbew yes....let's get passing...then go from them to make even more robust! if you are satisfied that works on windows then ping me when passing travis (we can merge and then address any follow up failing tests). |
@dbew ok, ready to go on this? |
Yes, travis all fine, so let's go. |
TST7337: Fix test failures on windows.
thanks! |
Just a couple of test failures remain....these you know about?
|
Argh! I thought I'd got them all. Running all of the tests was crashing On Monday, 9 June 2014, jreback [email protected] wrote:
|
np....thanks for the followups! |
So, the The from dateutil.tz import gettz
from datetime import datetime
u = gettz('UTC')
u.utcoffset(datetime(2001, 1, 1)) raises an exception on Windows. The result of Given that, I think it makes sense to skip the test on windows until the bug in I've run the tests in I've pushed the changes onto a new branch in the @AHLMSS repository here, pull request here |
closes #7337
Use dateutil.tz.tzutc() to construct UTC timezone instead of dateutil.tz.gettz('UTC')
Update docs to reflect this.
Tidy up examples in the timezones section of the docs.