
hardcoded expectations on endianness in the tests -- feature or a bug? ;) #14830


Closed
yarikoptic opened this issue Dec 8, 2016 · 3 comments · Fixed by #14832

@yarikoptic
Contributor

Code Sample, a copy-pastable example if possible

looking at code of pandas/tests/series/test_datetime_values.py which was introduced in c76ca44 I believe:

        period_index = period_range('20150301', periods=5)
        result = period_index.strftime("%Y/%m/%d")
        expected = np.array(['2015/03/01', '2015/03/02', '2015/03/03',
                             '2015/03/04', '2015/03/05'], dtype='<U10')
        self.assert_numpy_array_equal(result, expected)

Problem description

above (and some other tests... see e.g. https://buildd.debian.org/status/fetch.php?pkg=pandas&arch=s390x&ver=0.19.1-2&stamp=1480348101) lead to failed tests... which sometimes look as informative as

numpy array values are different (0.0 %)
[left]:  [2015/03/01, 2015/03/02, 2015/03/03, 2015/03/04, 2015/03/05]
[right]: [2015/03/01, 2015/03/02, 2015/03/03, 2015/03/04, 2015/03/05]

;-)

but the real reason is different endianness:

(Pdb) l
325
326             period_index = period_range('20150301', periods=5)
327             result = period_index.strftime("%Y/%m/%d")
328             expected = np.array(['2015/03/01', '2015/03/02', '2015/03/03',
329                                  '2015/03/04', '2015/03/05'], dtype='<U10')
330  ->         self.assert_numpy_array_equal(result, expected)
331
332             s = Series([datetime(2013, 1, 1, 2, 32, 59), datetime(2013, 1, 2, 14,
333                                                                   32, 1)])
334             result = s.dt.strftime('%Y-%m-%d %H:%M:%S')
335             expected = Series(["2013-01-01 02:32:59", "2013-01-02 14:32:01"])
(Pdb) p expected
array([u'2015/03/01', u'2015/03/02', u'2015/03/03', u'2015/03/04',
       u'2015/03/05'],
      dtype='<U10')
(Pdb) p result
array([u'2015/03/01', u'2015/03/02', u'2015/03/03', u'2015/03/04',
       u'2015/03/05'],
      dtype='>U10')
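A minimal sketch of why the failure message above looks so uninformative: the values are identical and only the byte-order part of the dtype differs. This uses plain NumPy rather than the pandas test helper, and the variable names are made up for illustration.

```python
import sys

import numpy as np

values = ['2015/03/01', '2015/03/02', '2015/03/03']

# Same strings, stored with explicitly little- and big-endian code units.
little = np.array(values, dtype='<U10')
big = np.array(values, dtype='>U10')

# The decoded Python strings are the same on any platform ...
same_values = little.tolist() == big.tolist()

# ... but the dtypes differ, which is what a strict dtype check trips over.
same_dtype = little.dtype == big.dtype

# Only one of the two explicit byte orders matches the machine's native one.
native = np.dtype('U10')
little_is_native = little.dtype == native

print(sys.byteorder, same_values, same_dtype, little_is_native)
```

On a big-endian machine such as s390x, a hard-coded `'<U10'` expectation is the non-native one, so a dtype-sensitive comparison fails even though every element matches.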

tentative fix is easy -- remove '<' from the dtype definition. But I wondered if that is desired -- does pandas intend to enforce little-endianness or not? If it doesn't, then a proper fix, one that keeps the same check for consistent endianness, would be to prefix with = (native byte order) rather than <.
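A rough sketch of the `=` variant, assuming the result of `strftime` comes back in native byte order (as the Pdb output above suggests). `make_expected` is a hypothetical helper for illustration, not something in pandas, and the `result` array is only a stand-in for the real `strftime` output.

```python
import numpy as np


def make_expected(values, width=10):
    # Hypothetical helper: '=' requests the platform's native byte order,
    # so the expectation is portable across little- and big-endian hosts.
    return np.array(values, dtype='=U%d' % width)


values = ['2015/03/01', '2015/03/02', '2015/03/03']

# Stand-in for period_index.strftime(...): NumPy builds native-order
# unicode arrays by default.
result = np.array(values)
expected = make_expected(values)

# The strict dtype check still holds, on either kind of machine.
native_ok = result.dtype == expected.dtype

# A byte-swapped copy keeps the same values but is still rejected by the
# dtype check, so the "consistent endianness" guarantee is not lost.
swapped = expected.astype(expected.dtype.newbyteorder('S'))
swapped_same_values = swapped.tolist() == expected.tolist()
swapped_same_dtype = swapped.dtype == expected.dtype
```

Dropping the prefix altogether (`dtype='U10'`) gives the same native dtype as `'=U10'`; the explicit `=` just makes the intent visible in the test.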

Please advise!

@chris-b1
Contributor

chris-b1 commented Dec 8, 2016

I think this specific case is a bug in the test - just glancing at .strftime it shouldn't make any assumptions about endianness.

Some prior discussion here - #11282 (comment) - there shouldn't really be places where little-endian is enforced/assumed, but at the same time we don't test a big-endian platform, so there probably are places where it has snuck in.

@yarikoptic
Contributor Author

Thanks! Meanwhile I have initiated a PR with possible "fixes"

@jreback
Contributor

jreback commented Dec 8, 2016

the timestamp parsing code is in C
not sure if it is endianness-aware

@jreback jreback added the Testing pandas testing functions or related to the test suite label Dec 9, 2016
@jreback jreback added this to the 0.20.0 milestone Dec 9, 2016