Skip to content

CI: Fix npdev build post Cython annotation change #47670

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 13, 2022

Conversation

mroeschke
Copy link
Member

@mroeschke mroeschke commented Jul 11, 2022

Appears the new build failure was due to cython/cython#4885

Major changes:

  1. Period._from_ordinal is annotated def _from_ordinal(oridinal: int, freq) which now strictly only accepts Python ints. Need to change to def _from_ordinal(oridinal: int64_t, freq)

@mroeschke
Copy link
Member Author

cc @jbrockmendel could use your insight into another failure.

e.g. np.repeat(EA, 2) is returning an ndarray instead of an EA. What code path would that go through?

@jbrockmendel
Copy link
Member

Period._from_ordinal is annotated def _from_ordinal(oridinal: int, freq), so ordinal needs to always be a Python int and not a np.int64

We end up casting to int64_t, so this casting feels wasteful. Could we just make the annotation int64_t?

@jbrockmendel
Copy link
Member

e.g. np.repeat(EA, 2) is returning an ndarray instead of an EA. What code path would that go through?

np.repeat uses array_function_dispatch, which I think means it checks for __array_function__, which we don't implement anywhere. Without __array_function__, it isn't that surprising that it returns an ndarray. If anything I'm surprised it used to (?).

@mroeschke
Copy link
Member Author

Found it:

We define def repeat(self, repeats, axis: int = 0) which np.repeat dispatches to

https://github.com/numpy/numpy/blob/54c52f13713f3d21795926ca4dbb27e16fada171/numpy/core/fromnumeric.py#L52-L66=

Since axis can be a np.int64 it hit's numpy's except TypeError block which probably caused the different behavior.

Looks like we need to be careful with annotating cython code with Python types as they will be strict now and raise a TypeError otherwise.

@jbrockmendel
Copy link
Member

Since axis can be a np.int64 it hit's numpy's except TypeError block which probably caused the different behavior

Sure its int64 and not intp?

@mroeschke
Copy link
Member Author

Since axis can be a np.int64 it hit's numpy's except TypeError block which probably caused the different behavior

Sure its int64 and not intp?

Hmm good point. intp is probably safer here

@mroeschke mroeschke added CI Continuous Integration Compat pandas objects compatability with Numpy or Python functions labels Jul 12, 2022
@mroeschke mroeschke added this to the 1.4.4 milestone Jul 12, 2022
@mroeschke
Copy link
Member Author

I decided to remove the : int annotation for repeat(..., axis: int). intp_t was failing in cython 0.29.x. We can revist the tying in a followup

@smarie
Copy link
Contributor

smarie commented Jul 13, 2022

Since axis can be a np.int64 it hit's numpy's except TypeError block which probably caused the different behavior.

Thanks @mroeschke for finding this ! I can stop investigating in #46405 :)

@mroeschke
Copy link
Member Author

Np @smarie, sorry I didn't realized you were investigating it in tandem.

Does it look okay to you @jbrockmendel?

@jbrockmendel
Copy link
Member

no objections

@@ -157,7 +157,7 @@ cdef class NDArrayBacked:
return self._from_backing_data(res_values)

# TODO: pass NPY_MAXDIMS equiv to axis=None?
def repeat(self, repeats, axis: int = 0):
def repeat(self, repeats, axis = 0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if we make this int | np.integer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to have worked. Changed

@mroeschke mroeschke merged commit 13e859f into pandas-dev:main Jul 13, 2022
@mroeschke mroeschke deleted the ci/np_dev_cython_fix branch July 13, 2022 20:58
@mroeschke
Copy link
Member Author

Merging to get the CI to green. Can followup if needed

simonjayhawkins pushed a commit that referenced this pull request Jul 14, 2022
…nnotation change) (#47709)

Backport PR #47670: CI: Fix npdev build post Cython annotation change

Co-authored-by: Matthew Roeschke <[email protected]>
mroeschke pushed a commit that referenced this pull request Sep 8, 2022
…specific directive is used (#46405)

* Added test representative of #46319. Should fail on CI

* Added a gha worker with non utf 8 zh_CN encoding

* Attempt to fix the encoding so that locale works

* Added the fix, but not using it for now, until CI is able to reproduce the issue.

* Crazy idea: maybe simply removing the .utf8 modifier will use the right encoding !

* Hopefully fixing the locale not available error

* Now simply generating the locale, not updating the ubuntu one

* Trying to install the locale without enabling it

* Stupid mistake

* Testing the optional locale generator condition

* Put back all runners

* Added whatsnew

* Now using the fix

* As per code review: moved locale-switching fixture `overridden_locale` to conftest

* Flake8

* Added comments on the runner

* Added a non-utf8 locale in the `it_IT` runner. Added the zh_CN.utf8 locale in the tests

* Improved readability of fixture `overridden_locale` as per code review

* Added two comments on default encoding

* Fixed #46319 by adding a new `char_to_string_locale` function in the `tslibs.util` module, able to decode char* using the current locale.

* As per code review: modified the test to contain non-utf8 chars. Fixed the resulting issue.

* Split the test in two for clarity

* Fixed test and flake8 error.

* Updated whatsnew to ref #46468 . Updated test name

* Removing wrong whatsnew bullet

* Nitpick on whatsnew as per code review

* Fixed build error rst directive

* Names incorrectly reverted in last merge commit

* Fixed test_localization so that #46595 can be demonstrated on windows targets (even if today these do not run on windows targets, see #46597)

* Fixed `tm.set_locale` context manager, it could error and leak when category LC_ALL was used. Fixed #46595

* Removed the fixture as per code review, and added corresponding parametrization in tests.

* Dummy mod to trigger CI again

* reverted dummy mod

* Attempt to fix the remaining error on the numpy worker

* Fixed issue in `_from_ordinal`

* Added asserts to try to understand

* Reverted debugging asserts and applied fix for numpy repeat from #47670.

* Fixed the last issue on numpy dev: a TypeError message had changed

* Code review: Removed `EXTRA_LOC`

* Code review: removed commented line

* Code review: reverted out of scope change

* Code review: reverted out of scope change

* Fixed unused import

* Fixed revert mistake

* Moved whatsnew to 1.6.0

* Update pandas/tests/io/parser/test_quoting.py

Co-authored-by: Sylvain MARIE <[email protected]>
noatamir pushed a commit to noatamir/pandas that referenced this pull request Nov 9, 2022
…specific directive is used (pandas-dev#46405)

* Added test representative of pandas-dev#46319. Should fail on CI

* Added a gha worker with non utf 8 zh_CN encoding

* Attempt to fix the encoding so that locale works

* Added the fix, but not using it for now, until CI is able to reproduce the issue.

* Crazy idea: maybe simply removing the .utf8 modifier will use the right encoding !

* Hopefully fixing the locale not available error

* Now simply generating the locale, not updating the ubuntu one

* Trying to install the locale without enabling it

* Stupid mistake

* Testing the optional locale generator condition

* Put back all runners

* Added whatsnew

* Now using the fix

* As per code review: moved locale-switching fixture `overridden_locale` to conftest

* Flake8

* Added comments on the runner

* Added a non-utf8 locale in the `it_IT` runner. Added the zh_CN.utf8 locale in the tests

* Improved readability of fixture `overridden_locale` as per code review

* Added two comments on default encoding

* Fixed pandas-dev#46319 by adding a new `char_to_string_locale` function in the `tslibs.util` module, able to decode char* using the current locale.

* As per code review: modified the test to contain non-utf8 chars. Fixed the resulting issue.

* Split the test in two for clarity

* Fixed test and flake8 error.

* Updated whatsnew to ref pandas-dev#46468 . Updated test name

* Removing wrong whatsnew bullet

* Nitpick on whatsnew as per code review

* Fixed build error rst directive

* Names incorrectly reverted in last merge commit

* Fixed test_localization so that pandas-dev#46595 can be demonstrated on windows targets (even if today these do not run on windows targets, see pandas-dev#46597)

* Fixed `tm.set_locale` context manager, it could error and leak when category LC_ALL was used. Fixed pandas-dev#46595

* Removed the fixture as per code review, and added corresponding parametrization in tests.

* Dummy mod to trigger CI again

* reverted dummy mod

* Attempt to fix the remaining error on the numpy worker

* Fixed issue in `_from_ordinal`

* Added asserts to try to understand

* Reverted debugging asserts and applied fix for numpy repeat from pandas-dev#47670.

* Fixed the last issue on numpy dev: a TypeError message had changed

* Code review: Removed `EXTRA_LOC`

* Code review: removed commented line

* Code review: reverted out of scope change

* Code review: reverted out of scope change

* Fixed unused import

* Fixed revert mistake

* Moved whatsnew to 1.6.0

* Update pandas/tests/io/parser/test_quoting.py

Co-authored-by: Sylvain MARIE <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration Compat pandas objects compatability with Numpy or Python functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants