Skip to content

Pick and choose numpy version based on Python 2 or 3. #28511

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

didip
Copy link

@didip didip commented Sep 18, 2019

cc @jorisvandenbossche

@jorisvandenbossche jorisvandenbossche added this to the 0.24.3 milestone Sep 19, 2019
@jorisvandenbossche
Copy link
Member

Thanks for the PR!

Hmm, there a couple of failures. Not directly related to your change, but related to the fact that we didn't run those tests (at the state of 0.24.2) for some time, and fixes have been made for changed dependencies.
See eg #26725, #26726

There also seem to be some actual failures with numpy 1.17, so an option could be to restrict it to lower versions also for python 3 ?

@didip
Copy link
Author

didip commented Sep 19, 2019

Thanks for the quick response.

Do you have documented versions of what the maximum versions pandas should have on Python 3?

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Sep 19, 2019 via email

@jorisvandenbossche
Copy link
Member

Part of them are certainly test failures, but it seems there are also actual bugs (although someone needs to investigate it more in detail to be sure):

______________________ TestRangeIndex.test_numpy_argsort _______________________
[gw1] linux -- Python 3.7.4 /home/vsts/miniconda3/envs/pandas-dev/bin/python

self = <pandas.tests.indexes.test_range.TestRangeIndex object at 0x7ff094eacd50>

    def test_numpy_argsort(self):
        for k, ind in self.indices.items():
>           result = np.argsort(ind)

pandas/tests/indexes/common.py:320: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
<__array_function__ internals>:6: in argsort
    ???
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py:1089: in argsort
    return _wrapfunc(a, 'argsort', axis=axis, kind=kind, order=order)
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py:61: in _wrapfunc
    return bound(*args, **kwds)
pandas/core/indexes/range.py:323: in argsort
    nv.validate_argsort(args, kwargs)
pandas/compat/numpy/function.py:56: in __call__
    self.defaults)
pandas/util/_validators.py:218: in validate_args_and_kwargs
    validate_kwargs(fname, kwargs, compat_args)
pandas/util/_validators.py:157: in validate_kwargs
    _check_for_default_values(fname, kwds, compat_args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

fname = 'argsort', arg_val_dict = {'axis': -1, 'kind': None, 'order': None}
compat_args = OrderedDict([('axis', -1), ('kind', 'quicksort'), ('order', None)])

    def _check_for_default_values(fname, arg_val_dict, compat_args):
        """
        Check that the keys in `arg_val_dict` are mapped to their
        default values as specified in `compat_args`.
    
        Note that this function is to be called only when it has been
        checked that arg_val_dict.keys() is a subset of compat_args
    
        """
        for key in arg_val_dict:
            # try checking equality directly with '=' operator,
            # as comparison may have been overridden for the left
            # hand object
            try:
                v1 = arg_val_dict[key]
                v2 = compat_args[key]
    
                # check for None-ness otherwise we could end up
                # comparing a numpy array vs None
                if (v1 is not None and v2 is None) or \
                   (v1 is None and v2 is not None):
                    match = False
                else:
                    match = (v1 == v2)
    
                if not is_bool(match):
                    raise ValueError("'match' is not a boolean")
    
            # could not compare them directly, so try comparison
            # using the 'is' operator
            except ValueError:
                match = (arg_val_dict[key] is compat_args[key])
    
            if not match:
                raise ValueError(("the '{arg}' parameter is not "
                                  "supported in the pandas "
                                  "implementation of {fname}()".
>                                 format(fname=fname, arg=key)))
E               ValueError: the 'kind' parameter is not supported in the pandas implementation of argsort()

But it might be more a corner case anyway.

@jorisvandenbossche
Copy link
Member

Actually, I was looking at the numpy dev build. Most of the other build failures seem to be problems getting the environment installed (missing packages etc), like the python 2.7 build on Azure:

Solving environment: ...working... failed

ResolvePackageNotFound: 
  - xlsxwriter=0.5.2
  - pytz=2013b
  - python-dateutil=2.5.0
  - xlwt=0.7.5

Note sure how much effort we should do to get those CIs running again ..

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Sep 19, 2019 via email

@codecov
Copy link

codecov bot commented Sep 20, 2019

Codecov Report

Merging #28511 into 0.24.x will decrease coverage by 49.48%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           0.24.x   #28511       +/-   ##
===========================================
- Coverage   92.39%   42.91%   -49.49%     
===========================================
  Files         166      166               
  Lines       52430    52430               
===========================================
- Hits        48443    22499    -25944     
- Misses       3987    29931    +25944
Flag Coverage Δ
#multiple ?
#single 42.91% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 35.59% <ø> (-61.28%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/core/categorical.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-99.35%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-95.46%) ⬇️
pandas/io/sas/sas7bdat.py 0% <0%> (-91.17%) ⬇️
pandas/io/sas/sas_xport.py 0% <0%> (-90.15%) ⬇️
... and 125 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ec0472...278f7dc. Read the comment docs.

@datapythonista datapythonista added Build Library building on various platforms Dependencies Required and optional dependencies labels Oct 7, 2019
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixing this @didip. Can you remove the tailing whitespace that is breaking the CI, and check the styling comment?


numpy_install_rule = 'numpy >= {numpy_ver}'.format(numpy_ver=min_numpy_ver)
if max_numpy_ver:
numpy_install_rule += ', < {numpy_ver}'.format(numpy_ver=max_numpy_ver)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a style preference, but what fo you think about the less verbose option:

numpy_requires = 'numpy >= {min_numpy_ver}{max_numpy_rule}'.format(
    min_numpy_ver=min_numpy_ver,
    max_numpy_rule=', < {}'.format(max_numpy_ver_py2) if sys.version_info[0] < 3 else '')

@datapythonista
Copy link
Member

Also, if you can please merge master into this branch.

@TomAugspurger
Copy link
Contributor

I think this is targeting 0.24.x directly, since we don't want these changes on master.

@didip
Copy link
Author

didip commented Oct 7, 2019

Agree with @TomAugspurger

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Oct 7, 2019 via email

@didip
Copy link
Author

didip commented Oct 7, 2019

Yes, I’ll try to find sometime today to address it.

@didip
Copy link
Author

didip commented Oct 7, 2019

Where is the CI/CD output that shows failed linting?

This one? https://github.com/pandas-dev/pandas/pull/28511/checks?check_run_id=229464786

@datapythonista
Copy link
Member

yes, but the build is too old and is not available anymore. You can run ./ci/code_checks.sh and should report where are the tailing spaces.

@didip
Copy link
Author

didip commented Oct 8, 2019

There are tons of lint problems when running ./ci/code_checks.sh but I don't see one that pertains to setup.py.

@datapythonista
Copy link
Member

Not quite sure what's the problem with the docstrings, but you've also got this error in the CI pytables needs a release beyong 3.4.4 to support numpy 1.16x.

Do you mind having a look @didip. We can't merge this until the CI is green.

@WillAyd
Copy link
Member

WillAyd commented Oct 22, 2019

At this point we don't have any further plans for a 24.x release, so unfortunately I don't think this is something we will address (if anyone disagrees feel free to reopen)

@jorisvandenbossche
Copy link
Member

As far as I recall, we said to give the opportunity to interested contributors, so let's wait a bit more to see if @didip has time to update this.

@jbrockmendel
Copy link
Member

@didip it looks like some xfailed pytables tests need to be updated for the changes in this PR to work. can you take a look and update?

@didip
Copy link
Author

didip commented Oct 29, 2019

Upgraded pytables to 3.5.2. sh test_fast.sh now actually runs end-to-end with 20 failures.

@didip
Copy link
Author

didip commented Oct 31, 2019

sh test.sh ran until the end with 45 failures.

Anything else I need to do? Is this good enough to merge?

@jbrockmendel
Copy link
Member

sh test.sh ran until the end with 45 failures. Anything else I need to do? Is this good enough to merge?

You'll need to get the tests passing. This can include xfailing if appropriate.

@didip
Copy link
Author

didip commented Oct 31, 2019

@jbrockmendel They look unrelated though. Example:

XFAIL pandas/tests/test_strings.py::TestStringMethods::test_api_per_method[empty0-rsplit-Index-category]
  reason: Raising too restrictively; solved by GH 23167

@jbrockmendel
Copy link
Member

They look unrelated though

Looking at azure, some of them seem to relate to numpy version and error messages.

For the ones that are XPASS if this PR actually fixes those tests then you can un-xfail them.

@didip
Copy link
Author

didip commented Nov 3, 2019

Ran sh ci/code_checks.sh. All of its failures are DocTest.

All of them failed because the expectation and results are in the wrong order. Examples:

4253 Examples
4254 --------
4255 >>> df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
4256 ...                    'mask': ['red', 'purple'],
4257 ...                    'weapon': ['sai', 'bo staff']})
4258 >>> df.to_csv(index=False)
Expected:
    'name,mask,weapon\nRaphael,red,sai\nDonatello,purple,bo staff\n'
Got:
    'mask,name,weapon\nred,Raphael,sai\npurple,Donatello,bo staff\n'

or another one:

2665         >>> d2 = {'name': ['Alice', 'Bob'],
2666         ...       'score': [9.5, 8],
2667         ...       'employed': [False, True],
2668         ...       'kids': [0, 0]}
2669         >>> df2 = pd.DataFrame(data=d2)
2670         >>> df2
Differences (unified diff with -expected +actual):
    @@ -1,3 +1,3 @@
    -    name  score  employed  kids
    -0  Alice    9.5     False     0
    -1    Bob    8.0      True     0
    +   employed  kids   name  score
    +0     False     0  Alice    9.5
    +1      True     0    Bob    8.0

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@didip happy to take this, but needs to pass the CI (and I know you tried to fix)

ping if you can get this too work.

@@ -31,7 +31,7 @@ nbsphinx
numexpr>=2.6.8
openpyxl
pyarrow>=0.9.0
tables>=3.4.2
tables>=3.5.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this would work because it would require a newer numpy than is allowed in this version (1.13)

@alimcmaster1
Copy link
Member

@didip - thanks for the effort so far - do you still want to work on this?

@didip
Copy link
Author

didip commented Dec 23, 2019

To be honest, I am running out of capacity to continue this work. I am completely unfamiliar with the build system to properly debug further problems.

@jreback
Copy link
Contributor

jreback commented Jan 1, 2020

closing, but still happy for someone to push a PR to close this issue.

@jreback jreback closed this Jan 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms Dependencies Required and optional dependencies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants