Pick and choose numpy version based on Python 2 or 3. #28511

didip · 2019-09-18T21:06:07Z

closes Cannot install pandas 0.24.2 from source with Python 2.7 into an environment without NumPy #27435
tests added / passed
passes black pandas (not applicable, black setup.py changes too many quotes, polluting the diff).
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

jorisvandenbossche · 2019-09-19T08:10:13Z

Thanks for the PR!

Hmm, there are a couple of failures. They are not directly related to your change, but to the fact that we haven't run those tests (at the state of 0.24.2) for some time, and fixes have since been made for changed dependencies.
See e.g. #26725, #26726

There also seem to be some actual failures with numpy 1.17, so one option could be to restrict numpy to lower versions for Python 3 as well?

didip · 2019-09-19T17:43:59Z

Thanks for the quick response.

Is there any documentation of the maximum dependency versions pandas should allow on Python 3?

TomAugspurger · 2019-09-19T18:37:10Z

I haven't looked at the failures, but are they actual, noticeable bugs, or just failures in the test? If they're real issues then we can just have pandas 0.24.x require NumPy <1.17 across the board.

jorisvandenbossche · 2019-09-19T18:52:43Z

Some of them are certainly test-only failures, but there also seem to be actual bugs (although someone needs to investigate in more detail to be sure):

______________________ TestRangeIndex.test_numpy_argsort _______________________
[gw1] linux -- Python 3.7.4 /home/vsts/miniconda3/envs/pandas-dev/bin/python

self = <pandas.tests.indexes.test_range.TestRangeIndex object at 0x7ff094eacd50>

    def test_numpy_argsort(self):
        for k, ind in self.indices.items():
>           result = np.argsort(ind)

pandas/tests/indexes/common.py:320: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
<__array_function__ internals>:6: in argsort
    ???
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py:1089: in argsort
    return _wrapfunc(a, 'argsort', axis=axis, kind=kind, order=order)
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py:61: in _wrapfunc
    return bound(*args, **kwds)
pandas/core/indexes/range.py:323: in argsort
    nv.validate_argsort(args, kwargs)
pandas/compat/numpy/function.py:56: in __call__
    self.defaults)
pandas/util/_validators.py:218: in validate_args_and_kwargs
    validate_kwargs(fname, kwargs, compat_args)
pandas/util/_validators.py:157: in validate_kwargs
    _check_for_default_values(fname, kwds, compat_args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

fname = 'argsort', arg_val_dict = {'axis': -1, 'kind': None, 'order': None}
compat_args = OrderedDict([('axis', -1), ('kind', 'quicksort'), ('order', None)])

    def _check_for_default_values(fname, arg_val_dict, compat_args):
        """
        Check that the keys in `arg_val_dict` are mapped to their
        default values as specified in `compat_args`.
    
        Note that this function is to be called only when it has been
        checked that arg_val_dict.keys() is a subset of compat_args
    
        """
        for key in arg_val_dict:
            # try checking equality directly with '=' operator,
            # as comparison may have been overridden for the left
            # hand object
            try:
                v1 = arg_val_dict[key]
                v2 = compat_args[key]
    
                # check for None-ness otherwise we could end up
                # comparing a numpy array vs None
                if (v1 is not None and v2 is None) or \
                   (v1 is None and v2 is not None):
                    match = False
                else:
                    match = (v1 == v2)
    
                if not is_bool(match):
                    raise ValueError("'match' is not a boolean")
    
            # could not compare them directly, so try comparison
            # using the 'is' operator
            except ValueError:
                match = (arg_val_dict[key] is compat_args[key])
    
            if not match:
                raise ValueError(("the '{arg}' parameter is not "
                                  "supported in the pandas "
                                  "implementation of {fname}()".
>                                 format(fname=fname, arg=key)))
E               ValueError: the 'kind' parameter is not supported in the pandas implementation of argsort()

But it might be more a corner case anyway.
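
The traceback suggests the mismatch is between the default pandas expects for `kind` ('quicksort') and the `None` that NumPy's `argsort` wrapper forwarded in this build. Below is a minimal, self-contained sketch of that default-value check, simplified from the `_check_for_default_values` code shown in the traceback. The name `check_defaults` is hypothetical, and the `is_bool` and `is`-fallback branches are omitted.

```python
from collections import OrderedDict

# Defaults pandas expects for argsort, copied from `compat_args` in the traceback.
COMPAT_ARGS = OrderedDict([("axis", -1), ("kind", "quicksort"), ("order", None)])


def check_defaults(fname, passed, compat_args):
    """Raise ValueError if any passed value differs from its expected default.

    Simplified, hypothetical stand-in for pandas' `_check_for_default_values`.
    """
    for key, value in passed.items():
        expected = compat_args[key]
        # A None on exactly one side counts as a mismatch, as in the original.
        if (value is None) != (expected is None):
            match = False
        else:
            match = value == expected
        if not match:
            raise ValueError(
                "the '{arg}' parameter is not supported in the pandas "
                "implementation of {fname}()".format(arg=key, fname=fname)
            )


# Values forwarded in the failing run: kind=None instead of 'quicksort'.
forwarded = {"axis": -1, "kind": None, "order": None}
try:
    check_defaults("argsort", forwarded, COMPAT_ARGS)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

With `kind='quicksort'` the same call passes silently, which is consistent with the test only failing against the newer NumPy build.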

jorisvandenbossche · 2019-09-19T18:54:12Z

Actually, I was looking at the numpy dev build. Most of the other build failures seem to be problems getting the environment installed (missing packages etc), like the python 2.7 build on Azure:

Solving environment: ...working... failed

ResolvePackageNotFound: 
  - xlsxwriter=0.5.2
  - pytz=2013b
  - python-dateutil=2.5.0
  - xlwt=0.7.5

Not sure how much effort we should put into getting those CI builds running again.

TomAugspurger · 2019-09-19T20:27:06Z

Yeah, numpydev can certainly be ignored. For the rest, perhaps try adding back the `free` channel in each environment that needs it?

codecov · 2019-09-20T06:12:24Z

Codecov Report

Merging #28511 into 0.24.x will decrease coverage by 49.48%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           0.24.x   #28511       +/-   ##
===========================================
- Coverage   92.39%   42.91%   -49.49%     
===========================================
  Files         166      166               
  Lines       52430    52430               
===========================================
- Hits        48443    22499    -25944     
- Misses       3987    29931    +25944

Flag	Coverage Δ
#multiple	`?`
#single	`42.91% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/core/frame.py	`35.59% <ø> (-61.28%)`	⬇️
pandas/io/formats/latex.py	`0% <0%> (-100%)`	⬇️
pandas/core/categorical.py	`0% <0%> (-100%)`	⬇️
pandas/io/sas/sas_constants.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/plotting.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/converter.py	`0% <0%> (-100%)`	⬇️
pandas/io/formats/html.py	`0% <0%> (-99.35%)`	⬇️
pandas/core/groupby/categorical.py	`0% <0%> (-95.46%)`	⬇️
pandas/io/sas/sas7bdat.py	`0% <0%> (-91.17%)`	⬇️
pandas/io/sas/sas_xport.py	`0% <0%> (-90.15%)`	⬇️
... and 125 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1ec0472...278f7dc.

datapythonista

Thanks for fixing this, @didip. Can you remove the trailing whitespace that is breaking the CI, and check the styling comment?

datapythonista · 2019-10-07T13:03:52Z

setup.py

+
+numpy_install_rule = 'numpy >= {numpy_ver}'.format(numpy_ver=min_numpy_ver)
+if max_numpy_ver:
+    numpy_install_rule += ', < {numpy_ver}'.format(numpy_ver=max_numpy_ver)


Just a style preference, but what do you think about this less verbose option:

numpy_requires = 'numpy >= {min_numpy_ver}{max_numpy_rule}'.format(
    min_numpy_ver=min_numpy_ver,
    max_numpy_rule=', < {}'.format(max_numpy_ver_py2) if sys.version_info[0] < 3 else '')
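
For comparison, here is a small sketch of how such a requirement string might be assembled before being passed to `install_requires`. The helper name `numpy_requirement` and the version bounds are illustrative assumptions, not the values used in the PR. The only point is that an upper bound is appended on Python 2 and omitted otherwise.

```python
import sys


def numpy_requirement(min_ver, max_ver_py2, major=None):
    """Build a NumPy requirement string, capped only on Python 2.

    `major` defaults to the running interpreter's major version; it is a
    parameter here only so that both branches can be exercised.
    """
    if major is None:
        major = sys.version_info[0]
    rule = "numpy >= {min_ver}".format(min_ver=min_ver)
    if major < 3:
        rule += ", < {max_ver}".format(max_ver=max_ver_py2)
    return rule


# Illustrative bounds only (assumed, not taken from the PR).
py2_rule = numpy_requirement("1.12.0", "1.17", major=2)
py3_rule = numpy_requirement("1.12.0", "1.17", major=3)
```

The resulting string would go into the `install_requires` list in `setup()`. If the cap should apply on Python 3 as well, as suggested earlier in the thread, the `major < 3` condition could simply be dropped.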

datapythonista · 2019-10-07T13:06:24Z

Also, if you can, please merge master into this branch.

TomAugspurger · 2019-10-07T16:38:09Z

I think this is targeting 0.24.x directly, since we don't want these changes on master.

didip · 2019-10-07T16:40:54Z

Agree with @TomAugspurger

TomAugspurger · 2019-10-07T16:42:27Z

We still need the linting issues fixed though.

didip · 2019-10-07T17:19:06Z

Yes, I’ll try to find some time today to address it.

didip · 2019-10-07T21:59:34Z

Where is the CI/CD output that shows failed linting?

This one? https://github.com/pandas-dev/pandas/pull/28511/checks?check_run_id=229464786

datapythonista · 2019-10-07T22:04:02Z

Yes, but the build is too old and is no longer available. You can run ./ci/code_checks.sh and it should report where the trailing spaces are.
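
If the CI log is gone, trailing whitespace can also be located with a few lines of Python. This is a stand-alone sketch, not what `./ci/code_checks.sh` does internally, and the function name `find_trailing_whitespace` is hypothetical.

```python
def find_trailing_whitespace(text):
    """Return 1-based line numbers whose content ends in a space or tab."""
    offenders = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # A line is flagged when stripping trailing spaces/tabs changes it.
        if line != line.rstrip(" \t"):
            offenders.append(lineno)
    return offenders


# Small in-memory sample; lines 2 and 3 end in whitespace.
sample = "import sys\nmin_numpy_ver = '1.12.0'  \n\t\nprint(min_numpy_ver)\n"
offending_lines = find_trailing_whitespace(sample)
```

To check a real file, the contents of `setup.py` could be read and passed in place of `sample`.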

didip · 2019-10-08T06:19:17Z

There are tons of lint problems when running ./ci/code_checks.sh but I don't see one that pertains to setup.py.

datapythonista · 2019-10-12T20:43:06Z

Not quite sure what the problem with the docstrings is, but you've also got this error in the CI: "pytables needs a release beyong 3.4.4 to support numpy 1.16x".

Do you mind having a look, @didip? We can't merge this until the CI is green.

WillAyd · 2019-10-22T01:23:37Z

At this point we don't have any further plans for a 0.24.x release, so unfortunately I don't think this is something we will address (if anyone disagrees, feel free to reopen).

jorisvandenbossche · 2019-10-22T06:23:59Z

As far as I recall, we said to give the opportunity to interested contributors, so let's wait a bit more to see if @didip has time to update this.

jbrockmendel · 2019-10-28T19:06:05Z

@didip it looks like some xfailed pytables tests need to be updated for the changes in this PR to work. Can you take a look and update?

didip · 2019-10-29T15:28:55Z

Upgraded pytables to 3.5.2. sh test_fast.sh now actually runs end-to-end with 20 failures.

didip · 2019-10-31T02:21:47Z

sh test.sh ran until the end with 45 failures.

Anything else I need to do? Is this good enough to merge?

jbrockmendel · 2019-10-31T03:23:18Z

sh test.sh ran until the end with 45 failures. Anything else I need to do? Is this good enough to merge?

You'll need to get the tests passing. This can include xfailing if appropriate.

didip · 2019-10-31T04:21:53Z

@jbrockmendel They look unrelated though. Example:

XFAIL pandas/tests/test_strings.py::TestStringMethods::test_api_per_method[empty0-rsplit-Index-category]
  reason: Raising too restrictively; solved by GH 23167

jbrockmendel · 2019-10-31T15:46:43Z

They look unrelated though

Looking at azure, some of them seem to relate to numpy version and error messages.

For the ones that are XPASS: if this PR actually fixes those tests, then you can un-xfail them.

didip · 2019-11-03T14:21:11Z

Ran sh ci/code_checks.sh. All of its failures are doctests.

All of them fail because the columns in the expected and actual output come out in a different order. Examples:

4253 Examples
4254 --------
4255 >>> df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
4256 ...                    'mask': ['red', 'purple'],
4257 ...                    'weapon': ['sai', 'bo staff']})
4258 >>> df.to_csv(index=False)
Expected:
    'name,mask,weapon\nRaphael,red,sai\nDonatello,purple,bo staff\n'
Got:
    'mask,name,weapon\nred,Raphael,sai\npurple,Donatello,bo staff\n'

or another one:

2665         >>> d2 = {'name': ['Alice', 'Bob'],
2666         ...       'score': [9.5, 8],
2667         ...       'employed': [False, True],
2668         ...       'kids': [0, 0]}
2669         >>> df2 = pd.DataFrame(data=d2)
2670         >>> df2
Differences (unified diff with -expected +actual):
    @@ -1,3 +1,3 @@
    -    name  score  employed  kids
    -0  Alice    9.5     False     0
    -1    Bob    8.0      True     0
    +   employed  kids   name  score
    +0     False     0  Alice    9.5
    +1      True     0    Bob    8.0
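
Both diffs show the same pattern: the actual columns come out alphabetically sorted rather than in the order written in the dict literal. One plausible cause, assumed here rather than confirmed in the thread, is that the doctests ran on an interpreter older than Python 3.6, where plain dicts do not guarantee insertion order and pandas falls back to sorted keys when building a DataFrame from a dict. The stdlib-only sketch below reproduces just the two orderings for the first example.

```python
from collections import OrderedDict

# Same keys as the first failing doctest, in the order they were written.
data = OrderedDict([
    ("name", ["Raphael", "Donatello"]),
    ("mask", ["red", "purple"]),
    ("weapon", ["sai", "bo staff"]),
])

expected_columns = list(data)   # insertion order, as the docstring expects
actual_columns = sorted(data)   # sorted keys, matching the "Got:" output
```

If that assumption holds, these doctest failures come from the Python version of the environment rather than from the `setup.py` change.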

jreback

@didip happy to take this, but it needs to pass the CI (and I know you tried to fix it).

Ping if you can get this to work.

jreback · 2019-12-02T01:00:27Z

requirements-dev.txt

@@ -31,7 +31,7 @@ nbsphinx
 numexpr>=2.6.8
 openpyxl
 pyarrow>=0.9.0
-tables>=3.4.2
+tables>=3.5.2


I don't think this would work, because it would require a newer numpy than is allowed in this version (1.13).

alimcmaster1 · 2019-12-23T04:15:55Z

@didip - thanks for the effort so far - do you still want to work on this?

didip · 2019-12-23T18:23:45Z

To be honest, I am running out of capacity to continue this work. I am completely unfamiliar with the build system to properly debug further problems.

jreback · 2020-01-01T21:00:01Z

closing, but still happy for someone to push a PR to close this issue.

Pick and choose numpy version based on Python 2 or 3.

765cd7d

jorisvandenbossche added this to the 0.24.3 milestone Sep 19, 2019

restore_free_channel

54694be

jorisvandenbossche mentioned this pull request Sep 20, 2019

Release manager needed for pandas 0.24.3 python-sprints/pandas-mentoring#166

Open

datapythonista added the Build and Dependencies labels Oct 7, 2019

datapythonista reviewed Oct 7, 2019

whitespace.

8be33c1

WillAyd closed this Oct 22, 2019

jorisvandenbossche reopened this Oct 22, 2019

jorisvandenbossche mentioned this pull request Oct 22, 2019

Cannot install pandas 0.24.2 from source with Python 2.7 into an environment without NumPy #27435

Closed

Upgrade pytables to 3.5.2 or greater. sh test_fast.sh now runs end-to-end with 20 failures.

278f7dc

jreback requested changes Dec 2, 2019

jreback closed this Jan 1, 2020
