update submodules to latest versions #73

mattip · 2020-02-17T10:03:27Z

test latest version of tools before adding new platforms, also check openblas is properly installed

mattip · 2020-02-17T16:14:26Z

The 32 bit builds are failing the new test_umath_accuracy.py with 8 ULP instead of 4. I wonder if this is a problem with gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 in the manylinux2010 image. The azure builds on numpy/numpy don't report which gcc is in use, but they use i386/ubuntu:bionic and then download gcc7 as well. Since the azure build neither skips nor xfails the test, I assume it passes but maybe a more verbose log would help.

I can try to reproduce the failure locally using the manylinux2010 32-bit image

charris · 2020-02-17T16:28:25Z

IIRC, the tests are run in a docker image. I wonder if the correct image is being used?

mattip · 2020-02-17T16:29:37Z

@r-devulap do you have an idea why the log test is returning more than 8 ULP difference on a travis docker run ~~with the manylinux2010 image~~?

Edit: @charris is correct, this is with the matthew-brett 32 bit image using wheelhouse/numpy-1.19.0.dev0+4edd437-cp36-cp36m-manylinux2010_i686.wh produced by the build-wheel step

mattip · 2020-02-17T16:34:55Z

@charris is that wheel uploaded anywhere, or is it only local to the CI run?

charris · 2020-02-17T16:54:02Z

@mattip Which wheel?

mattip · 2020-02-17T17:07:45Z

The wheels from this PR: specifically wheelhouse/numpy-1.19.0.dev0+4edd437-cp36-cp36m-manylinux2010_i686.whl

charris · 2020-02-17T17:25:24Z

No, the wheels generated in PRs are not uploaded, the rackspace access key isn't decoded in PR's. I don't know how to access them otherwise. We can change that as a travis option, I think, but then the rackspace repository would be contaminated. I wonder if we could send wheels to someplace like dropbox?

mattip · 2020-02-17T17:26:53Z

Not a big deal, I can build it locally since both the build and the test are scripted runs.

mattip · 2020-02-17T19:35:00Z

The 32-bit docker that fails and the 64-bit that pass have the same __cpu_features__, so they must be using the same loops. Maybe it is some cpu floating point register that is set differently?

pv · 2020-02-17T19:52:20Z

mfpmath=sse vs mfpmath=387?

r-devulap · 2020-02-17T20:03:58Z

The 32 bit builds are failing the new test_umath_accuracy.py with 8 ULP instead of 4.

@mattip the ULP difference is not 8, it is 8.38861e+06 ULP. Something is off (like computing ULP difference between 2 NAN's or something). Can you point me to a docker image I can try running locally to debug?

r-devulap · 2020-02-17T20:05:56Z

I see this in the error message: core/tests/test_numeric.py::TestClip::test_NaT_propagation[arr0-amin0-amax0]

mattip · 2020-02-17T20:46:28Z

The same docker that runs the azure tests successfully fails to properly use the wheel built here. I am now going to try to both build and test with the azure script from numpy/numpy. Note that this repo builds a wheel in the multilinux2010 docker and then tests it in a different one.

mattip · 2020-02-17T20:48:25Z

ULP difference is not 8, it is 8.38861e+06 ULP

Ahh, so maybe it is failing to properly catch the NAN error conditions in these dockers, and somehow Azure does something special?

mattip · 2020-02-17T21:10:30Z

mfpmath=sse vs mfpmath=387

Do we set that at all in the gcc flags, or is there an -march setting that toggles one or the other? In numpy/numpy#15585 I changed the platform.uname().machine: on the Azure build it is x86_64 but on a local build it is i686. So maybe this is a bug in the __cpu_feature__ detection inside a i686 docker.

mattip · 2020-02-17T22:55:38Z

@r-devulap : I can run the manylinux docker like

sudo docker run -it --rm -v/path/to/git/checkout/of/numpy:/build_dir \
quay.io/pypa/manylinux2010_i686  /bin/bash

Once in the docker, you will want to cd /build_dir, which will put you at your external numpy checkout. I find it helpful to adduser and su so I am not root but YMMV. The pythons are at /opt/python/cp*, so I usually /opt/python/cp36-cp36m/bin/python -venv /tmp/venv, then I can /tmp/venv/bin/python -mpip install -r test_requirements.txt and /tmp/venv/bin/python runtests.py.

The problem only shows up when compiling inside the docker of the manylinux image.

r-devulap · 2020-02-18T05:31:04Z

Thanks @mattip, I did confirm the failure and it happens for a NAN to NAN comparison, not sure what the difference is between this and other linux Glibc that it fails for this. Will dig in further tomorrow.

mattip · 2020-02-18T09:53:51Z

As always, there is more than one problem.

First, ~~HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS~~ NPY_HAVE_SSE2_INTRINSICS is not defined on this platform, even though the __cpu_features__ say it should be. So the ~~FLOAT_log_avx512f~~ FLOAT_log_fma loop is substituted in (because the feature should be there), the test is not skipped (because the feature should be there), and instead of the avx function, glibc's npy_log is being used (because the feature is not there).

Second, npy_log(-1e-45) is returning NAN, just not the NAN the test asks for. The test wants 0xffc00000, the function is returning 0x7fc00000. Both are NAN, but nulp_diff is very literal:

>>> import numpy as np, numpy.testing
>>> x = np.int32(0xffc00000).view(np.float32)
>>> y = np.int32(0x7fc00000).view(np.float32)
>>> print(x, y)
nan nan
>>> numpy.testing._private.utils.nulp_diff(y, x)
8388608.0

Edit: loop name, proper define

mattip · 2020-02-18T10:43:41Z

For completeness, this is what __cpu_features__ reports, so ~~at least~~ the feature detection disagrees with check_gcc_function_attribute_with_intrinsics:

>>> np.core._multiarray_umath.__cpu_features__['AVX512F']
False
>>> np.core._multiarray_umath.__cpu_features__['FMA3']
True
>>> np.core._multiarray_umath.__cpu_features__['AVX2']
True
>>> np.core._multiarray_umath.__cpu_features__['SSE2']
True

Edit: SSE2 is detected, but NPY_HAVE_SSE2_INTRINSICS is not defined

mattip · 2020-02-18T11:16:25Z

Edited: the first problem is SSE2 is detected via __cpu_features__, but NPY_HAVE_SSE2_INTRINSICS is not defined

mattip · 2020-02-18T11:36:47Z

xref numpy/numpy#15593 for the use of a too-specific NAN

For the dissonance between SSE2 detection methods, ping @seiko2plus

mattip · 2020-02-18T12:47:30Z

The log problem is solved by added -msse2 to the compile flags, otherwise the check for _mm_load_pd fails which fails the NPY_HAVE_SSE2_INTRINSICS define in npy_common.h

seiko2plus · 2020-02-18T12:58:32Z

@mattip, The short answer, NPY_HAVE_SSE2_INTRINSICS check during the build if the compiler has support for SSE2, while __cpu_features__["SSE2"] and NPY_CPU_HAVE(SSE2) check if the running CPU has support for SSE2 during the runtime. we discussed a fix for this situation in numpy/numpy#15558.

mattip · 2020-02-18T13:57:20Z

Now better, but there are still 32-bit failures. We should probably add a manylinux201 i686 run to the regular CI to pick these up.

Is there an accepted way to skip test failures in certain builds? Or perhaps we should disable 32-bit wheel builds until we can work out the kinks.

Edit: add "wheel" to last line

mattip · 2020-02-26T11:41:57Z

Nope, still getting a segfault on linux32.

mattip · 2020-03-02T22:09:11Z

Close reopen to upgrade openblas to 0.3.9 beta

mattip · 2020-03-06T13:59:15Z

.travis.yml

-  exclude:
-      # Exclude the default Python 3.6 build
-      - python: 3.6
+jobs:


lines 19-here update the confusing travis.yml: we don't have any python3.6 jobs to exclude, sudo is no longer needed, matrix should be replaced by jobs.

mattip · 2020-03-06T14:00:53Z

config.sh

@@ -7,44 +7,39 @@ function build_wheel {
    local lib_plat=$PLAT
    if [ -n "$IS_OSX" ]; then
        install_gfortran
+    else
+        # For manylinux2010 builds with manylinux1 openblas builds
+        $use_sudo yum install -y libgfortran-4.4.7


This is needed to use the manylinux2010 openblas builds on the manylinux1 wheel builds. libfortran will be statically linked (uses the *.a file) so the wheels are still valid.

Sounds like the comment is backward. Maybe "Needed for manylinux1 wheels using libraries built for manylinux2010."

mattip · 2020-03-06T15:45:13Z

Builds are green

charris · 2020-03-06T18:59:46Z

config.sh

-    (cd / && $use_sudo tar zxf $tar_path)
+    # Use the same incantation as numpy/tools/travis-before-install.sh
+    python -mpip install urllib3
+    target=$(python numpy/tools/openblas_support.py)


Strange to call a command a target. Maybe explicitly use python on the next line. Where does the file come from?

EDIT: nvm.

This is a strange construction, it could use a comment.

Looks like what it does is download all the libraries into /var/tmp/openblas/. This could become rather inefficient if the libraries become numerous.

Adding a comment and a todo to use site.cfg instead of copying.

charris · 2020-03-06T19:04:33Z

env_vars_32.sh

+# Environment variables for build
+OPENBLAS_VERSION="v0.3.7"
+MACOSX_DEPLOYMENT_TARGET=10.9
+CFLAGS="-msse2 -std=c99 -fno-strict-aliasing"


This is comes from env_vars.sh, correct? I assume the extra bit is -msse2. Is the latest multibuild is updated to use this. Is it documented in the README?

The needed bit is the -msse2. Commenting here, do you have a place in the README you would like to see it documented as well?

Added a line to the README, as well as explaining the OpenBLAS download

charris · 2020-03-06T19:05:46Z

config.sh

 }

 function get_test_cmd {
    local extra_argv=${1:-$EXTRA_ARGV}
    echo "import sys; import numpy; \
        sys.exit(not numpy.test('full', \
-        extra_argv=[${extra_argv}]))"
+        extra_argv=['-vv', ${extra_argv}]))"


For debugging?

charris · 2020-03-07T17:55:15Z

LGTM, I'll put this in when the tests finish. Thanks Matti.

charris · 2020-03-07T18:02:38Z

Is changing .appveyor.yml to download from anaconda cloud next?

mattip · 2020-03-07T18:11:58Z

Updated appyveyor too

charris · 2020-03-07T19:47:30Z

Merged.

charris · 2020-03-07T20:53:05Z

Hmm, two of the 32 bit builds segfaulted in the merge.

mattip · 2020-03-07T21:23:37Z

Grrr. I was hoping that the move to xenial would fix those pesky 32-bit segfaults. Maybe we should revert the -msse2 flag addition, and not use sse loops for the 32-bit builds? It would require adjusting the NumPy tests. Some of the tests assume if the CPU supports sse2 then the compiler will also.

charris · 2020-03-07T21:41:38Z

Grrr. I was hoping that the move to xenial would fix those pesky 32-bit segfaults.

Me too :) SSE2 has been around for 20 years, in fact I think NumPy requires it and we made that decision some time ago. Remind me what the compiler problem is.

mattip · 2020-03-07T21:46:35Z

If we compile with sse2 we get segfaults. If we compile without it, we get test failures on 32-bit since

numpy compiles without support for the hand-written intrinsic loops for exp, reciprical and more because gcc does not use sse2
tests for accuracy of those ufunc loops fail since glibc is less accurate than the hand-written loops.

charris · 2020-03-07T21:49:33Z

Hmm. Is the problem only in manylinux2010 and in manylinux1, or is it something new we are testing for in master.

mattip · 2020-03-08T09:05:13Z

It is something new in master: we now have sse-specific loops for many ufuncs. The latest crop of sse/avx loops added accuracy tests. This pointed out a deficiency in the glibc version of some of the functions (particularily reciprocal), which does not pass the accuracy tests on linux32. That led to the discovery that we need -msse2 on 32-bits to avoid using glibc and use the sse versions, which was added in PR 15600 about 3 weeks ago. Since the current numpy-wheels builds do not use that flag, they do not fail.

I see that on Azure the builds also do not fail. There the tests are run in the multibuild2010_i686 image on top of Ubuntu 18.04, here we run in a Ubuntu 16.04 (xenial) image on top of Ubuntu 18.04 (bionic). Things to try:

removing the msse2 flag
running tests in a multibuild2010_i686 image
move the 32-bit linux and appveyor tests to Azure, like in numpy/numpy

mattip force-pushed the update-subrepos branch from f3e7581 to d58f5b6 Compare February 17, 2020 11:07

This was referenced Feb 17, 2020

Anaconda org weekly upload MacPython/scipy-wheels#69

Merged

BLD: manylinux2010 docker reports machine=i686 numpy/numpy#15585

Merged

mattip force-pushed the update-subrepos branch 2 times, most recently from 6844808 to 7f89729 Compare February 17, 2020 14:04

mattip mentioned this pull request Feb 17, 2020

Q: Prerelease wheels no longer have OpenBLAS numpy/numpy#15580

Closed

mattip closed this Feb 26, 2020

mattip reopened this Feb 26, 2020

add urllib3 needed for openblas_support

130bc10

mattip closed this Mar 2, 2020

mattip reopened this Mar 2, 2020

Update multibuild and use xenial test images on 32-bit

c9ab82e

mattip force-pushed the update-subrepos branch from 8e3cb1e to c9ab82e Compare March 6, 2020 13:55

mattip commented Mar 6, 2020

View reviewed changes

charris reviewed Mar 6, 2020

View reviewed changes

fixes from review, cleanups, improve README

52f4b91

update .appveyor to get openblas from numpy/tools/openblas_support

1e6dae6

mattip force-pushed the update-subrepos branch from 6c5beef to 1e6dae6 Compare March 7, 2020 18:53

charris merged commit 0e39420 into MacPython:master Mar 7, 2020

mattip mentioned this pull request Aug 27, 2020

AVX test failures for 32 bit manylinux1 wheels numpy/numpy#17174

Closed

update submodules to latest versions #73

update submodules to latest versions #73

Conversation

mattip commented Feb 17, 2020

mattip commented Feb 17, 2020

charris commented Feb 17, 2020

mattip commented Feb 17, 2020 • edited Loading

mattip commented Feb 17, 2020

charris commented Feb 17, 2020

mattip commented Feb 17, 2020

charris commented Feb 17, 2020

mattip commented Feb 17, 2020

mattip commented Feb 17, 2020

pv commented Feb 17, 2020

r-devulap commented Feb 17, 2020

r-devulap commented Feb 17, 2020

mattip commented Feb 17, 2020

mattip commented Feb 17, 2020

mattip commented Feb 17, 2020

mattip commented Feb 17, 2020

r-devulap commented Feb 18, 2020

mattip commented Feb 18, 2020 • edited Loading

mattip commented Feb 18, 2020 • edited Loading

mattip commented Feb 18, 2020

mattip commented Feb 18, 2020

mattip commented Feb 18, 2020

seiko2plus commented Feb 18, 2020

mattip commented Feb 18, 2020 • edited Loading

mattip commented Feb 26, 2020

mattip commented Mar 2, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mattip commented Mar 6, 2020

charris Mar 6, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

charris Mar 6, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

charris commented Mar 7, 2020

charris commented Mar 7, 2020

mattip commented Mar 7, 2020

charris commented Mar 7, 2020

charris commented Mar 7, 2020

mattip commented Mar 7, 2020

charris commented Mar 7, 2020

mattip commented Mar 7, 2020

charris commented Mar 7, 2020

mattip commented Mar 8, 2020

mattip commented Feb 17, 2020 •

edited

Loading

mattip commented Feb 18, 2020 •

edited

Loading

mattip commented Feb 18, 2020 •

edited

Loading

mattip commented Feb 18, 2020 •

edited

Loading

charris Mar 6, 2020 •

edited

Loading

charris Mar 6, 2020 •

edited

Loading