BLD, TST: Build and test Pyodide wheels for pandas in CI #57896

Merged
merged 39 commits into from
May 8, 2024
Changes from 14 commits
Commits
39 commits
748432e
Create initial Pyodide workflow
agriyakhetarpal Mar 4, 2024
05b2400
Do not import pandas folder from the repo
agriyakhetarpal Mar 5, 2024
f159469
Install hypothesis for testing
agriyakhetarpal Mar 5, 2024
1713c86
Add pytest decorator to skip tests on WASM
agriyakhetarpal Mar 5, 2024
47f48a6
Skip `time.tzset()` tests on WASM platforms
agriyakhetarpal Mar 5, 2024
6cf568e
Skip file system access tests on WASM
agriyakhetarpal Mar 5, 2024
9fe1904
Skip two more tzset test failures
agriyakhetarpal Mar 5, 2024
3f07fa9
Skip two more FS failures on WASM
agriyakhetarpal Mar 5, 2024
bbc8868
Resolve last two tzset failures on WASM
agriyakhetarpal Mar 5, 2024
f049ac2
Add a `WASM` constant for Emscripten platform checks
agriyakhetarpal Mar 5, 2024
4d4c017
Fix floating point imprecision with `np.timedelta64`
agriyakhetarpal Mar 5, 2024
feaac77
Mark tz OverflowError as xfail on WASM
agriyakhetarpal Mar 5, 2024
f8b5831
Merge branch 'main' into add-emscripten-ci
agriyakhetarpal Mar 18, 2024
d54d198
Try to fix OverflowError with date ranges
agriyakhetarpal Mar 20, 2024
294ab6e
Move job to unit tests workflow, withdraw env vars
agriyakhetarpal Mar 20, 2024
cc233e4
Fix up a few style errors, use WASM variable
agriyakhetarpal Mar 20, 2024
dab4675
Merge branch 'main' into add-emscripten-ci
agriyakhetarpal Apr 1, 2024
a944f52
Bump Pyodide to `0.25.1`
agriyakhetarpal Apr 1, 2024
8a61292
Merge latest changes from main
agriyakhetarpal May 3, 2024
6ba8636
Use shorter job name
agriyakhetarpal May 3, 2024
75da87f
Skip test where warning is not raised properly
agriyakhetarpal May 3, 2024
8c357a3
Don't run `test_date_time` loc check on WASM
agriyakhetarpal May 3, 2024
2a3270f
Don't run additional loc checks in `test_sas7bdat`
agriyakhetarpal May 3, 2024
e1002f5
Disable WASM OverflowError
agriyakhetarpal May 3, 2024
13973bb
Skip tests requiring fp exception support
agriyakhetarpal May 3, 2024
24b3e6d
xfail tests that require stricter tolerances
agriyakhetarpal May 3, 2024
dce2705
xfail test where `OverflowError`s are received
agriyakhetarpal May 3, 2024
029de34
Merge branch 'main' into add-emscripten-ci
agriyakhetarpal May 3, 2024
7f4715f
Remove upper-pin from `pydantic`
agriyakhetarpal May 3, 2024
9f46528
Better skip messages via `pytest.skipif` decorator
agriyakhetarpal May 3, 2024
51f8893
Import `WASM` var via public API where possible
agriyakhetarpal May 3, 2024
ab911d1
Unpin `pytest` for Pyodide job
agriyakhetarpal May 3, 2024
b262b35
Merge main again
agriyakhetarpal May 3, 2024
7371f64
Add reason attr when using boolean to skip test
agriyakhetarpal May 3, 2024
05b19a4
Merge branch 'main' into add-emscripten-ci
agriyakhetarpal May 3, 2024
ef7a3ab
Don't xfail, skip tests that bring `OverflowError`s
agriyakhetarpal May 3, 2024
c089852
Skip timedelta test that runs well only on 64-bit
agriyakhetarpal May 3, 2024
6a907c1
Skip tests that use `np.timedelta64`
agriyakhetarpal May 3, 2024
96488f4
Merge branch 'main' into add-emscripten-ci
mroeschke May 7, 2024
76 changes: 76 additions & 0 deletions .github/workflows/emscripten.yml
@@ -0,0 +1,76 @@
name: Test Emscripten/Pyodide build

on:
  # TODO: refine when this workflow should run when this
  # is ready for use or before merging
  pull_request:
  push:
  workflow_dispatch:

env:
  FORCE_COLOR: 3

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  build-wasm-emscripten:
    name: Build pandas distribution for Pyodide
    runs-on: ubuntu-22.04
    # To enable this workflow on a fork, comment out:
    # if: github.repository == 'pandas-dev/pandas'
    env:
      PYODIDE_VERSION: 0.25.0
      # PYTHON_VERSION and EMSCRIPTEN_VERSION are determined by PYODIDE_VERSION.
      # The appropriate versions can be found in the Pyodide repodata.json
      # "info" field, or in Makefile.envs:
      # https://github.com/pyodide/pyodide/blob/main/Makefile.envs#L2
      PYTHON_VERSION: 3.11.3
      EMSCRIPTEN_VERSION: 3.1.46
      NODE_VERSION: 18
    steps:
      - name: Checkout pandas
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python ${{ env.PYTHON_VERSION }}
        id: setup-python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Set up Emscripten toolchain
        uses: mymindstorm/setup-emsdk@v14
        with:
          version: ${{ env.EMSCRIPTEN_VERSION }}
          actions-cache-folder: emsdk-cache

      - name: Install pyodide-build
        run: pip install "pydantic<2" pyodide-build==${{ env.PYODIDE_VERSION }}

      - name: Build pandas for Pyodide
        run: |
          # pyodide build -Ceditable-verbose=true
          pyodide build

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}

      - name: Set up Pyodide virtual environment
        run: |
          pyodide venv .venv-pyodide
          source .venv-pyodide/bin/activate
          pip install dist/*.whl

      - name: Test pandas for Pyodide
        run: |
          source .venv-pyodide/bin/activate
          export PANDAS_CI=1
          pip install "pytest<8.1.0" hypothesis
          # do not import pandas from the checked out repo
          cd ..
          python -c 'import pandas as pd; pd.test(extra_args=["-m not clipboard and not single_cpu and not slow and not network and not db"])'
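The test step above deselects five marker groups with a single `-m` expression. A minimal sketch of how such an expression could be assembled from a list of marker names; `deselect_expr` is a hypothetical helper for illustration, not part of pandas or pytest:

```python
def deselect_expr(markers):
    """Build a pytest ``-m`` expression that excludes every marker given.

    Hypothetical helper; the marker names themselves come from the workflow.
    """
    return " and ".join(f"not {name}" for name in markers)


excluded = ["clipboard", "single_cpu", "slow", "network", "db"]
expr = deselect_expr(excluded)

# The workflow hands the flag and the expression to pd.test as one string.
extra_args = [f"-m {expr}"]
print(extra_args[0])
```

Keeping the marker names in a list makes it easier to see which groups are excluded when the set changes.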
2 changes: 2 additions & 0 deletions pandas/compat/__init__.py
@@ -22,6 +22,7 @@
PY311,
PY312,
PYPY,
WASM,
)
import pandas.compat.compressors
from pandas.compat.numpy import is_numpy_dev
@@ -193,4 +194,5 @@ def get_bz2_file() -> type[pandas.compat.compressors.BZ2File]:
"PY311",
"PY312",
"PYPY",
"WASM",
]
2 changes: 2 additions & 0 deletions pandas/compat/_constants.py
@@ -17,6 +17,7 @@
PY311 = sys.version_info >= (3, 11)
PY312 = sys.version_info >= (3, 12)
PYPY = platform.python_implementation() == "PyPy"
WASM = (sys.platform == "emscripten") or (platform.machine() in ["wasm32", "wasm64"])
ISMUSL = "musl" in (sysconfig.get_config_var("HOST_GNU_TYPE") or "")
REF_COUNT = 2 if PY311 else 3

@@ -27,4 +28,5 @@
"PY311",
"PY312",
"PYPY",
"WASM",
]
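The `WASM` constant combines two signals: `sys.platform` for Emscripten builds and `platform.machine()` for a `wasm32` or `wasm64` architecture. A minimal sketch of the same check written as a function, so both branches can be exercised on any host; `is_wasm` and its parameters are hypothetical names, not pandas API:

```python
import platform
import sys


def is_wasm(sys_platform=None, machine=None):
    """Return True when the given (or current) platform looks like WebAssembly.

    Both arguments default to the values of the running interpreter.
    """
    if sys_platform is None:
        sys_platform = sys.platform
    if machine is None:
        machine = platform.machine()
    return sys_platform == "emscripten" or machine in ("wasm32", "wasm64")


print(is_wasm("emscripten", "wasm32"))
print(is_wasm("linux", "x86_64"))
```

Passing the values explicitly is only for illustration; the constant in the diff evaluates once at import time.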
1 change: 1 addition & 0 deletions pandas/tests/indexes/datetimes/methods/test_normalize.py
@@ -70,6 +70,7 @@ def test_normalize_tz(self):
assert not rng.is_normalized

@td.skip_if_windows
@td.skip_if_wasm # tzset is available only on Unix-like systems
@pytest.mark.parametrize(
"timezone",
[
4 changes: 2 additions & 2 deletions pandas/tests/indexes/datetimes/methods/test_resolution.py
@@ -1,7 +1,7 @@
from dateutil.tz import tzlocal
import pytest

-from pandas.compat import IS64
+from pandas.compat import IS64, WASM

from pandas import date_range

@@ -22,7 +22,7 @@
)
def test_dti_resolution(request, tz_naive_fixture, freq, expected):
tz = tz_naive_fixture
-if freq == "YE" and not IS64 and isinstance(tz, tzlocal):
+if freq == "YE" and ((not IS64) or WASM) and isinstance(tz, tzlocal):
request.applymarker(
pytest.mark.xfail(reason="OverflowError inside tzlocal past 2038")
)
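The xfail reason refers to the year 2038. A short sketch of where that boundary sits, assuming the failure stems from the usual signed 32-bit seconds-since-epoch limit (the diff itself does not state the cause):

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

# Last instant representable as signed 32-bit seconds since the Unix epoch.
last_32bit_instant = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)
print(last_32bit_instant.isoformat())
```

Under that assumption, any local-time conversion for a date after this instant would need a value larger than `INT32_MAX`.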
3 changes: 3 additions & 0 deletions pandas/tests/io/parser/common/test_file_buffer_url.py
@@ -80,6 +80,7 @@ def test_path_path_lib(all_parsers):
tm.assert_frame_equal(df, result)


@td.skip_if_wasm # limited file system access on WASM
def test_nonexistent_path(all_parsers):
# gh-2428: pls no segfault
# gh-14086: raise more helpful FileNotFoundError
@@ -93,6 +94,8 @@ def test_nonexistent_path(all_parsers):
assert path == e.value.filename


@td.skip_if_wasm # limited file system access on WASM, it leads to different
# error messages than on other platforms
@td.skip_if_windows # os.chmod does not work in windows
def test_no_permission(all_parsers):
# GH 23784
1 change: 1 addition & 0 deletions pandas/tests/io/parser/test_c_parser_only.py
@@ -550,6 +550,7 @@ def test_chunk_whitespace_on_boundary(c_parser_only):
tm.assert_frame_equal(result, expected)


@td.skip_if_wasm # limited file system access on WASM
def test_file_handles_mmap(c_parser_only, csv1):
# gh-14418
#
10 changes: 5 additions & 5 deletions pandas/tests/io/sas/test_sas7bdat.py
@@ -7,7 +7,7 @@
import numpy as np
import pytest

-from pandas.compat import IS64
+from pandas.compat import IS64, WASM
from pandas.errors import EmptyDataError

import pandas as pd
@@ -190,7 +190,7 @@ def test_date_time(datapath):
res = df0["DateTimeHi"].astype("M8[us]").dt.round("ms")
df0["DateTimeHi"] = res.astype("M8[ms]")

-if not IS64:
+if (not IS64) or (WASM):
# No good reason for this, just what we get on the CI
df0.loc[0, "DateTimeHi"] += np.timedelta64(1, "ms")
df0.loc[[2, 3], "DateTimeHi"] -= np.timedelta64(1, "ms")
@@ -285,7 +285,7 @@ def test_max_sas_date(datapath):
columns=["text", "dt_as_float", "dt_as_dt", "date_as_float", "date_as_date"],
)

-if not IS64:
+if (not IS64) or (WASM):
# No good reason for this, just what we get on the CI
expected.loc[:, "dt_as_dt"] -= np.timedelta64(1, "ms")

@@ -328,7 +328,7 @@ def test_max_sas_date_iterator(datapath):
columns=col_order,
),
]
-if not IS64:
+if (not IS64) or (WASM):
# No good reason for this, just what we get on the CI
expected[0].loc[0, "dt_as_dt"] -= np.timedelta64(1, "ms")
expected[1].loc[0, "dt_as_dt"] -= np.timedelta64(1, "ms")
@@ -359,7 +359,7 @@ def test_null_date(datapath):
),
},
)
-if not IS64:
+if (not IS64) or (WASM):
# No good reason for this, just what we get on the CI
expected.loc[0, "datetimecol"] -= np.timedelta64(1, "ms")
tm.assert_frame_equal(df, expected)
8 changes: 7 additions & 1 deletion pandas/tests/io/test_common.py
@@ -23,6 +23,7 @@

import pandas as pd
import pandas._testing as tm
import pandas.util._test_decorators as td

import pandas.io.common as icom

@@ -163,6 +164,7 @@ def test_iterator(self):
tm.assert_frame_equal(first, expected.iloc[[0]])
tm.assert_frame_equal(pd.concat(it), expected.iloc[1:])

@td.skip_if_wasm # limited file system access on WASM
@pytest.mark.parametrize(
"reader, module, error_class, fn_ext",
[
@@ -228,6 +230,8 @@ def test_write_missing_parent_directory(self, method, module, error_class, fn_ex
):
method(dummy_frame, path)


@td.skip_if_wasm # limited file system access on WASM
@pytest.mark.parametrize(
"reader, module, error_class, fn_ext",
[
@@ -382,6 +386,7 @@ def mmap_file(datapath):


class TestMMapWrapper:
@td.skip_if_wasm # limited file system access on WASM
def test_constructor_bad_file(self, mmap_file):
non_file = StringIO("I am not a file")
non_file.fileno = lambda: -1
@@ -404,6 +409,7 @@ def test_constructor_bad_file(self, mmap_file):
with pytest.raises(ValueError, match=msg):
icom._maybe_memory_map(target, True)

@td.skip_if_wasm # limited file system access on WASM
def test_next(self, mmap_file):
with open(mmap_file, encoding="utf-8") as target:
lines = target.readlines()
@@ -586,7 +592,7 @@ def test_bad_encdoing_errors():
with pytest.raises(LookupError, match="unknown error handler name"):
icom.get_handle(path, "w", errors="bad")


@td.skip_if_wasm # limited file system access on WASM
def test_errno_attribute():
# GH 13872
with pytest.raises(FileNotFoundError, match="\\[Errno 2\\]") as err:
2 changes: 1 addition & 1 deletion pandas/tests/io/xml/test_xml.py
@@ -484,7 +484,7 @@ def test_empty_string_etree(val):
with pytest.raises(ParseError, match="no element found"):
read_xml(data, parser="etree")


@td.skip_if_wasm # limited file system access on WASM
def test_wrong_file_path(parser):
filename = os.path.join("does", "not", "exist", "books.xml")

2 changes: 2 additions & 0 deletions pandas/tests/scalar/timestamp/methods/test_replace.py
@@ -99,13 +99,15 @@ def test_replace_integer_args(self, tz_aware_fixture):
with pytest.raises(ValueError, match=msg):
ts.replace(hour=0.1)

@td.skip_if_wasm # tzset is available only on Unix-like systems
def test_replace_tzinfo_equiv_tz_localize_none(self):
# GH#14621, GH#7825
# assert conversion to naive is the same as replacing tzinfo with None
ts = Timestamp("2013-11-03 01:59:59.999999-0400", tz="US/Eastern")
assert ts.tz_localize(None) == ts.replace(tzinfo=None)

@td.skip_if_windows
@td.skip_if_wasm # tzset is available only on Unix-like systems
def test_replace_tzinfo(self):
# GH#15683
dt = datetime(2016, 3, 27, 1)
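Several tests in this diff are skipped with the note that `tzset` is available only on Unix-like systems. A minimal sketch of a guard around `time.tzset()`, assuming the skipped tests depend on switching the process time zone; `temporary_tz` is a hypothetical helper written for illustration and is not pandas' own `tm.set_timezone`:

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def temporary_tz(name):
    """Temporarily switch the process time zone (hypothetical helper)."""
    if not hasattr(time, "tzset"):
        # time.tzset() is documented as available on Unix only.
        raise RuntimeError("time.tzset() is not available on this platform")
    previous = os.environ.get("TZ")
    os.environ["TZ"] = name
    time.tzset()
    try:
        yield
    finally:
        # Restore the earlier setting, or remove TZ if it was unset.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()
```

On a platform without a working `tzset`, a test built on such a helper cannot run as intended, so a skip marker is the simpler choice.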
@@ -11,6 +11,7 @@

class TestTimestampMethod:
@td.skip_if_windows
@td.skip_if_wasm # tzset is available only on Unix-like systems
def test_timestamp(self, fixed_now_ts):
# GH#17329
# tz-naive --> treat it as if it were UTC for purposes of timestamp()
2 changes: 2 additions & 0 deletions pandas/tests/scalar/timestamp/test_formats.py
@@ -6,6 +6,7 @@
import pytz # a test below uses pytz but only inside a `eval` call

from pandas import Timestamp
import pandas.util._test_decorators as td

ts_no_ns = Timestamp(
year=2019,
@@ -95,6 +96,7 @@ class TestTimestampRendering:
@pytest.mark.parametrize(
"date", ["2014-03-07", "2014-01-01 09:00", "2014-01-01 00:00:00.000000001"]
)
@td.skip_if_wasm # tzset is not available in WASM
def test_repr(self, date, freq, tz):
# avoid to match with timezone name
freq_repr = f"'{freq}'"
5 changes: 4 additions & 1 deletion pandas/tests/tools/test_to_datetime.py
@@ -959,6 +959,7 @@ def test_to_datetime_YYYYMMDD(self):
assert actual == datetime(2008, 1, 15)

@td.skip_if_windows # `tm.set_timezone` does not work in windows
@td.skip_if_wasm # tzset is available only on Unix-like systems
def test_to_datetime_now(self):
# See GH#18666
with tm.set_timezone("US/Eastern"):
@@ -975,7 +976,8 @@ def test_to_datetime_now(self):
assert pdnow.tzinfo is None
assert pdnow2.tzinfo is None

-@td.skip_if_windows # `tm.set_timezone` does not work in windows
+@td.skip_if_windows # `tm.set_timezone` does not work on Windows
@td.skip_if_wasm # tzset is available only on Unix-like systems
@pytest.mark.parametrize("tz", ["Pacific/Auckland", "US/Samoa"])
def test_to_datetime_today(self, tz):
# See GH#18666
@@ -1007,6 +1009,7 @@ def test_to_datetime_today_now_unicode_bytes(self, arg):
to_datetime([arg])

@pytest.mark.filterwarnings("ignore:Timestamp.utcnow is deprecated:FutureWarning")
@td.skip_if_wasm # tzset is available only on Unix-like systems
@pytest.mark.parametrize(
"format, expected_ds",
[
3 changes: 2 additions & 1 deletion pandas/tests/tseries/offsets/test_common.py
@@ -9,6 +9,7 @@
)
from pandas.compat import (
IS64,
WASM,
is_platform_windows,
)

@@ -130,7 +131,7 @@ def test_apply_out_of_range(request, tz_naive_fixture, _offset):
if tz is not None:
assert t.tzinfo is not None

-if isinstance(tz, tzlocal) and not IS64 and _offset is not DateOffset:
+if isinstance(tz, tzlocal) and ((not IS64) or WASM) and _offset is not DateOffset:
# If we hit OutOfBoundsDatetime on non-64 bit machines
# we'll drop out of the try clause before the next test
request.applymarker(
2 changes: 1 addition & 1 deletion pandas/tests/tslibs/test_parsing.py
@@ -25,7 +25,7 @@
import pandas._testing as tm
from pandas._testing._hypothesis import DATETIME_NO_TZ


@td.skip_if_wasm # tzset is available only on Unix-like systems
@pytest.mark.skipif(
is_platform_windows() or ISMUSL,
reason="TZ setting incorrect on Windows and MUSL Linux",
6 changes: 6 additions & 0 deletions pandas/util/_test_decorators.py
@@ -26,6 +26,8 @@ def test_foo():

from __future__ import annotations

import sys
import platform
import locale
from typing import (
TYPE_CHECKING,
@@ -115,6 +117,10 @@ def skip_if_no(package: str, min_version: str | None = None) -> pytest.MarkDecor
    locale.getlocale()[0] != "en_US",
    reason=f"Set local {locale.getlocale()[0]} is not en_US",
)
skip_if_wasm = pytest.mark.skipif(
    (sys.platform == "emscripten") or (platform.machine() in ["wasm32", "wasm64"]),
    reason="does not support wasm"
)
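The `skip_if_wasm` marker added here repeats the platform check inline. A minimal sketch of one possible arrangement, building the marker from a single `WASM` flag (mirroring the constant added in `pandas/compat/_constants.py`) and applying it to a placeholder test; `test_reads_local_file` is a hypothetical name, and this is not necessarily how the final PR is written:

```python
import platform
import sys

import pytest

# Same condition as the WASM constant in this diff, defined locally so the
# sketch is self-contained.
WASM = (sys.platform == "emscripten") or (platform.machine() in ["wasm32", "wasm64"])

# A reusable marker: every test decorated with it is skipped when WASM is true.
skip_if_wasm = pytest.mark.skipif(WASM, reason="does not support wasm")


@skip_if_wasm
def test_reads_local_file():
    # Placeholder body; a real test would touch the file system here.
    assert True
```

Defining the condition once keeps the constant and the marker from drifting apart if the platform check changes.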


def parametrize_fixture_doc(*args) -> Callable[[F], F]: