DEPS: Add warning if pyarrow is not installed #56896

Merged 14 commits on Jan 19, 2024

5 changes: 4 additions & 1 deletion .github/workflows/unit-tests.yml
@@ -92,7 +92,10 @@ jobs:
           - name: "Numpy Dev"
             env_file: actions-311-numpydev.yaml
             pattern: "not slow and not network and not single_cpu"
-            test_args: "-W error::DeprecationWarning -W error::FutureWarning"
+            # Currently restricted the warnings that error to Deprecation Warnings from numpy
+            # done since pyarrow isn't compatible with numpydev always
+            # TODO: work with pyarrow to revert this?
+            test_args: "-W error::DeprecationWarning:numpy -W error::FutureWarning:numpy"
           - name: "Pyarrow Nightly"
             env_file: actions-311-pyarrownightly.yaml
             pattern: "not slow and not network and not single_cpu"

33 changes: 31 additions & 2 deletions pandas/__init__.py
@@ -202,8 +202,37 @@
         FutureWarning,
         stacklevel=2,
     )
-# Don't allow users to use pandas.os or pandas.warnings
-del os, warnings
+
+# DeprecationWarning for missing pyarrow
+from pandas.compat.pyarrow import pa_version_under10p1, pa_not_found
+
+if pa_version_under10p1:
+    # pyarrow is either too old or nonexistent, warn
+    from pandas.compat._optional import VERSIONS
+
+    if pa_not_found:
+        pa_msg = "was not found to be installed on your system."
+    else:
+        pa_msg = (
+            f"was too old on your system - pyarrow {VERSIONS['pyarrow']} "
+            "is the current minimum supported version as of this release."
+        )
+
+    warnings.warn(
+        f"""
+Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
+(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
+but {pa_msg}
+If this would cause problems for you,
+please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
+        """,  # noqa: E501
+        DeprecationWarning,
+        stacklevel=2,

Member

Do we want to make this stacklevel=1?

Because this code is top-level and not inside a function or method, it is one less than the typical stacklevel of 2 we use when warning directly from a function/method.
Setting it to 1 would avoid triggering the warning when you call a function that has an inline import pandas (not sure how important this is, though).

Member Author

The warning doesn't pop up in the REPL when I import pandas if I do stacklevel=1.

Member

Hmm, OK, forget this. I did test it, but I must have done something wrong, because now I can't reproduce getting the warning with a plain import and stacklevel=1.
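
The stacklevel question in this thread comes down to which frame a warning is attributed to. A minimal sketch using only the standard-library `warnings` and `dis` modules; the helper names `emit`, `caller`, and `attributed_to` are hypothetical and not part of pandas:

```python
import dis
import warnings


def emit(stacklevel):
    # Hypothetical helper (not pandas code): stacklevel=1 attributes the
    # warning to this line, stacklevel=2 to the line that called emit().
    warnings.warn("demo warning", DeprecationWarning, stacklevel=stacklevel)


def caller(stacklevel):
    emit(stacklevel)


def attributed_to(stacklevel):
    """Return the name of the function whose source line the warning reports."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, whatever the defaults
        caller(stacklevel)
    lineno = caught[0].lineno
    for func in (emit, caller):
        # Collect the source lines that belong to this function's bytecode.
        lines = {line for _, line in dis.findlinestarts(func.__code__)}
        if lineno in lines:
            return func.__name__
    return None


print(attributed_to(1))
print(attributed_to(2))
```

Under Python's default filters, a DeprecationWarning is displayed only when it is attributed to code running in `__main__`. That may explain the REPL observation in this thread: with stacklevel=1 the warning is attributed to pandas/__init__.py itself rather than to the importing code, so the default filters hide it.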

+    )
+    del VERSIONS, pa_msg
+
+# Delete all unnecessary imported modules
+del pa_version_under10p1, pa_not_found, warnings, os

 # module level doc-string
 __doc__ = """

2 changes: 2 additions & 0 deletions pandas/compat/pyarrow.py
@@ -8,6 +8,7 @@
     import pyarrow as pa

     _palv = Version(Version(pa.__version__).base_version)
+    pa_not_found = False
     pa_version_under10p1 = _palv < Version("10.0.1")
     pa_version_under11p0 = _palv < Version("11.0.0")
     pa_version_under12p0 = _palv < Version("12.0.0")
@@ -16,6 +17,7 @@
     pa_version_under14p1 = _palv < Version("14.0.1")
     pa_version_under15p0 = _palv < Version("15.0.0")
 except ImportError:
+    pa_not_found = True
     pa_version_under10p1 = True
     pa_version_under11p0 = True
     pa_version_under12p0 = True

22 changes: 22 additions & 0 deletions pandas/tests/test_common.py
@@ -8,6 +8,8 @@
 import numpy as np
 import pytest

+import pandas.util._test_decorators as td
+
 import pandas as pd
 from pandas import Series
 import pandas._testing as tm
@@ -265,3 +267,23 @@ def test_bz2_missing_import():
     code = textwrap.dedent(code)
     call = [sys.executable, "-c", code]
     subprocess.check_output(call)
+
+
+@td.skip_if_installed("pyarrow")
+@pytest.mark.parametrize("module", ["pandas", "pandas.arrays"])
+def test_pyarrow_missing_warn(module):
+    # GH56896
+    response = subprocess.run(
+        [sys.executable, "-c", f"import {module}"],
+        capture_output=True,
+        check=True,
+    )
+    msg = """
+Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
+(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
+but was not found to be installed on your system.
+If this would cause problems for you,
+please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
+"""  # noqa: E501
+    stderr_msg = response.stderr.decode("utf-8")
+    assert msg in stderr_msg, stderr_msg