BUG on main: DeprecationWarning triggered by internal read_orc/parquet code #56171

Closed
3 tasks done
twoertwein opened this issue Nov 26, 2023 · 2 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@twoertwein
Member

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# Any call to read_orc ...
pd.read_orc(...)
# ... and any call to read_parquet ...
pd.read_parquet(...)
# ... triggers:
# DeprecationWarning: Passing a BlockManager to DataFrame is deprecated and will raise in a future version. Use public APIs instead.

Issue Description

The above happens only on main.

From read_parquet

../../.cache/pypoetry/virtualenvs/pandas-stubs-DrIM1v70-py3.11/lib/python3.11/site-packages/pandas/io/parquet.py:671: in read_parquet
    return impl.read(
../../.cache/pypoetry/virtualenvs/pandas-stubs-DrIM1v70-py3.11/lib/python3.11/site-packages/pandas/io/parquet.py:280: in read
    result = pa_table.to_pandas(**to_pandas_kwargs)
pyarrow/array.pxi:884: in pyarrow.lib._PandasConvertible.to_pandas
    ???
pyarrow/table.pxi:4196: in pyarrow.lib.Table._to_pandas
    ???
pyarrow/pandas-shim.pxi:112: in pyarrow.lib._PandasAPIShim.data_frame
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[AttributeError("'DataFrame' object has no attribute '_mgr'") raised in repr()] DataFrame object at 0x7fdf70635210>
data = BlockManager
Items: Index(['a', 'b'], dtype='object')
Axis 1: Index([0, 1, 2], dtype='int64')
NumpyBlock: slice(0, 1, 1), 1 x 3, dtype: int64
NumpyBlock: slice(1, 2, 1), 1 x 3, dtype: float64
index = None, columns = None, dtype = None, copy = None

    def __init__(
        self,
        data=None,
        index: Axes | None = None,
        columns: Axes | None = None,
        dtype: Dtype | None = None,
        copy: bool | None = None,
    ) -> None:
        allow_mgr = False
        if dtype is not None:
            dtype = self._validate_dtype(dtype)

        if isinstance(data, DataFrame):
            data = data._mgr
            allow_mgr = True
            if not copy:
                # if not copying data, ensure to still return a shallow copy
                # to avoid the result sharing the same Manager
                data = data.copy(deep=False)

        if isinstance(data, (BlockManager, ArrayManager)):
            if not allow_mgr:
                # GH#52419
>               warnings.warn(
                    f"Passing a {type(data).__name__} to {type(self).__name__} "
                    "is deprecated and will raise in a future version. "
                    "Use public APIs instead.",
                    DeprecationWarning,
                    stacklevel=1,  # bump to 2 once pyarrow 15.0 is released with fix
                )
E               DeprecationWarning: Passing a BlockManager to DataFrame is deprecated and will raise in a future version. Use public APIs instead.
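
The `stacklevel` argument in the traceback above decides which frame a warning is attributed to. The following is a minimal, stdlib-only sketch of the difference between `stacklevel=1` and `stacklevel=2`. The function names are hypothetical stand-ins, not pandas or pyarrow code.

```python
import warnings


def library_internal(stacklevel):
    # Hypothetical stand-in for a constructor that emits the deprecation.
    warnings.warn("deprecated", DeprecationWarning, stacklevel=stacklevel)


def third_party_caller(stacklevel):
    # Hypothetical stand-in for third-party code calling the constructor.
    library_internal(stacklevel)


def warning_lineno(stacklevel):
    """Return the line number the recorded warning is attributed to."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        third_party_caller(stacklevel)
    return caught[0].lineno


# stacklevel=1 points at the warnings.warn(...) line itself;
# stacklevel=2 points one frame up, at the call inside third_party_caller.
print(warning_lineno(1), warning_lineno(2))
```

With `stacklevel=1` the warning is reported at the `warnings.warn` call. With `stacklevel=2` it is reported at the caller's line, which in this issue would be inside pyarrow rather than in user code.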

xref pandas-dev/pandas-stubs#819

Expected Behavior

No warning

Installed Versions

On main

@twoertwein twoertwein added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 26, 2023
@twoertwein twoertwein changed the title from "BUG:" to "BUG on main: DeprecationWarning triggered by internal read_orc/parquet code" Nov 26, 2023
@twoertwein
Member Author

Technically, this was not tested on main itself but with the nightly builds used by pandas-stubs.

@phofl
Member

phofl commented Nov 26, 2023

The warning originates in Arrow (pyarrow), so there is not much we can do on our side for now.
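
Until a fix lands upstream, callers who want quiet output can filter this one warning on their side. The following is a minimal sketch using only the standard `warnings` module. `emit_blockmanager_warning` is a hypothetical stand-in for `pd.read_parquet(...)` or `pd.read_orc(...)`, and `call_quietly` is an illustrative helper, not a pandas API.

```python
import warnings

# Start of the message quoted in this issue; filterwarnings treats it as a
# regex matched against the beginning of the warning text.
BLOCKMANAGER_MSG = "Passing a BlockManager to DataFrame is deprecated"


def emit_blockmanager_warning():
    """Hypothetical stand-in for pd.read_parquet(...) / pd.read_orc(...)."""
    warnings.warn(
        "Passing a BlockManager to DataFrame is deprecated and will raise "
        "in a future version. Use public APIs instead.",
        DeprecationWarning,
        stacklevel=1,
    )
    return "result"


def call_quietly(func, *args, **kwargs):
    """Call func while ignoring only the BlockManager DeprecationWarning."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message=BLOCKMANAGER_MSG,
            category=DeprecationWarning,
        )
        return func(*args, **kwargs)


value = call_quietly(emit_blockmanager_warning)
```

The filter is scoped to the `with` block and to this one message, so other deprecation warnings still surface. In a test suite, the same message pattern could be used in the test runner's warning-filter configuration instead.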
