Skip to content

DEPR: Enforce deprecation of ArrayManager #57118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jan 31, 2024

Conversation

rhshadrach
Copy link
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@rhshadrach rhshadrach requested a review from Dr-Irv as a code owner January 28, 2024 18:24
@rhshadrach rhshadrach added Deprecate Functionality to remove in pandas ArrayManager labels Jan 28, 2024
@rhshadrach rhshadrach added this to the 3.0 milestone Jan 28, 2024
Copy link
Member Author

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel - some comments and questions below

Comment on lines 385 to 386
SingleManager: TypeAlias = "SingleBlockManager"
Manager2D: TypeAlias = "BlockManager"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if these are still useful - can remove

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only useful if they are used elsewhere in the code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are used - but I'm not sure that makes them useful. We can replace any current usage of SingleManager with SingleBlockManager.

I tracked it down, this was added in #40152, so I think it makes sense to remove.

@@ -289,6 +289,7 @@ def get_dtypes(self) -> npt.NDArray[np.object_]:
dtypes = np.array([blk.dtype for blk in self.blocks], dtype=object)
return dtypes.take(self.blknos)

# TODO: This is being used in a few places
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing arrays being used in a few places, if this should still be removed then I'm thinking it's best for a followup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing this used in

  • DataFrame._is_homogeneous_type
  • DataFrame._reduce
  • DataFrame._reduce_axis1
  • DataFrame.shift
  • NDFrame._logical_func
  • NDFrame.astype

I'm thinking to just update the docstring for now, removing any mentions of the ArrayManager.

Comment on lines 371 to 373
# TODO: Cleanup?
# Alias so we can share code with ArrayManager
apply_with_block = apply
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this alias be removed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can go, as can all of inernals.base IIRC

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Because part of the implementation needs to be moved from internals.base, I'd like to take that on in a followup.

Copy link
Contributor

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only looked at _typing.py

@@ -23,6 +23,7 @@
Optional,
Protocol,
Type as type_t,
TypeAlias,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not supported in python 3.9, so you should import it from typing_extensions

Comment on lines 385 to 386
SingleManager: TypeAlias = "SingleBlockManager"
Manager2D: TypeAlias = "BlockManager"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only useful if they are used elsewhere in the code

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also remove PANDAS_DATA_MANAGER in pandas/__init__.py

And add a whatsnew note in 3.0?

@rhshadrach
Copy link
Member Author

Thanks @Dr-Irv, @jbrockmendel, and @mroeschke. I think this is ready for another look. Two outstanding issues:

Copy link
Contributor

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked at _typing.py and I'm fine with that.

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a few more "mode.data_manager" usages need to be removed

@@ -701,7 +696,7 @@ def __init__(
# to avoid the result sharing the same Manager
data = data.copy(deep=False)

if isinstance(data, (BlockManager, ArrayManager)):
if isinstance(data, (BlockManager)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra parens

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other than this LGTM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to merge once the merge conflict is fixed

@mroeschke mroeschke merged commit c3014ab into pandas-dev:main Jan 31, 2024
@mroeschke
Copy link
Member

Thanks @rhshadrach

@rhshadrach rhshadrach deleted the enf_array_manager branch January 31, 2024 23:39
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
* DEPR: Enforce deprecation of ArrayManager

* cleanups

* More removals

* whatsnew

* Cleanups

* More removals and whatsnew

* cleanup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants