Skip to content

REF: rename _data->_mgr #33054

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 6, 2020
Merged

Conversation

jbrockmendel
Copy link
Member

We have a long-term goal of reducing the exposed surface of the internals. One difficulty is that it is not trivial to identify all the places that access the internal, in part because grepping for ._data will turn up a ton of non-internals Index and EA attributes. By renaming _data -> _mgr, we'll make it easier to identify places that we can de-internalize.

If this isn't a route we want to go down (in the past when I've pitched this it hasn't caught on), I'll try to keep this up-to-date so we an use the diff as the relevant reference.

@jorisvandenbossche
Copy link
Member

Your previous attempt was here: #27254

I am +1 on a rename, it will indeed make checking all cases where the BlockManager is accessed much easier to find.

But, we will need to keep _data as an alias. The BlockManager is enough a fundamental part of pandas that other projects rely on it (eg arrow does)

@jreback
Copy link
Contributor

jreback commented Mar 29, 2020

ok with this, I would add back a ._data property to BlockManager with a DeprecationWarning to get downstream to remove this dep. Also create an issue to remove this at some point.

@jreback jreback added Internals Related to non-user accessible pandas implementation API - Consistency Internal Consistency of API/Behavior labels Mar 29, 2020
@jreback jreback added this to the 1.1 milestone Mar 29, 2020
@jreback
Copy link
Contributor

jreback commented Mar 29, 2020

ping when green (maybe need to rebase at end of day today)

@jbrockmendel
Copy link
Member Author

ok with this, I would add back a ._data property to BlockManager with a DeprecationWarning to get downstream to remove this dep. Also create an issue to remove this at some point.

The property is already there, but not currently with a DeprecationWarning. @jorisvandenbossche are you OK with warning?

@jorisvandenbossche
Copy link
Member

Since we are only doing this to make searching in our code base easier, and not because something is broken or it is a wrong name or something else we try to solve, I don't see a reason why we should bother ourselves and downstream packages with a deprecation cycle.

@jbrockmendel
Copy link
Member Author

I don't see a reason why we should bother ourselves and downstream packages with a deprecation cycle.

I think what @jreback is getting at is that we would like to wean downstream packages off of internals, but I agree that this can be considered separately from the renaming.

@jorisvandenbossche
Copy link
Member

Well, right now, I suppose most downstream packages would probably just add some compatibility code to use the old/new name depending on the version, as they might have legitimate use cases for getting the blocks.
So since we are not changing anything but the name (eg we don't provide an alternative API to get the blocks), I am not sure what the value is of deprecation.

@jbrockmendel
Copy link
Member Author

So since we are not changing anything but the name (eg we don't provide an alternative API to get the blocks), I am not sure what the value is of deprecation.

Yah, im fine with that. Let's stick a pin into deprecation and just do the renaming.

@jbrockmendel
Copy link
Member Author

i think we have consensus here?

@jorisvandenbossche
Copy link
Member

@jreback can you respond here?

@jreback
Copy link
Contributor

jreback commented Apr 6, 2020

ok to merge, though still think we should show a DeprecationWarning (to have any libraries change to use the new code) and its not user visible anyhow (can be done as a followon).

@jorisvandenbossche jorisvandenbossche merged commit f0ab1c5 into pandas-dev:master Apr 6, 2020
@jorisvandenbossche
Copy link
Member

Thanks Brock!

@jorisvandenbossche
Copy link
Member

@jreback can you open an issue for that if you want to push for it? (I am still -1 based on my explanation above)

@jbrockmendel jbrockmendel deleted the data-mgr branch April 6, 2020 18:07
@jreback
Copy link
Contributor

jreback commented Apr 6, 2020

@jreback can you open an issue for that if you want to push for it? (I am still -1 based on my explanation above)

sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API - Consistency Internal Consistency of API/Behavior Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants