Skip to content

REF: simplify _from_mgr constructors, use in more places #53871

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 27, 2023

Conversation

jbrockmendel
Copy link
Member

Largely split off from #52419

@jbrockmendel jbrockmendel requested a review from rhshadrach as a code owner June 26, 2023 22:25
@mroeschke mroeschke added Refactor Internal refactoring of code Internals Related to non-user accessible pandas implementation labels Jun 27, 2023
@mroeschke mroeschke added this to the 2.1 milestone Jun 27, 2023
@mroeschke mroeschke merged commit 5788685 into pandas-dev:main Jun 27, 2023
@mroeschke
Copy link
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the ref-from_mgr-2 branch June 27, 2023 17:23
if self._constructor is Series:
# we are pandas.Series (or a subclass that doesn't override _constructor)
return self._from_mgr(mgr, axes=axes)
ser = self._from_mgr(mgr, axes=axes)
Copy link
Member

@jorisvandenbossche jorisvandenbossche Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel this changes the logic from what we discussed in #52132, and I am afraid this isn't exactly equivalent for subclasses implementing _constructor.

(before, we passed the Manager object to _constructor, while with this PR we pass a Series object)

Now, what was the rationale for changing this?
The PR title says "simplify _from_mgr constructors", but don't see how this is a simplification here? (it actually got longer, and added _name handling)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC i went to use the new constructor_from_mgr in more places and found that broke some tests, and the edits here were the best option I found to keep things working (in particular pinning ser._name)

I also very much like that in this version subclasses don't have to worry about handling Manager objects, just Series/DataFrame objects, which they should already be able to handle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why or when the pinning of ser._name is needed? (which case is it solving?)

In geopandas, it is a feature that we can do something different for blockmanager vs generic array-like (we use it to decide whether we want to try to infer objects being passed or not). It also adds some overhead to go through the constructor again with a dataframe/series.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why or when the pinning of ser._name is needed? (which case is it solving?)

My recollection is that without that, when I changed the usages to use constructor_from_mgr, a handful of tests broke with AttributeError bc _name was not defined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants