-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
QST: Subclassing and concat #35415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Generally speaking, we're happy to accept PRs for subclassing things as long as they don't affect the typical user's experience. Switching things like
Sure. We have a page dedicated to subclassing.
IMO, no. Most users wouldn't ever need it so I wouldn't want it in the signature.
I'm not familiar with the issue. |
I looks like it still does not work when |
I'd like to clarify the behavior in my original post for this issue with a code example: This is intended to show that either the 'left' value or the first value (in the case of concat) is respected in the case of subclassing. In my opinion the better design decision would be to make these operations commutative. This could be accomplished by only returning a subclass when the objects are homogeneous in type and a pandas object should be returned in all other cases. No matter what decision is made here I believe this should be documented in the subclassing documentation. import pandas as pd
class SubclassedSeries(pd.Series):
@property
def _constructor(self):
return SubclassedSeries
@property
def _constructor_expanddim(self):
return SubclassedDataFrame
class SubclassedDataFrame(pd.DataFrame):
@property
def _constructor(self):
return SubclassedDataFrame
@property
def _constructor_sliced(self):
return SubclassedSeries
subclassed_series = SubclassedSeries(range(10,20), name='test')
s = pd.Series(range(10,20), name='test')
# pandas.merge respects lef class type
t = pd.merge(subclassed_series, s)
assert isinstance(t, SubclassedDataFrame)
t = pd.merge(s, subclassed_series)
assert not isinstance(t, SubclassedDataFrame)
# pandas concat behavior
t = pd.concat([subclassed_series, s])
assert isinstance(t, SubclassedSeries)
t = pd.concat([s,subclassed_series])
assert not isinstance(t, SubclassedSeries)
t = pd.concat([subclassed_series, s], axis=1)
assert isinstance(t, SubclassedDataFrame)
t = pd.concat([s,subclassed_series], axis=1)
assert not isinstance(t, SubclassedDataFrame) |
I was looking into
df.astype
not respecting subclassed objects and noticed the issue had been resolved by modifyingpd.concat
to respect subclassed objects using the following code:Line 464 PR #29627
Line 477 PR #33884 (Set to be released on 1.1.0)
This assumes homogeneous object types, or at the very least that the first object is representative. My assumption was that functions on the pandas namespace did not respect subclassing and only methods on the objects did. Leaving developers to implement their own special use cases for subclassing. This seemed like a good delimitation to me, both logically and programmatically. But, now diving deeper into functions I don't regularly use I see I was mistaken as
merge
andmerge_ordered
both respect theleft
arg's subclassing.With this in mind, I have some questions.
concat
,merge
,merge_ordered
etc handle subclassing.constructor
kwarg added to functions likeread_sql
andread_csv
?to_hdf
not working with subclassing need to be fixed?The text was updated successfully, but these errors were encountered: