-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: interaction between subclasses (implementing __finalize__) and attrs/flags handling in __finalize__ #37099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hmm, for some reason I thought we couldn't do a "simple" A patch to GeoPandas like --- a/geopandas/geodataframe.py
+++ b/geopandas/geodataframe.py
@@ -894,6 +894,8 @@ class GeoDataFrame(GeoPandasBase, DataFrame):
def __finalize__(self, other, method=None, **kwargs):
"""propagate metadata from other to self """
+ self = super().__finalize__(other, method=method, **kwargs)
+
# merge operation: using metadata of the left object
if method == "merge":
for name in self._metadata:
@@ -902,9 +904,6 @@ class GeoDataFrame(GeoPandasBase, DataFrame):
elif method == "concat":
for name in self._metadata:
object.__setattr__(self, name, getattr(other.objs[0], name, None))
- else:
- for name in self._metadata:
- object.__setattr__(self, name, getattr(other, name, None))
return self seems to still pass all our tests, and enables preserving attrs:
(so until there all my effort to clearly describe the issue .. ;-)) |
Hehe, yeah that was my initial reaction. Sometimes it takes writing up an issue to clarify the solution though. As long as the |
Yes, that's probably the tricky part long term. |
@jorisvandenbossche @TomAugspurger - can this be closed? |
I'm OK with closing for now. We could potentially split |
Pandas provides the mechanism for subclasses to customize how metadata propagation is handled through
_metadata
and__finalize__
. However, this got more complicated with the introduction ofattrs
andflags
, which are also handled in__finalize__
.Concrete example: GeoPandas implements a custom
GeoDataFrame.__finalize__
to be able to have custom logic about how metadata is propagated in methods.See here for the current implementation: https://github.com/geopandas/geopandas/blob/7044aa479dbae72c14dfb647a5a782a01cff81d1/geopandas/geodataframe.py#L1148-L1162 (it specifically checks for
method="merge"|"concat"
to ensure we pass through_metadata
of the first (concat) or left (merge) object).For a long time, pandas only add a very simple
__finalize__
implementation (which just propagated_metadata
unconditionally):However, with the recent changes, the default
__finalize__
now also does (code link to current finalize impl):attrs
flags.allows_duplicate_labels
and have custom logic formethod="concat"
for thisAs a result of GeoPandas overriding
__finalize__
, theattrs
andflags
features currently don't work for GeoDataFrames. See geopandas/geopandas#1654 for the bug report on the GeoPandas side (cc @martinfleis)While we certainly want to enable those features for GeoDataFrames as well.
So the main question is: given this additional logic in the default
__finalize__
, how can we enable that subclasses can still override a part of that logic (eg only_metadata
).GeoPandas could in principle copy the implementation of pandas (and adds its own customizations to it) and keep up to date with it. But, that doesn't feel like the ideal solution, certainly if pandas is going to add more functionality to it (eg more flags).
cc @TomAugspurger
The text was updated successfully, but these errors were encountered: