-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Subclassed reshape #15564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Subclassed reshape #15564
Conversation
Codecov Report
@@ Coverage Diff @@
## master #15564 +/- ##
==========================================
- Coverage 91.06% 91.04% -0.02%
==========================================
Files 137 136 -1
Lines 49307 49171 -136
==========================================
- Hits 44899 44769 -130
+ Misses 4408 4402 -6
Continue to review full report at Codecov.
|
can you also have a look over: http://pandas-docs.github.io/pandas-docs-travis/internals.html#subclassing-pandas-data-structures and see if anything needs amendment |
doc/source/whatsnew/v0.9.1.txt
Outdated
@@ -105,6 +105,8 @@ New features | |||
- DataFrame.drop now supports non-unique indexes (:issue:`2101`) | |||
- Panel.shift now supports negative periods (:issue:`2164`) | |||
- DataFrame now support unary ~ operator (:issue:`2110`) | |||
- `stack`, `unstack`, and `pivot` operations now preserve subclass family |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.20.0
@@ -410,13 +413,24 @@ def unstack(obj, level, fill_value=None): | |||
|
|||
if isinstance(obj, DataFrame): | |||
if isinstance(obj.index, MultiIndex): | |||
return _unstack_frame(obj, level, fill_value=fill_value) | |||
unstacked = _unstack_frame(obj, level, fill_value=fill_value) | |||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I think this should be done inside the .stack
and _unstack_frame
functions no?
else: | ||
unstacker = _Unstacker(obj.values, obj.index, level=level, | ||
fill_value=fill_value) | ||
return unstacker.get_result() | ||
unstacked = unstacker.get_result() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, this now has logic in 2 places, the creation itself of the unstacked result, then the return class. should be a part of get_result
(it certainly could be another method that is called by get_result
) though
@@ -515,7 +529,7 @@ def factorize(index): | |||
mask = notnull(new_values) | |||
new_values = new_values[mask] | |||
new_index = new_index[mask] | |||
return Series(new_values, index=new_index) | |||
return frame._constructor_sliced(new_values, index=new_index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this is good
pandas/tests/frame/test_subclass.py
Outdated
@@ -125,6 +125,144 @@ def test_indexing_sliced(self): | |||
tm.assert_series_equal(res, exp) | |||
tm.assertIsInstance(res, tm.SubclassedSeries) | |||
|
|||
def test_subclass_stack(self): | |||
df = tm.SubclassedDataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the issue number as a comment
thanks @delgadom reshape could certainly use some tlc! |
Thanks @jreback - I'm working on your suggestions. Keeping this clean is a bit tough, especially because unstacking is sometimes handled by To me this class seems really nasty and should be handled by functions rather than a class (would you ever use Happy to take a stab if you agree. |
@delgadom the reshaping code is certainly in need of refactoring to make things easier :> certainly take a stab at it. Yes the You might want to do a refactor before this PR to make it easier to test / debug. Though if its relatively simple go ahead here. |
Yeah I think I'll refactor and then create a new PR. Thanks for the help! |
ok! |
git diff upstream/master | flake8 --diff
Reshape operations in pandas/core/reshape.py modified to use
obj._constructor
,obj._constructor_sliced
, andobj._constructor_expanddim
for Series and DataFrame. This preserves a subclass family onstack
,unstack
, andpivot
operations.