-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix Panel.fillna() ignoring axis parameter #8395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@0zeroth please take a look if you can |
* 0: fill column-by-column | ||
* 1: fill row-by-row | ||
method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None | ||
Method to use for filling holes in reindexed Series |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't change arguments orders
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't change the order in the function signature. I just noticed the docstring was not ordered the same way as the signature, so I made them consistent. I can change them back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no you are right. this is fine
default filling should be along rows for frames (axis=0), this is the stat_axis it just needs to be documented |
# need to downcast here because of all of the transposes | ||
result._data = result._data.downcast() | ||
|
||
return result | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you taking this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't take it out. I moved it after the check for if it's a 3d object. It doesn't make sense to run this check on a Panel
and it can cause errors if you happen to have a mixed type Panel
that you're filling along axis 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, this is a bit tricky I think. I suppose its possible to fill a mixed-type Panel in some cases. But if you see below, I think we should use Panel.apply
(which can then call DataFrame.fillna) to fix/solve this problem, so this restrictino needs to remain in place. I suspect some more tests will find if what I am saying is true (or I am FOS). :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should still be functional for panels with mixed_type data. The way panel filling works is the panels are broken down into their constituent dataframes and then those are filled quasi-recursively. When the mixed-type dataframes come around the check will be performed and they'll be handled appropriately.
I will definitely add some tests to make sure this is the case.
I've been trying to implement this with
Using |
you need to specifiy multiple axes in order to get a frame. |
Oh, I didn't mention my attempts with passing dataframes, which was actually my original idea. I played around with passing dual axes as you suggested, but the axes get transposed. That leaves you having to manually transpose the axes again, which is what we were trying to avoid.
However, now that I reconsider this, it guess it might be preferable to use |
I implemented the |
@@ -2288,11 +2281,32 @@ def fillna(self, value=None, method=None, axis=0, inplace=False, | |||
|
|||
# 3d | |||
elif self.ndim == 3: | |||
if axis == 0: | |||
new_axis = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use something like this
axes = list(range(self.ndim))
axes.remove(axis)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow. The apply_axes
need to contain the axis which is being filled.
Realized that this method was almost trivial to extend to Panel4D, so I implemented that (along with tests). Unless I'm dopey, which is totally possible, I don't think your suggestion of using |
Also added some documentation |
) | ||
|
||
# 3d or 4d | ||
if self.ndim >= 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to fix this to avoid doing all these ndim specific things. (and then transposing and such)
This works... mostly. Unfortunately it fails for categoricals since they don't have I would also point out that I needed to invoke a |
I implemented a method for restoring categorical data. Not sure if it's robust enough to be widely applicable, but it does work for this use case and the categorical tests pass. |
will have to look after 0.15.0 is released (end of the week). |
If you want to take a look, I have time to work on this PR this week. |
@@ -324,6 +324,8 @@ API changes | |||
- Merge errors will now be sub-classes of ``ValueError`` rather than raw ``Exception`` (:issue:`8501`) | |||
- ``DataFrame.plot`` and ``Series.plot`` keywords are now have consistent orders (:issue:`8037`) | |||
|
|||
- ``Panel.fillna()`` now defaults to the stat axis (axis 1) when filling with a method. Previously the default axis was 0. Similarly, ``Panel.ffill()`` and ``Panel.bfill()`` default to axis 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these need to move to v0.15.1.txt
@stahlous can you rebase / revisit? |
pls rebase/update. closing, but reopen if you want to work on. |
Revisiting this after an extended hiatus. Do I need to submit another pull request, or can this one be re-opened? |
no, you can just push to this one |
I pushed to the Also, I'm having an odd problem with Travis. Anywhere that
It's like |
you need to rebase on master |
This fixes #8251. It may need some more comprehensive tests but before I go too much farther I want to make sure the axis numbering scheme makes sense . The default behavior of
fillna
is to fill alongaxis=0
. In the case of a DataFrame this fills along the columns. However, in the case of a Panel, filling alongaxis=0
means you're filling along items. I'm not sure if that's the behavior that most users would think is the default. You might note that I also removed the references to what the behavior of each axis value is in the docstring since it gets a bit arbitrary when you consider both DataFrames and Panels.