ENH: When using another plotting backend, minimize pre-processing #28647
@datapythonista does this seem reasonable to you?
I guess it could be perceived as related to those issues if you think the best fix would be to try to know all the options that can accept column names and then reduce the data to only those columns. That will never be sufficient for […]. I think the more essential issue is that pandas.plotting should do minimal data transforming before handing off the data to the plotting backend.
I'm happy with that. The way it is now is based on how the matplotlib backend was implemented when it was coupled. Besides being simpler on the pandas side, I see advantages in passing the whole dataframe, like backends being able to use other columns for hover and things like that. I think making the changes shouldn't be hard. @TomAugspurger what do you think?
I think the more essential issue is that pandas.plotting should do minimal data transforming before handing off the data to the plotting backend.
I agree with this goal.
Looking through the stuff that's skipped, ideally it could / would all be deleted. Any interest in working on that @jsignell?
Pushed a small test.
While I think that removing all this preprocessing before passing off to the backend for all kinds is valuable, I don't think we need to hold up this PR for that. I'll open up a followup issue.
Fair enough.
May be worth adding a TODO with the issue number, so we can identify in the code that this is a temporary solution.
Thanks @jsignell
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
I ran into this while implementing the hvplot backend. In hvplot you can do:
but with the pandas version
will fail because
data = data[y]
is called before the plotting is passed off to the backend. Basically it seems like backend writers should be free to get the passed pandas objects with as little interference as possible.
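
To make the failure mode concrete without depending on any particular backend, here is a toy model in plain Python. It is only a sketch: a dict of column lists stands in for the DataFrame, and `toy_backend_plot`, `dispatch_sliced`, `dispatch_whole`, and the `hover_cols` argument are hypothetical names chosen for illustration, not pandas or hvplot APIs.

```python
# Toy model only: a dict of column lists stands in for a DataFrame.
# None of the names below are pandas or hvplot APIs.

def toy_backend_plot(data, x=None, y=None, hover_cols=()):
    """Hypothetical backend entry point that may need columns besides y."""
    needed = [c for c in (x, y, *hover_cols) if c is not None]
    missing = [c for c in needed if c not in data]
    if missing:
        raise KeyError(f"columns not available to backend: {missing}")
    return {c: data[c] for c in needed}


def dispatch_sliced(data, x=None, y=None, **kwargs):
    """Mimics the pre-processing: reduce to the y column before handing off."""
    if y is not None:
        data = {y: data[y]}  # analogous to `data = data[y]`
    return toy_backend_plot(data, x=x, y=y, **kwargs)


def dispatch_whole(data, x=None, y=None, **kwargs):
    """Hands the untouched object to the backend."""
    return toy_backend_plot(data, x=x, y=y, **kwargs)


frame = {"t": [0, 1, 2], "value": [3.0, 4.5, 6.0], "label": ["a", "b", "c"]}

# Pre-slicing drops "t" and "label", so the backend cannot find them.
try:
    dispatch_sliced(frame, x="t", y="value", hover_cols=("label",))
    sliced_ok = True
except KeyError:
    sliced_ok = False

# Passing the whole object lets the backend pick whatever it needs.
whole = dispatch_whole(frame, x="t", y="value", hover_cols=("label",))
```

In this toy model the sliced dispatch fails as soon as the backend asks for anything other than the `y` column, while the whole-object dispatch leaves that decision to the backend.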