PLOT: Add option to specify the plotting backend #26753
Just realised, from @TomAugspurger's previous branch on the same work, that I can simply do
pandas/plotting/_core.py (outdated)
    'A pandas plotting backend must be a module that '
    'can be imported'.format(backend_str))

    required_objs = ['LinePlot', 'BarPlot', 'BarhPlot', 'HistPlot',
This seems like we’re being too opinionated about the backend’s implementation. IMO, we should provide a Series and DataFrame .plot accessor that dispatches to the right backend. We can provide an ABC for backends to subclass with the expected user API, but beyond that I don’t think we care.
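To illustrate the suggestion above, here is a minimal sketch of an ABC plus a dispatching accessor. All names (`PlottingBackend`, `EchoBackend`, `PlotAccessor`) are hypothetical and not part of pandas; the toy backend returns a string instead of drawing anything.

```python
from abc import ABC, abstractmethod


class PlottingBackend(ABC):
    """Hypothetical base class: the user-facing API a backend would expose."""

    @abstractmethod
    def line(self, data, **kwargs):
        """Draw a line plot of ``data``."""

    def bar(self, data, **kwargs):
        # Optional plot kind: the default raises a clear, informative error.
        raise NotImplementedError(
            "{} does not implement bar plots".format(type(self).__name__)
        )


class EchoBackend(PlottingBackend):
    """Toy backend that returns a description instead of drawing anything."""

    def line(self, data, **kwargs):
        return "line plot of {} points".format(len(data))


class PlotAccessor:
    """Hypothetical stand-in for the .plot accessor on Series/DataFrame."""

    def __init__(self, data, backend):
        self._data = data
        self._backend = backend

    def line(self, **kwargs):
        # Dispatch to whichever backend was configured.
        return self._backend.line(self._data, **kwargs)

    def bar(self, **kwargs):
        return self._backend.bar(self._data, **kwargs)


accessor = PlotAccessor([1, 2, 3], EchoBackend())
print(accessor.line())
```

With this shape, the accessor knows nothing about how a backend draws; a backend that skips an optional plot kind fails with a message naming the backend.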
I don't have a strong opinion about this at this point. I opened #26747 to have a discussion on what we expect from backends; this can surely be simplified.
At this stage I just tried to be conservative, and force the backends to raise NotImplementedError, or whatever they consider appropriate, if they don't implement something. My idea here is that if the user calls something like Series.plot() or any other functionality, we don't end up with a weird error that doesn't provide relevant information for the user.
But I'm happy to raise exceptions from our side for some of those when the backend is not the default one, if that's the preferred option.
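The conservative check discussed here could look roughly like the sketch below. The function name `load_backend`, the default `required` tuple, and the error wording are illustrative assumptions, not the checks pandas performs.

```python
import importlib


def load_backend(module_name, required=("LinePlot", "BarPlot")):
    """Import ``module_name`` and check that it exposes the ``required`` names.

    Hypothetical helper: the name, defaults, and messages are assumptions.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        raise ValueError(
            '"{}" does not seem to be an installed module. A plotting '
            "backend must be a module that can be imported".format(module_name)
        )
    # Collect every missing name so the user sees all problems at once.
    missing = [name for name in required if not hasattr(module, name)]
    if missing:
        raise ValueError(
            '"{}" is not a valid plotting backend, it is missing: {}'.format(
                module_name, ", ".join(missing)
            )
        )
    return module


# The standard-library json module is used only as a stand-in "backend".
backend = load_backend("json", required=("dumps", "loads"))
print(backend.__name__)
```

Failing at configuration time, with the module name in the message, avoids an unrelated error surfacing later when the user calls a plotting method.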
Codecov Report
@@            Coverage Diff             @@
##           master   #26753      +/-   ##
==========================================
- Coverage    91.7%   91.67%   -0.03%
==========================================
  Files         179      179
  Lines       50767    50784      +17
==========================================
+ Hits        46555    46558       +3
- Misses       4212     4226      +14
Codecov Report
@@            Coverage Diff             @@
##           master   #26753      +/-   ##
==========================================
- Coverage   91.98%   91.97%   -0.01%
==========================================
  Files         180      180
  Lines       50760    50772      +12
==========================================
+ Hits        46690    46700      +10
- Misses       4070     4072       +2
@jreback this should be ready now, let me know if there is any other change needed.
doc/source/whatsnew/v0.25.0.rst (outdated)
@@ -132,6 +132,7 @@ Other Enhancements
- :class:`DatetimeIndex` and :class:`TimedeltaIndex` now have a ``mean`` method (:issue:`24757`)
- :meth:`DataFrame.describe` now formats integer percentiles without decimal point (:issue:`26660`)
- Added support for reading SPSS .sav files using :func:`read_spss` (:issue:`26537`)
- Added new option ``plotting.backend`` to be able to select a plotting backend different that the existing ``matplotlib`` one. Use ``pandas.set_option('plotting.backend', '<backend-module>')`` where ``<backend-module`` is a library implementing the pandas plotting API (:issue:`14130`)
"that" -> "than"
We don't have any alternative engines to list here yet, right?
Not at the moment, but that's a good point. Next week, once this is merged, I'm planning to work with a few people to adapt hvplot. That way we can see that everything is working well, and fix anything before 0.25. It may make sense to update this and use hvplot as an example when it's ready.
Great. I think this should be a prominent new feature if we are able to get either or both of hvplot and pdvega ready to use it in time for the release.
@@ -460,6 +462,40 @@ def use_inf_as_na_cb(key):
# Plotting
# ---------

plotting_backend_doc = """
One question: if I wanted to use the Altair backend, I would be more likely to use .set_option('plotting.backend', 'altair') than ..., 'pdvega'). @jakevdp what name would you prefer?
I think hvplot will just be hvplot, so that's fine.
Anyway, we might consider adding a dict here like plotting_backend_alias that maps the user-facing name like altair to the backend name like pdvega. When backend libraries register themselves, they can also register their aliases.
I see your point, and I think it'd add value for users, but I'm not sure I'm in favor of adding the extra complexity needed to manage aliases in a dynamic way.
I like the simplicity of the parameter being the name of the module. I guess in some cases it will look nicer than in others. Maybe hvplot will use hvplot.pandas, since hvplot contains other things besides our plugin, and the module to use may be hvplot.pandas.
In practice I guess backends will register themselves, and users will rarely switch backends manually. But I guess if they do, it'll be better if they know they need to use the name of the module:
import pandas
import hvplot.pandas
df = pandas.DataFrame({'a': [1, 2, 3]})  # any DataFrame works here
df.plot()
pandas.set_option('plotting.backend', 'matplotlib')
df.plot()
pandas.set_option('plotting.backend', 'hvplot.pandas')
df.plot()
I don't have a strong opinion, but I'd say let's start with the simplest option, and add aliases or something else if we think it's useful once we start using this.
Is it especially complex? I was thinking something like
_plotting_aliases = {}  # or define somewhere else

def register_plotting_backend_cb(key):
    backend_str = cf.get_option(key)
    backend_str = _plotting_aliases.get(backend_str, backend_str)
    ...
Indeed, I think this simplifies things already, since we can use 'matplotlib' as an alias for pandas.plotting._matplotlib. Though we may continue to special-case matplotlib to provide a nice error message.
What I wouldn't do is to have the aliases in pandas itself. Maybe I'm being too strict, but it feels wrong.
But you're right, it's probably not as complex as I was thinking anyway. A simple option plotting.aliases with a dictionary may not be ideal, but would allow backends to create an alias by simply:
pandas.set_option('plotting.aliases', dict(pandas.get_option('plotting.aliases'), hvplot='hvplot.pandas'))
but better in a follow up PR I think, so we can focus there on the exact syntax and approach
Perfectly fine doing as a followup.
And my thinking may have been a bit muddled here. I was thinking that the backend library would have already been imported, and so would have a chance to register their own aliases. But as you say, it would be pandas managing them, which doesn't feel quite right.
Another simple option is that backends add an optional attribute alias = 'hvplot', and we simply do:
if hasattr(backend_mod, 'alias'):
    plotting_aliases[backend_mod.alias] = backend_mod.__name__
Yes good idea. But still leaving this as a followup?
Yes, I prefer to keep the focus, the smaller the PRs, the better the content :)
@datapythonista amazing this didn't need to be rebased, but can you merge master so the tests run again after @TomAugspurger's convert patch? Merge on green.
git diff upstream/master -u -- "*.py" | flake8 --diff
Adding option to specify the plotting backend. This does not change the plotting API, so for now backends are expected to implement the existing methods in the matplotlib API. This API is expected to change as a result of the discussions in #26747.