-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: add a gradient map to background gradient #39930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conceptually i guess is ok, but this function is getting now a lot of arguments, If gmap does not interact with the other arguments, then i thin a separate function would be better here.
For working with color gradients it is quite common to have a gradient map as an input, for example if you have text, or non-quantitative data. I think a second function would repeat a lot of code and a lot of documentation, since you would still want, Edit: I was also, later, going to propose a |
…nt_gmap # Conflicts: # pandas/tests/io/formats/style/test_style.py
pandas/io/formats/style.py
Outdated
if gmap is None: | ||
gmap = s.to_numpy(dtype=float) | ||
else: | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, shouldn't this just be a DataFrame that is aligned? why go to all of this complication (and its fragile)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all cases the gmap
has to match the data shape s
if it doesn't it would just raise downstream error. This way, imo the code is simple and concise and provides both flexibility of input and helpful user error feedback.
in the case axis
in [0,1] then s
, the data, is guaranteed a 1d-series, and you can supply, as gmap
either 1d or 2d inputs provided the second dim has length 1 so it can be reshaped:
df = DataFrame([[1,2],[3,4]])
...background_gradient(gmap=[2,4], axis=0) # 1d list
...background_gradient(gmap=df[1], axis=0) # series
...background_gradient(gmap=df[[1]], axis=0) # dataframe
...background_gradient(gmap=[[2],[4]], axis=0) # 2d list
...background_gradient(gmap=np.array([[2],[4]]), axis=0) # 2d numpy
...background_gradient(gmap=np.array([2,4]), axis=0) # 1d numpy ALL WORK
in the case axis
is None and s
is a DataFrame with 2 cols or more then you need a 2d input (or in edge-case a ravelled 1d input):
....background_gradient(gmap=[[2,2],[4,4]], axis=None) # 2d list
...background_gradient(gmap=np.array([[2,2],[4,4]]), axis=None) # 2d numpy
...background_gradient(gmap=df, axis=None) # dataframe
...background_gradient(gmap=[2,2,4,4], axis=None) # 1d edge case.
In the case axis
is None and the the data is a 1-col dataframe any of the 1d cases will again work, e.g
.background_gradient(gmap=[2,4], subset=[0], axis=None)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to for sure align if a dataframe is passed.
ok on 2D. why is accepting 1D ok ? this is just too flexible. I dont' see any advantage to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok how's this: I've split the 1d and 2d cases still compare shape at end with error if they don't match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you are missing my point. what does a 2D gradient map mean / do? I don't see any tests that actually use this nor what its supposed to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are tests now for 2D arrays and DataFrames, for minimalist correct functionality, but not to demonstrate purpose.
the purpose of a 2D map is generally improved visualisation of 2D tabular data. If you just shade the background of a dataframe according to data that you can already read in the cell it just looks nice but doesn't convey information. But if you can, optionally, shade a background based on an alternate 2D map (axis=None
) then you are providing a visual cross section of information analysis, like my examples given below, and say the performance grid here You can't achieve this with only the 1D (axis=0,1
) functionality.
You could technically do heat maps currently but you would have to hide the cell data with CSS. This PR offers an easier solution for something like Look at 9. Heat Map (no text in cell). And you can only shade string cells with a provided quantitative map: something like doing this
I'm interested in this feature (I've struggled with stylers having to apply to the data in the df, not some transform of it). I'm not convinced by the name I think what you mean by the parameter is "surrogate data to use as the gradient function"; indeed, the underlying data is forgotten when a gmap is used. I'd most likely call this parameter I also note that a future iteration might allow |
I understand it to be the map used to render the That said, I'm not tied to it. If @jreback agrees (or anyone else) I will happily change the name, to a new suggestion.
I agree with you on this. I hadn't thought of this or of a use case yet, and it would just complicate the PR so best to leave out initially I think. |
…nt_gmap # Conflicts: # doc/source/whatsnew/v1.3.0.rst
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really really try to reduce complexity
pandas/io/formats/style.py
Outdated
subset = self.data.select_dtypes(include=np.number).columns | ||
|
||
self.apply( | ||
self._background_gradient, | ||
partial(self._background_gradient, axis=axis), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really non-obvious why this is needed and is adding complexity. what is the reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using axis
to determine if apply-generated-object was a DataFrame or Series, I just tested directly and have removed the axis argument, and therefore the partial.
pandas/io/formats/style.py
Outdated
@@ -1512,26 +1585,24 @@ def _background_gradient( | |||
text_color_threshold: float = 0.408, | |||
vmin: Optional[float] = None, | |||
vmax: Optional[float] = None, | |||
gmap: Optional[Union[Sequence, np.ndarray, FrameOrSeries]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this need not be a staticmethod which is adding all kind of complexity here, make this a method as you now need to pass axis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was a static method before I made modifications. It does not need to be a static method so have taken it module level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok can you merge master, let's try to do this before the refactoring ones
…nt_gmap # Conflicts: # pandas/io/formats/style.py
…nt_gmap # Conflicts: # pandas/io/formats/style.py
@jreback pinging have merged master. this is greenish - test failure of ipython directive is unrelated. |
@jreback finally got full green on this, and have merged all the recent typing changes in. hopefully we can finalise this and then the refactor |
thanks @attack68 very nice! |
closes #22727
closes #28901
A very minor extension to the code and adding one argument gives the function a lot more scope: