VIS/ENH Hexbin plot #5478
looks interesting |
Thoughts on a default color palette? Seaborn seems to use This may be helpful. @olgabot seems to use different cmaps for different data values. That may be a bit too much for pandas. EDIT: oh I also think this should spend some time incubating on master. Once the RC branch is forked, I'll add some release notes for .14; |
Oh, also, I hate the matplotlib argument names for the value ( A brief overview: by default ( If you specify What's the policy on respecting other libraries' function arguments? Other pandas plotting functions seem to use If we're willing to change, I'd suggest renaming Here's an example to play with if that helps:
df = DataFrame({"A": np.random.uniform(size=20),
                "B": np.random.uniform(size=20),
                "C": np.arange(20) + np.random.uniform(size=20)})
ax = df.plot(kind='hexbin', x='A', y='B', gridsize=10)  # histogram by default
ax = df.plot(kind='hexbin', x='A', y='B', C='C', reduce_C_function=np.max)
So in the second case the color of a bin will be determined by the maximum value of |
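For anyone who wants to poke at the two modes, here is a self-contained sketch of the example above. The fixed seed, the Agg backend, and the variable names (`counts`, `bin_max`) are my own additions for illustration and are not part of the PR. Reading the per-hexagon values back from `ax.collections[0]` assumes the hexbin collection is the first one on the axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed: an assumption, only for reproducibility
df = pd.DataFrame({
    "A": rng.uniform(size=20),
    "B": rng.uniform(size=20),
    "C": np.arange(20) + rng.uniform(size=20),
})

# Without C, each hexagon is colored by how many points fall inside it.
ax_counts = df.plot(kind="hexbin", x="A", y="B", gridsize=10)
counts = np.asarray(ax_counts.collections[0].get_array())

# With C, each hexagon is colored by reduce_C_function applied to the
# C values of the points that fall inside it (here: their maximum).
ax_max = df.plot(kind="hexbin", x="A", y="B", C="C",
                 reduce_C_function=np.max, gridsize=10)
bin_max = np.asarray(ax_max.collections[0].get_array())
```

Since every point lands in some hexagon, the largest per-bin maximum should match the largest value in column `C`, which is a quick sanity check on the second call.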
While cubehelix solves the saturation problem, it doesn't solve the issue that the rainbow colormap is harmful and does not help with interpretation of the heatmap. The linked paper gives great examples how the changing colors introduce patterns in the data that don't actually exist and can lead to over-interpretation of color changes. I don't think using different colormaps for different data ranges is that complicated, in I'm still working on my As for renaming, I agree that the |
I set the matplotlib default colormap to cubehelix in seaborn because I wanted to banish jet and that seemed like a reasonably non-crappy alternative that is at least somewhat adequate for most data. I don't think the hue shifts are as big a problem in cubehelix as in rainbow maps and jet because they're pretty gradual and accompanied by a shift in lightness/saturation. But in general I agree that there's no good one-size-fits-all solution to colormaps, and it's better to adapt to the data. For the corrplot function the default colormap is |
Thanks for the feedback! I'll think on this, but adapting to the data would be complicated by the business with The default behavior is to just do counts, so things will always be positive in that case. It might make sense just to pick the default with that case in mind, and assume that people doing fancier things can choose an appropriate color palette. |
@olgabot |
Ah, yeah, if the data are counts then it probably makes sense to use a sequential map; any of the colorbrewer ones should work. |
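A small sketch of what passing a sequential map for count data could look like. `"Blues"` is just one ColorBrewer-derived sequential colormap that ships with matplotlib and is my pick for illustration, not necessarily the default chosen in this PR. The seed, sample size, and `cmap_name` variable are likewise assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed: an assumption, only for reproducibility
df = pd.DataFrame({"A": rng.normal(size=500), "B": rng.normal(size=500)})

# Counts are non-negative, so a light-to-dark sequential map reads naturally:
# denser hexagons simply get darker.
ax = df.plot(kind="hexbin", x="A", y="B", gridsize=15, cmap="Blues")
cmap_name = ax.collections[0].get_cmap().name
```

Anyone using `C` with values that cross zero would probably want to pass a diverging map instead, which fits the "people doing fancier things can choose" approach discussed above.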
I think this is ready to go. Thanks for the suggestions on the colormaps. I'm going with One thing I'm a tad concerned about is the docstring on df.plot. I threw in a Let me know when you're ready and I'll rebase and squash. |
@TomAugspurger you want to throw in 0.13? pls change the release notes, rebase, and squash... |
if you think the API might change, then just label it experimental (if you want) |
does this have an associated issue? |
No issue associated. I just put the PR number in the release notes. I'm pretty confident the API won't need adjusting, but I guess labeling it experimental is the safe thing to do. Ready when you and Travis say it's good! |
This looks nice. @TomAugspurger, do you know if hexbin plots are available via yhat/ggplot? |
looks ready to merge...any objections? @TomAugspurger @y-p @jtratner ? |
Merging new features between RC and final? |
seems non-invasive to me |
I'm not in a rush to get this in if that's a problem. |
@TomAugspurger push this to 0.14, @y-p? |
Fine by me. Once .13 is out I'll make the doc changes and ping you when I get that done (could be a little while). |
Yup, and to stay on track we should be more aggressive about tagging 0.13 final when it's time |
@TomAugspurger, so is this plot available out of the box with seaborn which is built on top of pandas? |
Not yet: |
I resisted other PRs for more sophisticated plots but this is closer to home, should be ok. |
I've been a tad worried about inconsistency in what vis PRs are accepted as well, so I haven't wanted to push to get this in. That said I think this should go in since I see hexbin plots as a drop-in replacement for scatter plots when you have a bunch of points. Plus Wes tweeted about it, and we can't make him a liar... |
hahah.! ok.... pls rebase and put in release notes (0.13.1)..thxs |
@@ -93,6 +93,8 @@ Experimental Features
- Added PySide support for the qtpandas DataFrameModel and DataFrameWidget.
- Added :mod:`pandas.io.gbq` for reading from (and writing to) Google
BigQuery into a DataFrame. (:issue:`4140`)
- Hexagonal bin plots from ``DataFrame.plot`` with ``kind='hexbin'``
moved to 0.13.1 section
Can you move this to 0.14.0 section?
@@ -623,6 +623,7 @@ Enhancements
output datetime objects should be formatted. Datetimes encountered in the
index, columns, and values will all have this formatting applied. (:issue:`4313`)
- ``DataFrame.plot`` will scatter plot x versus y by passing ``kind='scatter'`` (:issue:`2215`)
- Hexagonal bin plots from ``DataFrame.plot`` with ``kind='hexbin'`` (:issue:`5478`)
move to v0.13.1.txt
maybe use the docs example here (it's kind of cool) (and possibly put a link to the doc section as well)
I still think 0.14.0... don't make me out to be a softie. |
I threw everything in .14, so this should be good to sit until we start on that. I'm having trouble building the docs right now (on master too, not just this branch). I'll try to track down what's wrong with my environment and then I'll let you know that everything builds correctly. |
ping @jreback if you're ready to merge. |
@TomAugspurger trivial release notes change..then good 2 go |
@jreback Fixed that. I removed the experimental tag too. |
gr8 thanks @TomAugspurger ! |
This is just 10 minutes of copy-paste cargo-culting to gauge interest, I haven't tested anything yet.
It's not terribly difficult to do this on your own, so my feelings wouldn't be hurt at all if people are -1 on this :)
EDIT: oops I branched from my to_frame branch. I'll clean up the commits.