API: cut interval formatting #8595
Comments
This is going to be turned into a Categorical, so ordering will happen automatically. Interested in doing this for 0.15.1?
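For anyone following along, here is a minimal sketch of what "ordering will happen automatically" means in practice. The sample values and bin edges are made up for illustration; the point is only that `pd.cut` returns an ordered categorical whose categories follow bin order, whereas the same labels treated as plain strings sort lexically.

```python
import pandas as pd

# Made-up sample data and bin edges, for illustration only.
values = pd.Series([1, 5, 12, 25, 150])
binned = pd.cut(values, bins=[0, 2, 10, 20, 200])

# pd.cut returns an ordered categorical: categories are kept in bin order.
print(binned.cat.ordered)

# Interval labels in bin order versus the same labels sorted as plain strings.
labels_in_bin_order = [str(interval) for interval in binned.cat.categories]
labels_lexical = sorted(labels_in_bin_order)
print(labels_in_bin_order)
print(labels_lexical)

# Sorting the categorical itself follows bin order, not string order.
sorted_codes = list(binned.sort_values().cat.codes)
print(sorted_codes)
```

The lexical sort places the `(10, 20]` bin ahead of the `(2, 10]` bin, which is the misordering the issue describes; sorting the categorical does not have that problem.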
@fancychildren can you post a full example (that doesn't order correctly)?
Can you do a programmatic example, e.g.
I'd offer one potential direction to go with this. I think it would be great to find a way to allow users to specify the formatting of the bin labels returned by One approach would be to allow Here's the situation I've got in mind:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(0)
df = pd.DataFrame(np.random.randn(1000, 2) * 3, columns=list('xy'))
df['y'] = (df['x'] * 3) + 5 + np.random.randn(1000) * 2
df['x_bin'], edges = pd.cut(df['x'], bins=5, retbins=True)
nice_names = ['{b:0.1f} : {t:0.1f}'.format(b=edges[i], t=edges[i+1]) for i in range(len(edges)-1)]
df['x_bin_better'] = pd.cut(df['x'], bins=5, labels=nice_names)
fig, axes = plt.subplots(3, 1, figsize=(10, 10))
axes[0].plot(df['x'], df['y'], linestyle='None', marker='o', mew=1, alpha=0.25)
sns.violinplot(x='x_bin', y='y', data=df, ax=axes[1], scale='count', palette='Blues')
sns.violinplot(x='x_bin_better', y='y', data=df, ax=axes[2], scale='count', palette='Reds')
fig.tight_layout()

To be clear, I'm not suggesting the format implemented above is better (I think the default is quite reasonable). Instead I'm suggesting we make it easier for users to override that format when appropriate.
Would you be open to accepting a pull request for adding the ability for
Sure, I would have a 2-arg callable returning a string.
Yeah, I like the idea of accepting a function that acts on scalars for the left and right bounds. Eventually we could add this option as a method on IntervalIndex.
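To make the proposal concrete, here is a rough sketch of how a 2-arg callable could be applied today with the existing `retbins` and `labels` arguments. `cut_with_formatter` and its `formatter` parameter are hypothetical names invented for this illustration; they are not part of pandas, and this is not necessarily how a real implementation would look.

```python
import pandas as pd


def cut_with_formatter(x, bins, formatter):
    """Hypothetical helper (not a pandas API).

    Bins x with pd.cut, then labels each bin by calling
    formatter(left, right) on the scalar bin edges.
    """
    # First pass: only used to obtain the bin edges.
    _, edges = pd.cut(x, bins=bins, retbins=True)
    labels = [formatter(left, right) for left, right in zip(edges[:-1], edges[1:])]
    # Second pass: reuse the same edges so labels line up with the bins.
    return pd.cut(x, bins=edges, labels=labels)


# Made-up data; the formatter mirrors the '{b:0.1f} : {t:0.1f}' style used earlier.
x = pd.Series([0.0, 2.5, 5.0, 7.5, 10.0])
binned = cut_with_formatter(
    x,
    bins=[-1.0, 5.0, 10.0],
    formatter=lambda left, right: f"{left:0.1f} : {right:0.1f}",
)
print(list(binned))
print(list(binned.cat.categories))
```

One caveat with this approach: the labels must come out unique, so a formatter that rounds too aggressively can fail on narrow bins.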
Now that
It would be nice to have a number in front of every label: prefix the labels with numbers like 00, 01, 02 so that they sort in the appropriate order.
Lib\site-packages\pandas\tools\tile.py
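As a possible workaround with the existing `labels` argument, the numbered prefixes can be built by hand. This is only a sketch with made-up values and bin edges, and the `"00 (0, 2]"` label layout is an assumption about the format being requested.

```python
import pandas as pd

# Made-up sample data and bin edges, for illustration only.
values = pd.Series([3, 8, 15, 40, 120])
edges = [0, 5, 10, 50, 500]

# Zero-padded bin number first, then the interval text, e.g. "00 (0, 5]".
labels = [
    f"{i:02d} ({left}, {right}]"
    for i, (left, right) in enumerate(zip(edges[:-1], edges[1:]))
]
binned = pd.cut(values, bins=edges, labels=labels)

print(labels)
print(list(binned))

# With the numeric prefix, plain string sorting now matches bin order.
print(sorted(labels) == labels)
```

The two-digit padding only keeps string order correct for up to 100 bins; a wider pad would be needed beyond that.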