BUG: Fix scatter plot colors in groupby context to match line plot behavior (#59846) #61233

Open
wants to merge 3 commits into
base: main
from

Conversation

myenugula
Contributor

def f(self):
    return self.plot(*args, **kwargs)
# Special case for scatter plots to enable automatic colors in groupby context
if kwargs.get("kind") == "scatter":
Member

From the issue report, this plot works with kind="line". Why do we need all this logic for scatter when it just works for line?

Contributor Author

You are right. I was treating the issue as an edge case and patching around it, instead of fixing the code that actually causes the color inconsistency between ScatterPlot and LinePlot.

The source of the issue is that matplotlib does the actual plotting for both line and scatter plots, and matplotlib has its own color cycling when no color is specified. Here is an example:

import matplotlib.pyplot as plt

plt.plot([1, 2, 3], [1, 2, 3])
plt.plot([4, 5, 6], [4, 5, 6])
plt.plot([7, 8, 9], [7, 8, 9])
# This results in 3 different colors, one for each of the lines above

Similarly, for scatter:

plt.scatter(0, 0)
plt.scatter(1, 1)
plt.scatter(2, 2)
# This results in having 3 different colors, one for each of the points above

However, ScatterPlot and LinePlot behave differently because LinePlot doesn't pass any color in kwds to ax.plot when calling
df.groupby("layer").plot(x='x', y='y', ax=plt.gca(), kind='line')
whereas ScatterPlot explicitly sets the c= argument of ax.scatter when calling
df.groupby("layer").plot(x='x', y='y', ax=plt.gca(), kind='scatter')

The fix at the source is to set c_values = None when no color is passed to ScatterPlot, that is, when both self.c and self.color are None. Matplotlib's own color cycling then applies, as it already does for line plots.
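To make the reasoning concrete, here is a rough, library-free sketch of the idea. It is not the pandas implementation: `FakeAxes` and `resolve_c_values` are hypothetical names invented for illustration, the color labels are placeholders, and the "fixed default" branch is only an assumption about what explicitly passing c= amounts to. The model axes takes the next color from a cycle only when it receives c=None, which is why grouped scatter calls can only get distinct colors if the plotting layer leaves c unset.

```python
from itertools import cycle


class FakeAxes:
    """Hypothetical stand-in for an axes object that owns a color cycle."""

    def __init__(self, colors=("C0", "C1", "C2")):
        self._cycle = cycle(colors)

    def scatter(self, x, y, c=None):
        # Only advance the cycle when the caller did not choose a color.
        return next(self._cycle) if c is None else c


def resolve_c_values(c, color, defer_to_cycle=True, fixed_default="C0"):
    """Hypothetical helper modeling the choice of c_values.

    defer_to_cycle=False models always passing an explicit color
    (assumed here to be a fixed default); defer_to_cycle=True models
    the proposed fix of passing None when no color was requested.
    """
    if c is not None:
        return c
    if color is not None:
        return color
    return None if defer_to_cycle else fixed_default


def colors_for_groups(n_groups, defer_to_cycle):
    # One scatter call per group on a shared axes, as in the groupby case.
    ax = FakeAxes()
    return [
        ax.scatter(0, 0, c=resolve_c_values(None, None, defer_to_cycle))
        for _ in range(n_groups)
    ]
```

In this model, `colors_for_groups(3, defer_to_cycle=False)` gives every group the same color, while `colors_for_groups(3, defer_to_cycle=True)` gives each group the next color in the cycle. A color passed explicitly by the user is returned unchanged in both modes.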

@myenugula myenugula requested a review from rhshadrach April 23, 2025 08:54

Successfully merging this pull request may close these issues.

BUG: Automatic change of color when the plot type is "line" but not when it is "scatter"
2 participants