-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Fixing scatter plot size (#32904) #32937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This fixes the marker size in scatter plots (see #32904).
Can you add a test for this? In And we'll need a release note in whatsnew/v1.1.0.rst. |
@@ -934,6 +934,8 @@ def __init__(self, data, x, y, s=None, c=None, **kwargs): | |||
# hide the matplotlib default for size, in case we want to change | |||
# the handling of this argument later | |||
s = 20 | |||
elif s in data.columns: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might fails for things like s=[1, 2, 3]
, non-hashable inputs. We should have other cases where we handle arrays or strings. In _make_plot
we have something like
c_is_column = is_hashable(c) and c in self.data.columns
Something similar may be required here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are some notes for myself.
According to matplotlib (https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.scatter.html), s is scalar or array_like, shape (n, )
. So the decision tree for s
is:
- None? If yes, then set s=20 (as it was defined in the code before), if no continue checking s.
- Scalar (
np.ndim(s) == 0
)? If yes, then proceed with the passed value fors
, else continue checking. - Hashable and in columns? (
is_hashable(s) and s in data.columns
) If yes, then sets = data[s]
, else continue checkings
. - If this is a list, then check that the list is as long as the number of points to plot (
np.ndim(s)==1 and len(s)==len(x)
) and pass it, else proceed (which will trigger matplotlib errors).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might fails for things like
s=[1, 2, 3]
, non-hashable inputs. We should have other cases where we handle arrays or strings. In_make_plot
we have something likec_is_column = is_hashable(c) and c in self.data.columns
Something similar may be required here.
Could you elaborate on the strings? (I'm not sure what is an example of this.)
* Added `const` where avaible * Trying to use memoryviews REF: https://github.com/pandas-dev/pandas/pull/32893/files#r396033598 * Nitpick, added a trailing comma Co-authored-by: MomIsBestFriend <>
Co-authored-by: MomIsBestFriend <>
* POC masked ops for reductions * fix mask for older numpy * also use in boolean * add min_count support * fix preserve_dtypes test * passthrough min_count for boolean as well * fix comment * add object to empty reduction test case * test platform int * Test sum separately with platform int * share min_count checking helper function with nanops * type + add docstring for min_count * move sum algo from ops to array_algos * add Int64/boolean to some benchmarks * add whatsnew * add skipna default in function signature * update type hint + deprivatize * update another type hint
* troubleshoot azure * troubleshoot locale build
merging with the local copy
This reverts commit 60b0e9f.
I made some errors with merging the commits and ended up deleting the original fork. Not sure if it's possible to adjust this pull request, so will create a new one instead. |
This fixes the marker size in scatter plots (see #32904).
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff