Skip to content

Fixing scatter plot size (#32904) #32937

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 68 commits into from
Closed

Fixing scatter plot size (#32904) #32937

wants to merge 68 commits into from

Conversation

SultanOrazbayev
Copy link
Contributor

@SultanOrazbayev SultanOrazbayev commented Mar 23, 2020

This fixes the marker size in scatter plots (see #32904).

This fixes the marker size in scatter plots (see #32904).
@TomAugspurger
Copy link
Contributor

Can you add a test for this? In pandas/tests/plotting/test_frame.py Your original example should be fine. That should have a comment referring back to your issue.

And we'll need a release note in whatsnew/v1.1.0.rst.

@TomAugspurger TomAugspurger added this to the 1.1 milestone Mar 23, 2020
@TomAugspurger TomAugspurger added the Visualization plotting label Mar 23, 2020
@@ -934,6 +934,8 @@ def __init__(self, data, x, y, s=None, c=None, **kwargs):
# hide the matplotlib default for size, in case we want to change
# the handling of this argument later
s = 20
elif s in data.columns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might fails for things like s=[1, 2, 3], non-hashable inputs. We should have other cases where we handle arrays or strings. In _make_plot we have something like

c_is_column = is_hashable(c) and c in self.data.columns

Something similar may be required here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are some notes for myself.

According to matplotlib (https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.scatter.html), s is scalar or array_like, shape (n, ). So the decision tree for s is:

  • None? If yes, then set s=20 (as it was defined in the code before), if no continue checking s.
  • Scalar (np.ndim(s) == 0)? If yes, then proceed with the passed value for s, else continue checking.
  • Hashable and in columns? (is_hashable(s) and s in data.columns) If yes, then set s = data[s], else continue checking s.
  • If this is a list, then check that the list is as long as the number of points to plot (np.ndim(s)==1 and len(s)==len(x)) and pass it, else proceed (which will trigger matplotlib errors).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might fails for things like s=[1, 2, 3], non-hashable inputs. We should have other cases where we handle arrays or strings. In _make_plot we have something like

c_is_column = is_hashable(c) and c in self.data.columns

Something similar may be required here.

Could you elaborate on the strings? (I'm not sure what is an example of this.)

ShaharNaveh and others added 24 commits March 23, 2020 14:08
* Added `const` where avaible

* Trying to use memoryviews

REF:
https://github.com/pandas-dev/pandas/pull/32893/files#r396033598

* Nitpick, added a trailing comma

Co-authored-by: MomIsBestFriend <>
* DOC: add series examples (#24589)

adds documentation example to:
- `pandas.Series.eq`
- `pandas.Series.ne`
- `pandas.Series.gt`,
- `pandas.Series.ge`
- `pandas.series.le`
- `pandas.series.lt`
rjfs and others added 23 commits March 26, 2020 08:53
…lumns DataFrame (#32990)

* BUG: Fix bug for unstack with a lot of indices (#32624)
Co-authored-by: MomIsBestFriend <>
* POC masked ops for reductions

* fix mask for older numpy

* also use in boolean

* add min_count support

* fix preserve_dtypes test

* passthrough min_count for boolean as well

* fix comment

* add object to empty reduction test case

* test platform int

* Test sum separately with platform int

* share min_count checking helper function with nanops

* type + add docstring for min_count

* move sum algo from ops to array_algos

* add Int64/boolean to some benchmarks

* add whatsnew

* add skipna default in function signature

* update type hint + deprivatize

* update another type hint
* troubleshoot azure

* troubleshoot locale build
@SultanOrazbayev SultanOrazbayev changed the base branch from master to 0.20.x March 28, 2020 22:02
@SultanOrazbayev SultanOrazbayev changed the base branch from 0.20.x to master March 28, 2020 22:02
@SultanOrazbayev
Copy link
Contributor Author

I made some errors with merging the commits and ended up deleting the original fork. Not sure if it's possible to adjust this pull request, so will create a new one instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet