Fixing scatter plot size (#32904) #32937

SultanOrazbayev · 2020-03-23T16:55:53Z

This fixes the marker size in scatter plots (see #32904).

closes Bug: TypeError: ufunc 'sqrt' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' #32904
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

TomAugspurger · 2020-03-23T18:20:08Z

Can you add a test for this in pandas/tests/plotting/test_frame.py? Your original example should be fine. The test should have a comment referring back to your issue.

And we'll need a release note in whatsnew/v1.1.0.rst.

TomAugspurger · 2020-03-23T18:23:07Z

pandas/plotting/_matplotlib/core.py

@@ -934,6 +934,8 @@ def __init__(self, data, x, y, s=None, c=None, **kwargs):
            # hide the matplotlib default for size, in case we want to change
            # the handling of this argument later
            s = 20
+        elif s in data.columns:


This might fail for things like s=[1, 2, 3], i.e. non-hashable inputs. We should have other cases where we handle arrays or strings. In _make_plot we have something like

c_is_column = is_hashable(c) and c in self.data.columns

Something similar may be required here.
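A minimal sketch of that guard applied to s, for illustration only. The helper name is_column_label is hypothetical and not part of pandas; is_hashable is imported here from pandas.api.types. The point is that the hashability check short-circuits before the membership test, so list-like sizes never reach `in data.columns`.

```python
import pandas as pd
from pandas.api.types import is_hashable


def is_column_label(key, data):
    """Return True only if `key` is hashable and names a column of `data`.

    Hypothetical helper: it mirrors the `c_is_column` pattern quoted above,
    so unhashable inputs such as lists are rejected before the lookup.
    """
    return bool(is_hashable(key) and key in data.columns)


df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6], "size": [10, 20, 30]})

size_is_column = is_column_label("size", df)      # a column label
list_is_column = is_column_label([1, 2, 3], df)   # a list of sizes, not a label
missing_is_column = is_column_label("missing", df)
```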

These are some notes for myself.

According to matplotlib (https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.scatter.html), s is scalar or array_like, shape (n, ). So the decision tree for s is:

None? If yes, then set s=20 (as it was defined in the code before), if no continue checking s.

Scalar (np.ndim(s) == 0)? If yes, then proceed with the passed value for s, else continue checking.

Hashable and in columns? (is_hashable(s) and s in data.columns) If yes, then set s = data[s], else continue checking s.

If this is a list, then check that the list is as long as the number of points to plot (np.ndim(s)==1 and len(s)==len(x)) and pass it, else proceed (which will trigger matplotlib errors).
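The decision tree above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was merged: resolve_marker_size is a hypothetical name, and the column check is placed before the scalar check because a string label such as "size" also has np.ndim(s) == 0 and would otherwise be passed through as a scalar.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_hashable


def resolve_marker_size(s, data):
    """Hypothetical helper resolving the `s` argument of a scatter plot."""
    if s is None:
        # Keep the previous default marker size.
        return 20
    if is_hashable(s) and s in data.columns:
        # A column label: use that column's values as marker sizes.
        return data[s]
    if np.ndim(s) == 0:
        # A plain scalar size: pass it through unchanged.
        return s
    # Array-like: pass it through; a length mismatch is left for
    # matplotlib to report, as described in the notes.
    return s


df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6], "size": [10, 20, 30]})

default_size = resolve_marker_size(None, df)
scalar_size = resolve_marker_size(5, df)
column_size = resolve_marker_size("size", df)
list_size = resolve_marker_size([1, 2, 3], df)
```

One remaining ambiguity in this sketch: a scalar that happens to equal a column label (for example an integer column name) is treated as the column, not as a size.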

Could you elaborate on the strings? (I'm not sure what is an example of this.)

SultanOrazbayev · 2020-03-28T22:04:47Z

I made some errors with merging the commits and ended up deleting the original fork. Not sure if it's possible to adjust this pull request, so will create a new one instead.

Fixing scatter plot size (#32904)

f14ccdc

This fixes the marker size in scatter plots (see #32904).

SultanOrazbayev mentioned this pull request Mar 23, 2020

Bug: TypeError: ufunc 'sqrt' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' #32904

Closed

TomAugspurger added this to the 1.1 milestone Mar 23, 2020

TomAugspurger added the Visualization plotting label Mar 23, 2020

TomAugspurger reviewed Mar 23, 2020

ShaharNaveh and others added 24 commits March 23, 2020 14:08

Added const where avaible (#32893)

f20331d

* Added `const` where avaible * Trying to use memoryviews REF: https://github.com/pandas-dev/pandas/pull/32893/files#r396033598 * Nitpick, added a trailing comma Co-authored-by: MomIsBestFriend <>

CLN: xarray tests (#32943)

2c2ba45

TST: bare pytest raises in tests/scalar (#32929)

2ef8fd1

Correct data type misspelling (#32970)

4b64f98

move _get_cython_table_params into pandas/_testing.py (#32981)

6a84598

REF: misplaces Series.where, Series.rename tests (#32969)

4407d9f

CLN: move misplaced (and duplicated) Dataframe.__repr__ test (#32968)

e444e61

REF: misplaced DataFrame.where tests (#32948)

145a414

CLN: Remove GroupByError exception (#32952)

55636e6

REF: .values -> ._values (#32947)

d3ffc91

REF: misplaced DTI.shift tests (#32938)

d6f6203

CLN: Remove unused is_datetimelike arg (#32919)

ff91535

TST: move to indices fixture instead of create_index (#32916)

cd52920

CLN: Split integer array tests (#32910)

55df1e8

REF: misplaced arithmetic tests (#32912)

dc2b74e

TST: collect .insert tests (#32909)

3439327

TST: Avoid bare pytest.raises in mult files (#32906)

42ef409

DOC: Fix orc link (#32983)

650ef90

Timedeltas: Understand µs (#32899)

3e247fa

DOC: Add examples to Series operators (#24589) (#32704)

08fce67

* DOC: add series examples (#24589) adds documentation example to: - `pandas.Series.eq` - `pandas.Series.ne` - `pandas.Series.gt`, - `pandas.Series.ge` - `pandas.series.le` - `pandas.series.lt`

[ENH] Add "fullmatch" matching mode to Series.str [#32806] (#32807)

bed9103

CLN: Remove shebang (#32975)

b8004b8

REF: collect casting portion of Block.setitem (#32940)

b8035bb

CLN: Fix linting (#32987)

28e0f18

BUG: 27453 right merge order (#31278)

8a5f291

TomAugspurger mentioned this pull request Mar 26, 2020

Bug: TypeError: ufunc 'sqrt' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' #33034

Closed

5 tasks

rjfs and others added 23 commits March 26, 2020 08:53

BUG: Fixed strange behaviour of pd.DataFrame.drop() with inplace argu… (

a3097b5

#30501)

ERR: Raise a better error for numpy singletons in Index (#33026)

c81d90f

BUG: Multiple unstack using row index level labels and multi level co…

e7ee418

…lumns DataFrame (#32990) * BUG: Fix bug for unstack with a lot of indices (#32624)

REF: RangeIndex tests (#33050)

90127a8

REF: test_searchorted for PeriodIndex (#33040)

c440ee7

BUG: frame.lookup with non-unique axes (#33045)

6c2c1c8

STY: Boolean values for bint variables (#33009)

754caad

Co-authored-by: MomIsBestFriend <>

Fix to _get_nearest_indexer for pydata/xarray#3751 (#32905)

883379c

[BUG] Sum of grouped bool has inconsistent dtype (#32894)

1a57596

BUG: Preserve name in Index.astype (#32036)

e872067

CLN: de-duplicate Block.should_store and related (#33028)

c47e9ca

CLN: remove CategoricalBlock.to_native_types (#33063)

c69f7d8

BUG: implement astype from string dtype to nullable int dtype (#33062)

845c50c

REF: MultiIndex Indexing tests (#33053)

9130da9

REF: avoid internals in tshift (#33056)

b6cb1a4

DOC: Fixed examples in pandas/tseries (#32935)

13b9e40

DOC: Fix examples in reshape (#32980)

48ddfbb

CLN: update Appender to doc with case __doc__ (#32956)

e88c392

CI troubleshoot azure (#33080)

60b0e9f

* troubleshoot azure * troubleshoot locale build

added a test for scatter plot with variable marker size

e61fc5e

Merge branch 'patch-1' of github.com:SultanOrazbayev/pandas into patch-1

2893130

merging with the local copy

Revert "CI troubleshoot azure (#33080)"

c6b80ca

This reverts commit 60b0e9f.

SultanOrazbayev changed the base branch from master to 0.20.x March 28, 2020 22:02

SultanOrazbayev changed the base branch from 0.20.x to master March 28, 2020 22:02

SultanOrazbayev closed this Mar 28, 2020

SultanOrazbayev mentioned this pull request Mar 28, 2020

ENH/VIZ: Allowing s parameter of scatter plots to be a column name #33107

Merged

6 tasks
