Support for PEP 3141 numbers #22952

tm9k1 · 2018-10-02T19:57:09Z

closes lib.is_scalar misses PEP 3141 numbers #22903
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

tm9k1 · 2018-10-02T19:59:16Z

PROBLEM

calling pandas.api.types.is_scalar returns False for Fraction or Number

FIX

calling numpy.isscalar(val) within is_scalar function

datapythonista · 2018-10-02T20:12:15Z

Thanks for the fix @brute4s99

Can you add a test (a test that fails before the fix, and that passes with it), and a line in the bugs section of the whatsnew file for 0.24 please?

tm9k1 · 2018-10-02T20:15:28Z

> Thanks for the fix @brute4s99
>
> Can you add a test (a test that fails before the fix, and that passes with it), and a line in the bugs section of the whatsnew file for 0.24 please?

Sure, @datapythonista! I didn't know about the whatsnew file, so thanks!

datapythonista · 2018-10-02T20:16:32Z

no problem, took me several PRs before I started remembering about the whatsnew myself ;)

jbrockmendel · 2018-10-02T20:19:16Z

pandas/_libs/lib.pyx

-            or util.is_offset_object(val))
-
+            or util.is_offset_object(val)
+            or np.isscalar(val))


Does numpy's C-API have an implementation of this?

Does the behavior vary across supported versions of numpy?

> Does numpy's C-API have an implementation of this?

I went this way, @jbrockmendel
Further, ScalarType is in numerictypes.py here.

I would just import Number / Fraction here and do an isinstance check

jbrockmendel · 2018-10-02T20:20:21Z

Simple solution, may well be the best one. This will definitely need a performance check.

tm9k1 · 2018-10-02T20:56:48Z

Where can I connect with all the devs, @datapythonista?
Do we have any IRC channel or similar?

codecov · 2018-10-02T21:02:20Z

Codecov Report

Merging #22952 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #22952   +/-   ##
=======================================
  Coverage   92.28%   92.28%           
=======================================
  Files         161      161           
  Lines       51451    51451           
=======================================
  Hits        47483    47483           
  Misses       3968     3968

Flag        Coverage Δ
#multiple   90.68% <ø> (ø)
#single     42.28% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update deb7b4d...e5f084a.

tm9k1 · 2018-10-02T21:03:38Z

> Simple solution, may well be the best one. This will definitely need a performance check.

I just thought of reusing existing code, since we already require numpy to install pandas 😄

jreback

pls add a test

jreback · 2018-10-02T21:12:42Z

pandas/_libs/lib.pyx

-            or util.is_offset_object(val))
-
+            or util.is_offset_object(val)
+            or np.isscalar(val))


I would just import Number / Fraction here and do an isinstance check

jreback · 2018-10-02T21:13:09Z

doc/source/whatsnew/v0.24.0.txt

+   False
+
+New Behavior:
+


You don't need this whole sub-section; this is a minor function. However, adding these examples to the doc-string would be ok.

> You don't need this whole sub-section; this is a minor function. However, adding these examples to the doc-string would be ok.

Which doc-string are you referring to, @jreback?

The is_scalar doc-string, though I don't think we have an Examples section. If you can add one with some examples, that would be great. Have a look at infer_dtype for some inspiration on a good doc-string.

Will do, @jreback.

datapythonista · 2018-10-02T21:13:34Z

@brute4s99 you can use gitter: https://gitter.im/pydata/pandas

If it's related to this PR, it's usually better to discuss here, so other people interested/reviewing are aware of the discussions.

In this case the issue is that the file you're modifying is a pyx file. Meaning that the syntax is Cython, and it gets converted to C and compiled. And calling a numpy Python function will have an impact on the speed (and we use Cython and compile the file to speed up things). If you're not familiar with Cython and those things, it can be difficult to fix this issue in a more performant way.

jbrockmendel · 2018-10-02T22:37:01Z

> do we have any IRC channel or similar?

There's gitter, though I don't know how much it is used. The conversation here in GH is likely your best bet.

> I just thought of reusing existing code, since we already require numpy to install pandas

That's a good intuition, and may end up being the optimal way to go, but in the cython code we make an extra effort to avoid making pure-python calls. So if there turned out to be a C-implementation, we'd want to use that. As it is I expect @jreback's suggestion to directly do an isinstance check is the easiest solution.

tm9k1 · 2018-10-03T08:51:08Z

> As it is I expect @jreback's suggestion to directly do an isinstance check is the easiest solution.

I was thinking the same, but then dropped it presuming importing Fraction and Number might cause slowness.
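The slowness concern above can be measured rather than presumed. The following is a minimal sketch, assuming a plain-Python microbenchmark is an acceptable rough proxy; the real check runs inside compiled Cython, so absolute timings will differ. The helper names `check_concrete`, `check_abc` and `benchmark` are hypothetical and are not pandas functions.

```python
import timeit
from fractions import Fraction
from numbers import Number


def check_concrete(val):
    # isinstance against concrete builtin types: a direct type check.
    return isinstance(val, (int, float, complex))


def check_abc(val):
    # isinstance against the numbers.Number ABC: dispatched through
    # ABCMeta.__instancecheck__, which caches results per class.
    return isinstance(val, Number)


def benchmark(func, val, number=100_000):
    # Total seconds taken by `number` calls of func(val).
    return timeit.timeit(lambda: func(val), number=number)


if __name__ == "__main__":
    for val in (1, 1.5, Fraction(1, 2), "text"):
        t_concrete = benchmark(check_concrete, val)
        t_abc = benchmark(check_abc, val)
        print(f"{val!r:>16}  concrete={t_concrete:.4f}s  abc={t_abc:.4f}s")
```

Timings depend on the machine and Python version, so no expected output is shown. Note that the import itself is a one-time cost at module load; only the per-call `isinstance` cost matters for a function called in tight loops.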

tm9k1 · 2018-10-03T08:59:25Z

> pls add a test

Can you please elaborate, @jreback?
I am new to contributing.

bashtage · 2018-10-03T09:10:52Z

You need to add a test that shows that your changes all work as expected.

bashtage · 2018-10-03T09:12:24Z

Somewhere around here:

pandas/pandas/tests/dtypes/test_inference.py

Line 1119 in 1a12c41

def test_is_scalar_builtin_scalars(self):
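A test of the kind requested here might look roughly like the sketch below. It is written as a helper that takes the function under test as an argument so that it runs without pandas installed; `check_is_scalar_pep_3141` and `stand_in_is_scalar` are hypothetical names, and the stand-in is deliberately simplified and is not the pandas implementation.

```python
from fractions import Fraction
from numbers import Number


def check_is_scalar_pep_3141(is_scalar):
    """Assert that `is_scalar` accepts PEP 3141 numbers and rejects containers."""
    # Fraction is the standard-library rational type from the numeric tower.
    assert is_scalar(Fraction(3, 5))
    # A bare Number instance carries no value but is still a PEP 3141 number.
    assert is_scalar(Number())
    # A container holding numbers is not itself a scalar.
    assert not is_scalar([Fraction(3, 5)])


def stand_in_is_scalar(val):
    # Hypothetical stand-in so the sketch is self-contained;
    # it is NOT the pandas implementation.
    return val is None or isinstance(val, (Number, str, bytes))


check_is_scalar_pep_3141(stand_in_is_scalar)
```

In the pandas test suite the same assertions would presumably call `pandas.api.types.is_scalar` directly inside a test method. A check that only knows about `int` and `float` fails the first assertion, which is the "fails before the fix" behaviour asked for earlier in the thread.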

pep8speaks · 2018-10-03T09:52:29Z

Hello @brute4s99! Thanks for updating the PR.

There are no PEP8 issues in the file pandas/tests/dtypes/test_inference.py !

Comment last updated on October 03, 2018 at 09:54 Hours UTC

tm9k1 · 2018-10-03T09:56:11Z

> Somewhere around here:
>
> pandas/pandas/tests/dtypes/test_inference.py
>
> Line 1119 in 1a12c41
> def test_is_scalar_builtin_scalars(self):

Tests have been added and passed! 😄

tm9k1 · 2018-10-03T11:43:15Z

Please ignore the failed Travis CI run (#22934).

This is a random error by Travis CI.

jreback · 2018-10-03T12:12:28Z

doc/source/whatsnew/v0.24.0.txt

@@ -834,3 +835,5 @@ Other
 - :meth:`DataFrame.nlargest` and :meth:`DataFrame.nsmallest` now returns the correct n values when keep != 'all' also when tied on the first columns (:issue:`22752`)
 - :meth:`~pandas.io.formats.style.Styler.bar` now also supports tablewise application (in addition to rowwise and columnwise) with ``axis=None`` and setting clipping range with ``vmin`` and ``vmax`` (:issue:`21548` and :issue:`21526`). ``NaN`` values are also handled properly.
 - Logical operations ``&, |, ^`` between :class:`Series` and :class:`Index` will no longer raise ``ValueError`` (:issue:`22092`)
+- Checking PEP 3141 numbers in `pandas.api.types.is_scalar` function returns ``True`` (:issue:`22903`)


use the :func:`~pandas.api.types.is_scalar`

ok, @jreback

jreback · 2018-10-03T12:13:20Z

pandas/_libs/lib.pyx

+
+    Parameters
+    ----------
+    val : numpy array scalar (e.g. np.int64), Python builtin numerics,


@datapythonista @jorisvandenbossche is this how we format lists of things?

@brute4s99 this should be like:

val : single line type description
    The full parameter description with indent of 4 spaces (this can span
    over multiple lines)

If the type description is a long enumeration (like in this case) and doesn't fit on a single line, you can keep the actual type description vague (eg "scalar"), and then put all the possibilities in the parameter description.
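Applied to this function, the layout described above could look like the following sketch. The function name `is_scalar_sketch`, its simplified body, and the exact wording of the list are assumptions for illustration; this is not the docstring that was merged.

```python
from numbers import Number


def is_scalar_sketch(val):
    """
    Return True if the given value is scalar.

    Parameters
    ----------
    val : object
        Any Python object. Values treated as scalar in this sketch:

        - Python builtin numerics and PEP 3141 numbers
        - strings and bytes
        - None

    Returns
    -------
    bool
        True if the given value is scalar, False otherwise.
    """
    # Simplified stand-in logic for illustration, not the pandas code.
    return val is None or isinstance(val, (Number, str, bytes))
```

The type after the colon stays short (`object`), and the long enumeration moves into the indented description, where it can span as many lines as needed.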

Thanks for the clarification, @jorisvandenbossche! Will follow suit.

jreback · 2018-10-03T12:13:50Z

pandas/_libs/lib.pyx

+
+    Examples
+    --------
+    >>> dt = pd.datetime.datetime(2018,10,3)


can you show an example with a list as well (to return False)

sure! will do !

jreback · 2018-10-03T12:14:42Z

pandas/_libs/lib.pyx

-            or util.is_offset_object(val))
-
+            or util.is_offset_object(val)
+            or isinstance(val, Number)


you can use a compound isinstance(val, (Number, Fraction))

oh! I didn't know that !
Thanks @jreback
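For readers unfamiliar with the compound form, here is a small standard-library sketch of `isinstance` with a tuple of types. The sample `values` list is made up for illustration.

```python
from fractions import Fraction
from numbers import Number, Rational

values = [3, 2.5, 1 + 2j, Fraction(1, 3), "text", [1, 2]]

# A tuple of types means "instance of any of these".
results = [isinstance(v, (Number, Fraction)) for v in values]
print(results)  # → [True, True, True, True, False, False]

# Fraction derives from numbers.Rational, which sits under Number in the
# PEP 3141 tower, so the single-type check gives the same answers here.
assert issubclass(Fraction, Rational) and issubclass(Rational, Number)
assert results == [isinstance(v, Number) for v in values]
```

Listing `Fraction` alongside `Number` is therefore harmless but, for these inputs, redundant.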

jreback · 2018-10-03T12:15:04Z

pandas/tests/dtypes/test_inference.py

@@ -13,7 +13,8 @@
 import numpy as np
 import pytz
 import pytest
-
+from numbers import Number


can you put the builtin imports at the top

jreback · 2018-10-03T12:15:56Z

pandas/tests/dtypes/test_inference.py

@@ -1184,6 +1185,10 @@ def test_is_scalar_pandas_containers(self):
        assert not is_scalar(Index([]))
        assert not is_scalar(Index([1]))

+    def tes_is_scalar_pep_3141(self):


just add onto the test_is_scalar_builtin_scalars test

I tried it, @jreback. Strangely, it was throwing errors and local tests were failing, which is why I defined it in a different function. I couldn't understand the error either; let me send you a snip.

tm9k1 · 2018-10-04T11:43:03Z

@jreback please review

jreback

looks fine. can you rebase, ping on green.

jreback · 2018-10-07T23:03:33Z

pandas/_libs/lib.pyx

+    --------
+    >>> dt = pd.datetime.datetime(2018,10,3)
+    >>> pd.is_scalar(dt)
+    True


can you put a blank line between cases (as you have done for some)

tm9k1 · 2018-10-08T10:31:06Z

@jreback please review, I just re-did the changes after a git fetch upstream

datapythonista

some issues in the documentation, for the rest looks good

datapythonista · 2018-10-09T08:16:13Z

pandas/_libs/lib.pyx

+
+    Parameters
+    ----------
+    val : input argument of any type


After the name of the parameter and the colon, just put the name of the expected Python type. In this case use object if anything can be used.

datapythonista · 2018-10-09T08:17:41Z

pandas/_libs/lib.pyx

+
+    Parameters
+    ----------
+    val : input argument of any type
    This includes:
    - numpy array scalar (e.g. np.int64)


Can you generate the html version of this docstring? ./doc/make.py html --single=path.to.this.function

I think this won't render properly

datapythonista · 2018-10-09T08:18:20Z

pandas/_libs/lib.pyx


+    Returns
+    -------
+    True if the given value is scalar, False otherwise.


In the Returns section, first list the type (i.e. bool), then put the description on the next line, indented.

datapythonista

comments on the docs

datapythonista · 2018-10-09T10:01:48Z

pandas/_libs/lib.pyx

+
+    Parameters
+    ----------
+    val : object
    This includes:


I think this should be indented, please build the html and check

datapythonista · 2018-10-09T10:02:31Z

pandas/_libs/lib.pyx

+    * instances of datetime.datetime
+    * instances of datetime.timedelta
+    * Period
+    * instances of decimal.Decimal


Can you remove all the "instances of" here? For every item in the list, what you can check are the instances.

datapythonista · 2018-10-09T10:02:56Z

pandas/_libs/lib.pyx


+    Returns
+    -------
+     a bool object.


Just bool, and in the next line indented a description

datapythonista · 2018-10-09T10:03:46Z

pandas/_libs/lib.pyx

+    >>> pd.api.types.is_scalar([2, 3])
+    False
+
+    >>> pd.api.types.is_scalar({0:1, 2:3})


there are missing spaces after the colon, check pep8 for those

datapythonista · 2018-10-09T10:05:36Z

pandas/_libs/lib.pyx

+    True
+
+    >>> from numbers import Number
+    >>> pd.api.types.is_scalar(Number())


I think it makes more sense to have an example Number here. Not sure what an empty Number instance can be used for, but doesn't seem something you'd use as often as an actual number.

What should I put in here, then, @datapythonista ?
Any number would fall into int category... :/

yeah prob can remove this last Number example, it's not generally useful unless you actually inherit from Number or a subclass.
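The case mentioned here, a class that inherits from Number, can be sketched as follows. `Money` is a hypothetical user-defined type invented for this illustration; it does not come from the PR.

```python
from numbers import Number


class Money(Number):
    """Hypothetical user-defined numeric type, for illustration only."""

    def __init__(self, cents):
        self.cents = cents

    def __repr__(self):
        return f"Money({self.cents / 100:.2f})"


price = Money(1999)

# Not an int or float, yet it passes an isinstance check against Number.
print(isinstance(price, (int, float)))  # → False
print(isinstance(price, Number))        # → True
```

An `isinstance(val, Number)` check is what lets such a type count as a number, which is the situation where the bare `Number()` example would have been relevant.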

jreback · 2018-11-01T01:32:59Z

can you merge master and update

jreback · 2018-11-18T18:42:48Z

can you merge master

tm9k1 · 2018-11-18T21:05:43Z

please review, @datapythonista @jreback

jreback · 2018-11-18T23:06:58Z

rebase. small change @brute4s99

tm9k1 · 2018-11-19T21:09:27Z

is this done right? @jreback

datapythonista · 2018-11-19T21:22:58Z

Check the diff of the PR, something went wrong and it has lots of unrelated changes. Can you fix it please, so we can move forward with it.

tm9k1 · 2018-11-19T21:28:09Z

okayyy, I jumbled up.
Found my mistake, @datapythonista .
Will repair this ASAP

tm9k1 · 2018-11-19T21:43:54Z

please review, @datapythonista
I believe it's done right this time!

tm9k1 · 2018-11-19T21:47:57Z

I accidentally rebased on origin/master, which was ~350 commits behind upstream/master.
Steps taken:

1. Reverted HEAD to just before the rebase.
2. Merged upstream/master into origin/is_scalar.
3. Updated origin/master so that there are no diffs between upstream/master and origin/master.
4. Ran git rebase origin/master and fixed a conflict in doc/source/whatsnew/v0.24.0.rst.
5. Pushed to origin/is_scalar.

jreback · 2018-11-20T02:06:31Z

@datapythonista over to you; merge when satisfied.

datapythonista · 2018-11-20T10:41:57Z

Thanks @brute4s99

…fixed * upstream/master: DOC: Removing rpy2 dependencies, and converting examples using it to regular code blocks (pandas-dev#23737) BUG: Fix dtype=str converts NaN to 'n' (pandas-dev#22564) DOC: update pandas.core.resample.Resampler.nearest docstring (pandas-dev#20381) REF/TST: Add more pytest idiom to parsers tests (pandas-dev#23810) Added support for Fraction and Number (PEP 3141) to pandas.api.types.is_scalar (pandas-dev#22952) DOC: Updating to_timedelta docstring (pandas-dev#23259)

…is_scalar (pandas-dev#22952)

datapythonista added Numeric Operations Arithmetic, Comparison, and Logical operations Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff labels Oct 2, 2018

datapythonista requested a review from jbrockmendel October 2, 2018 20:10

jbrockmendel reviewed Oct 2, 2018

jreback requested changes Oct 2, 2018

jreback requested changes Oct 3, 2018

jreback requested changes Oct 7, 2018

jreback added this to the 0.24.0 milestone Oct 7, 2018

tm9k1 force-pushed the is_scalar branch 3 times, most recently from e7baf2b to 2d10e8b Compare October 8, 2018 11:59

datapythonista requested changes Oct 9, 2018

jreback removed this from the 0.24.0 milestone Nov 18, 2018

jreback approved these changes Nov 18, 2018

tm9k1 force-pushed the is_scalar branch from 4e426de to a6f65d2 Compare November 19, 2018 21:08

tm9k1 force-pushed the is_scalar branch from a6f65d2 to 4e426de Compare November 19, 2018 21:31

tm9k1 and others added 5 commits November 20, 2018 03:12

Fix pandas-dev#22903 for lib.is_scalar missing PEP 3141 numbers

4d697b7

Fixed linting issues

965a910

more fixes for lib.pyx (PEP 3141 is_scalar())

ecac1d1

pep8 and indentation fixes

1b431c5

whitespace

b2cb68f

tm9k1 force-pushed the is_scalar branch from 9b80ed7 to b2cb68f Compare November 19, 2018 21:43

Removed Number() example from is_scalar()

e5f084a

jreback added this to the 0.24.0 milestone Nov 20, 2018

datapythonista merged commit 029d57c into pandas-dev:master Nov 20, 2018

tm9k1 mentioned this pull request Nov 20, 2018

lib.is_scalar misses PEP 3141 numbers #22903

Closed

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

Added support for Fraction and Number (PEP 3141) to pandas.api.types.…

5edd21d

…is_scalar (pandas-dev#22952)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

Added support for Fraction and Number (PEP 3141) to pandas.api.types.…

e7c6930

…is_scalar (pandas-dev#22952)
