CLN/STY: pandas/_libs/internals.pyx #32801

ShaharNaveh · 2020-03-18T11:17:06Z

ShaharNaveh · 2020-03-18T11:19:00Z

pandas/_libs/internals.pyx

-        int64_t cur_blkno
-        Py_ssize_t i, start, stop, n, diff
-
+        Py_ssize_t i, start = 0, stop, n = blknos.shape[0], diff, tot_len


NOTE: I am adding a variable definition here (tot_len).

I would put this on multiple lines; it becomes difficult to read IMO

jorisvandenbossche

Since code style is subjective (when going beyond PEP8 / black), you also get some subjective comments from me ;)

jorisvandenbossche · 2020-03-18T12:26:17Z

pandas/_libs/internals.pyx


-        return f'{type(self).__name__}({v})'
+        v = self._as_slice if s is not None else self._as_array


Personally, I don't find this necessarily more readable.

jorisvandenbossche · 2020-03-18T12:26:50Z

pandas/_libs/internals.pyx

        if s is not None:
            return slice_len(s)
-        else:
-            return len(self._as_array)


Why this change? I personally find an if/else pattern very clear

jorisvandenbossche · 2020-03-18T12:27:40Z

pandas/_libs/internals.pyx

-        int64_t cur_blkno
-        Py_ssize_t i, start, stop, n, diff
-
+        Py_ssize_t i, start = 0, stop, n = blknos.shape[0], diff, tot_len


I would put this on multiple lines; it becomes difficult to read IMO

jorisvandenbossche · 2020-03-18T12:28:11Z

pandas/_libs/internals.pyx

-                val = slice(start, None, step)
-            else:
-                val = slice(start, stop, step)
+            val = slice(start, None, step) if stop < 0 else slice(start, stop, step)
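The `stop < 0` guard in this hunk matters for negative-step slices: a negative `stop` is read by Python as an offset from the end of the sequence, not as "continue past index 0". A minimal sketch in plain Python, assuming that is the reason for the guard (the helper name `make_slice` is hypothetical and does not appear in internals.pyx):

```python
def make_slice(start, stop, step):
    """Build a slice, mapping a negative stop to None.

    Hypothetical helper that mirrors the one-line expression in the diff.
    """
    return slice(start, None, step) if stop < 0 else slice(start, stop, step)


data = list(range(10))

# A literal stop of -1 refers to the last element, so walking down from
# index 3 never gets anywhere and the selection is empty.
naive = data[slice(3, -1, -1)]

# With the negative stop replaced by None, the slice runs down through index 0.
guarded = data[make_slice(3, -1, -1)]
```

Both the one-line and the if/else spelling produce the same slice objects; the review discussion is only about which form reads better.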


REF: pandas-dev#32801 (comment)

ShaharNaveh · 2020-03-18T13:05:08Z

pandas/_libs/internals.pyx

-        else:
-            return iter(self._as_array)
+
+        return iter(self._as_array)


@jorisvandenbossche This is a very similar change to the change that was mentioned here, would you like me to revert that as well?

same here (see comment below)

ShaharNaveh · 2020-03-18T13:05:41Z

pandas/_libs/internals.pyx

        if s is not None:
            return s
-        else:
-            return self._as_array


@jorisvandenbossche This is a very similar change to the change that was mentioned here, would you like me to revert that as well?

To be clear, my opinion is only a single one, but yes, I would personally revert this as well

ShaharNaveh · 2020-03-18T13:06:58Z

pandas/_libs/internals.pyx

-        else:
-            val = self._as_array[loc]
+
+        val = slice_getitem(s, loc) if s is not None else self._as_array[loc]


@jorisvandenbossche This is a very similar change to the change that was mentioned here, would you like me to revert that as well?

This is different from the other places where you left this comment, and this is the only one I would ask you to revert. The way it is currently written, I can look in coverage output to see if both branches are reached; I can't with this PR's version.
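The coverage argument can be illustrated in plain Python. The sketch below is an assumption-laden stand-in: the two functions are hypothetical simplifications of the getitem logic, and `executed_lines` is a hypothetical helper that uses `sys.settrace` to mimic what a line-based coverage tool records.

```python
import sys


def getitem_if_else(s, arr, loc):
    # Each branch sits on its own line, so a line-based report can tell them apart.
    if s is not None:
        val = arr[s][loc]
    else:
        val = arr[loc]
    return val


def getitem_ternary(s, arr, loc):
    # Both branches share one line, so that line is "hit" whichever branch runs.
    val = arr[s][loc] if s is not None else arr[loc]
    return val


def executed_lines(func, *args):
    """Return the line offsets (relative to the def line) executed inside func."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace unrelated frames
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen


arr = [10, 20, 30]
if_else_slice = executed_lines(getitem_if_else, slice(0, 3), arr, 1)
if_else_array = executed_lines(getitem_if_else, None, arr, 1)
ternary_slice = executed_lines(getitem_ternary, slice(0, 3), arr, 1)
ternary_array = executed_lines(getitem_ternary, None, arr, 1)
```

With the if/else form the two calls record different line sets, so a missing branch shows up as an unexecuted line. With the conditional expression the recorded lines are identical for both calls.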

REF: pandas-dev#32801 (comment)

jbrockmendel · 2020-03-18T21:52:52Z

pandas/_libs/internals.pyx

-            #  np.arange(start, stop, step, dtype=np.int64)
+            # NOTE:
+            # this is the C-optimized equivalent of
+            # `np.arange(start, stop, step, dtype=np.int64)`


Not a deal-breaker, but I don't see how this is an improvement. Especially if the comment is going to be repeated in multiple places, I'd rather it be 2 lines than 3.

jbrockmendel · 2020-03-18T21:54:01Z

pandas/_libs/internals.pyx

-    start = 0
-    cur_blkno = blknos[start]
-
-    if group is False:


if group is False should be marginally faster than if not group

From my benchmarking it's actually the other way around:

In [1]: foo = True
In [2]: %timeit foo is False
22.9 ns ± 0.157 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit foo is False
23 ns ± 0.0388 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit foo is False
22.7 ns ± 0.194 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [5]: %timeit not foo
20.7 ns ± 0.0589 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [6]: %timeit not foo
21.1 ns ± 0.0636 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [7]: %timeit not foo
20.6 ns ± 0.0497 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [1]: bar = False
In [2]: %timeit bar is False
28.5 ns ± 0.077 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit bar is False
28.6 ns ± 0.0666 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit bar is False
29.5 ns ± 0.203 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [5]: %timeit not bar
26.1 ns ± 0.217 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [6]: %timeit not bar
25.8 ns ± 0.0231 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [7]: %timeit not bar
26.7 ns ± 0.0831 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Look at the generated C code. The "is" should be a pointer comparison, whereas the other one should have to do some more work.
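Apart from speed, the two spellings differ in meaning for arbitrary objects: `is False` is an identity test against the `False` singleton, while `not` is a truthiness test. A small sketch in plain Python (the function names are hypothetical; the timings above were taken in the Python interpreter, so they say nothing about the C that Cython generates):

```python
def is_false(value):
    """Identity test: true only for the singleton False."""
    return value is False


def is_falsy(value):
    """Truthiness test: true for False, None, 0, empty containers, and so on."""
    return not value


samples = [False, True, None, 0, "", []]
identity = [is_false(v) for v in samples]
truthiness = [is_falsy(v) for v in samples]
```

The two agree whenever the value is an actual `bool`, which is presumably why swapping them was considered at all; they diverge for `None`, `0`, and empty containers.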

jbrockmendel · 2020-03-18T21:54:37Z

pandas/_libs/internals.pyx


    if n == 0:
        return

-    start = 0
-    cur_blkno = blknos[start]


Doing this here instead of above means we can avoid the assignment in the n == 0 case.
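The ordering can be sketched in plain Python. This is an assumed run-grouping loop written for illustration, not a copy of the function in internals.pyx, and the name `iter_blkno_runs` is hypothetical.

```python
def iter_blkno_runs(blknos):
    """Yield (blkno, slice) pairs for each run of consecutive equal values."""
    n = len(blknos)

    if n == 0:
        return

    # Initialised only after the emptiness check, so the empty case does no
    # extra work. In this pure-Python sketch, blknos[0] would also raise
    # IndexError on an empty input if it ran before the check.
    start = 0
    cur_blkno = blknos[start]

    for i in range(1, n):
        if blknos[i] != cur_blkno:
            yield cur_blkno, slice(start, i)
            start = i
            cur_blkno = blknos[i]

    yield cur_blkno, slice(start, n)


empty_runs = list(iter_blkno_runs([]))
runs = list(iter_blkno_runs([0, 0, 1, 1, 1, 0]))
```

Here `runs` holds one `(value, slice)` pair per run, and the empty input returns before `start` or `cur_blkno` is ever assigned.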

REF: pandas-dev#32801 (comment)

REF: https://github.com/pandas-dev/pandas/pull/32801/files#r394661666

jreback

Looks fine. This is mostly moving things around, but OK.

jorisvandenbossche · 2020-03-19T07:34:59Z

pandas/_libs/internals.pyx

            return s

+        raise TypeError("Not slice-like")


I personally don't find this change an improvement in code flow either.

jreback · 2020-03-26T01:02:31Z

thanks @MomIsBestFriend

CLN/STY: pandas/_libs/internals.pyx

ac115a2

ShaharNaveh commented Mar 18, 2020

jorisvandenbossche requested changes Mar 18, 2020

MomIsBestFriend added 4 commits March 18, 2020 14:54

Splitted cdef to multiple lines

67a592f

REF: pandas-dev#32801 (comment)

Reverted if/else statement

8ea73ff

REF: pandas-dev#32801 (comment)

Reverted if/else change

c824540

REF: pandas-dev#32801 (comment)

Reverted if/else changes

571560e

REF: pandas-dev#32801 (comment)

ShaharNaveh requested a review from jorisvandenbossche March 18, 2020 13:03

ShaharNaveh commented Mar 18, 2020

simonjayhawkins added the Clean and Code Style (code style, linting, code_checks) labels Mar 18, 2020

Revert if/else statement

dbd016e

REF: pandas-dev#32801 (comment)

ShaharNaveh requested a review from jbrockmendel March 18, 2020 20:25

jbrockmendel reviewed Mar 18, 2020

MomIsBestFriend added 4 commits March 19, 2020 00:06

Merge remote-tracking branch 'upstream/master' into OCD-collections-i…

efd2e18

…nternals

Make the comment to be 2 lines

6223f3c

REF: pandas-dev#32801 (comment)

Revert some assignment

ef276a0

REF: https://github.com/pandas-dev/pandas/pull/32801/files#r394661666

Merge remote-tracking branch 'upstream/master' into OCD-collections-i…

b06587e

…nternals

jreback reviewed Mar 19, 2020

jorisvandenbossche reviewed Mar 19, 2020

MomIsBestFriend added 3 commits March 20, 2020 03:01

Merge remote-tracking branch 'upstream/master' into OCD-collections-i…

9359c5e

…nternals

Merge remote-tracking branch 'upstream/master' into OCD-collections-i…

d80b3df

…nternals

Revert if/else statement

6cd03f3

jreback added this to the 1.1 milestone Mar 21, 2020

ShaharNaveh requested a review from jorisvandenbossche March 24, 2020 13:43

Merge remote-tracking branch 'upstream/master' into OCD-collections-i…

3e285c8

…nternals

Merge remote-tracking branch 'upstream/master' into OCD-collections-i…

cc3bd20

…nternals

jorisvandenbossche approved these changes Mar 25, 2020

jreback merged commit 63c631f into pandas-dev:master Mar 26, 2020

ShaharNaveh deleted the OCD-collections-internals branch March 26, 2020 01:11
