BUG make hashtable.unique support readonly arrays #18825

hexgnu · 2017-12-18T19:54:51Z

This problem was brought up in #18773 and effectively comes down to how
Cython deals with readonly arrays. While it would be ideal for Cython to
fix the underlying problem, in the meantime we can rely on this workaround.

closes #18773 (pd.cut fails on readonly arrays)
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry
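
For context, here is a minimal sketch of what "readonly array" means in this thread. It uses only NumPy, and the variable names are illustrative. Once the writeable flag is cleared, the array's buffer advertises itself as read-only, so any consumer that insists on a writable buffer (as Cython typed memoryviews did at the time, per the description above) is refused.

```python
import numpy as np

# Bin edges like the ones used in the tests discussed in this PR.
bins = np.arange(0, 100, 10)
bins.setflags(write=False)  # mark the array as read-only

print(bins.flags.writeable)       # False
print(memoryview(bins).readonly)  # True: the exported buffer is read-only too

# Writing through NumPy is rejected as well.
try:
    bins[0] = -1
except ValueError as exc:
    print("write rejected:", exc)
```

Reading from the array is unaffected; only operations that need write access to the buffer fail.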

gfyoung · 2017-12-18T21:50:41Z

pandas/tests/reshape/test_tile.py

@@ -512,6 +512,16 @@ def f():
        tm.assert_numpy_array_equal(
            mask, np.array([False, True, True, True, True]))

+    def test_cut_read_only(self):


Reference issue number below.

done thanks @gfyoung

codecov · 2017-12-18T22:01:37Z

Codecov Report

Merging #18825 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18825      +/-   ##
==========================================
+ Coverage   91.59%   91.59%   +<.01%     
==========================================
  Files         150      150              
  Lines       48959    48959              
==========================================
+ Hits        44843    44845       +2     
+ Misses       4116     4114       -2

Flag	Coverage Δ
#multiple	`89.96% <ø> (ø)`	⬆️
#single	`41.13% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/util/testing.py	`84.9% <0%> (+0.21%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cdebcf3...ddb5247.

gfyoung · 2017-12-18T22:54:05Z

pandas/tests/reshape/test_tile.py

+
+        mutable = np.arange(0, 100, 10)
+
+        one_to_hundred = np.arange(100)


Not quite...that's really zero to 99 😄

oh duh lol... my bad I'll update it
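
A quick check of the point above, as a small sketch: `np.arange(100)` starts at 0 and stops before 100, so the name `one_to_hundred` was misleading. The second array is only there for contrast and is not part of the PR.

```python
import numpy as np

hundred_elements = np.arange(100)
# First value, last value, and length of the array used in the test.
print(hundred_elements[0], hundred_elements[-1], len(hundred_elements))

# For contrast only: an array that really runs from 1 to 100.
one_to_hundred = np.arange(1, 101)
print(one_to_hundred[0], one_to_hundred[-1], len(one_to_hundred))
```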

jreback

small comments. Ideally we would find a better / more general way of doing this.

jreback · 2017-12-19T11:13:33Z

pandas/tests/reshape/test_tile.py

+        mutable = np.arange(0, 100, 10)
+
+        hundred_elements = np.arange(100)
+        tm.assert_categorical_equal(cut(hundred_elements, readonly),


you can parametrize on writeable here.

I think this is what you're asking for. I added a parametrized test that covers the three combinations: (Readonly, Mutable), (Mutable, Mutable), and (Readonly, Readonly). Let me know if that's not what you were thinking. Thanks @jreback

jreback · 2017-12-19T11:19:02Z

pandas/_libs/hashtable_class_helper.pxi.in

@@ -255,10 +255,56 @@ dtypes = [('Float64', 'float64', 'val != val', True),
          ('UInt64', 'uint64', 'False', False),
          ('Int64', 'int64', 'val == iNaT', False)]

+def get_dispatch(dtypes):


can you add a comment similar to the one in algos_take about using the memory view.

Done. I also noticed that the indenting was a tad off so I re-indented it to be 4 spaces.

pep8speaks · 2017-12-19T17:28:31Z

Hello @hexgnu! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on December 23, 2017 at 20:56 Hours UTC

jreback · 2017-12-20T12:38:47Z

pandas/tests/reshape/test_tile.py

+            tm.assert_categorical_equal(cut(hundred_elements, array_1),
+                                        cut(hundred_elements, array_2))
+
+        for (w_1, w_2) in [[True, True], [True, False], [False, False]]:


can you parametrize this test instead of writing a sub test function

sorry yea just pushed a commit for that. should be parametrized now 😄
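
For readers following the review, a parametrized version along the lines requested might look like the sketch below. It is not necessarily the test that was merged. It assumes a standalone function rather than a method on the pandas test class, and it compares codes and categories directly instead of using `tm.assert_categorical_equal`, so that it only relies on public pandas and pytest APIs.

```python
import numpy as np
import pandas as pd
import pytest


@pytest.mark.parametrize(
    "array_1_writeable, array_2_writeable",
    [(True, True), (True, False), (False, False)],
)
def test_cut_read_only(array_1_writeable, array_2_writeable):
    # Two identical sets of bin edges that differ only in their writeable flag.
    array_1 = np.arange(0, 100, 10)
    array_1.flags.writeable = array_1_writeable

    array_2 = np.arange(0, 100, 10)
    array_2.flags.writeable = array_2_writeable

    hundred_elements = np.arange(100)

    result = pd.cut(hundred_elements, array_1)
    expected = pd.cut(hundred_elements, array_2)

    # Assumption: compare codes and categories directly rather than using
    # the pandas-internal tm.assert_categorical_equal helper.
    assert np.array_equal(result.codes, expected.codes)
    assert result.categories.equals(expected.categories)
```

Each tuple in the parameter list becomes its own test case, so a failure report names the exact writeable combination that broke.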

jreback · 2017-12-21T17:40:48Z

doc/source/whatsnew/v0.22.0.txt

@@ -326,7 +326,7 @@ Reshaping
 - Bug in :func:`DataFrame.stack` which fails trying to sort mixed type levels under Python 3 (:issue:`18310`)
 - Fixed construction of a :class:`Series` from a ``dict`` containing ``NaN`` as key (:issue:`18480`)
 - Bug in :func:`Series.rank` where ``Series`` containing ``NaT`` modifies the ``Series`` inplace (:issue:`18521`)


rebase on master. can you move to 0.23 (docs were renamed), prob easiest to just check out this file from master and paste in the new one

Done. I also squashed some of my commits to clean up the commit history for you.

This problem was brought up in pandas-dev#18773 and effectively comes down to how Cython deals with readonly arrays. While it would be ideal for Cython to fix the underlying problem in the meantime we can rely on this.

fix: updates one_to_hundred for hundred_elements
This is because arange(100) isn't actually 1 to 100... it's 0 to 99

docs: adds comment to fix using ndarray and fixes indenting

test: parametrize test for test_readonly_cut

doc: add new whatsnew entry for v0.23.0

fix: checkout existing upstream v0.22.0

jreback · 2017-12-23T20:56:39Z

I pushed a fix. ping on green.

hexgnu · 2017-12-26T16:37:48Z

Awesome thanks @jreback looks like it all passed.

jreback · 2017-12-27T20:27:13Z

thanks @hexgnu
keep em coming!

gfyoung added the Compat and Reshaping labels Dec 18, 2017

gfyoung reviewed Dec 18, 2017

hexgnu force-pushed the fixes_unique_readonly branch from 21c65ae to 668d057 Compare December 18, 2017 22:01

gfyoung reviewed Dec 18, 2017

jreback requested changes Dec 19, 2017

hexgnu force-pushed the fixes_unique_readonly branch from c888981 to 4a9d32e Compare December 19, 2017 17:33

jreback requested changes Dec 20, 2017

jreback added this to the 0.22.0 milestone Dec 20, 2017

hexgnu force-pushed the fixes_unique_readonly branch from 4a9d32e to a5d5b82 Compare December 20, 2017 16:43

jreback requested changes Dec 21, 2017

hexgnu force-pushed the fixes_unique_readonly branch from a5d5b82 to 4f1e2aa Compare December 22, 2017 02:00

jreback added 2 commits December 23, 2017 15:53

Merge branch 'master' into PR_TOOL_MERGE_PR_18825

172b56b

lint

ddb5247

jreback approved these changes Dec 23, 2017

jreback merged commit 80a5399 into pandas-dev:master Dec 27, 2017

hexgnu added a commit to hexgnu/pandas that referenced this pull request Dec 28, 2017

BUG make hashtable.unique support readonly arrays (pandas-dev#18825)

408eecd

jschendel mentioned this pull request Jan 11, 2018

.unique fails with read-only input. #19195

Closed

chris-b1 mentioned this pull request Jun 30, 2018

Accept constant memoryviews in HashTable.lookup #21688

Merged

xhochy added a commit to xhochy/pandas that referenced this pull request Jul 7, 2018

Apply modern fix for pandas-dev#18825

36626d8

xhochy added a commit to xhochy/pandas that referenced this pull request Jul 7, 2018

Apply modern fix for pandas-dev#18825

6c335b0
