Skip to content

Bug: Made it so that 0 was included in uint8 #14412

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

verhalenn
Copy link

@verhalenn verhalenn commented Oct 13, 2016

Decided to restart. Sorry for the inconvenience. -> #14472

@verhalenn verhalenn changed the title Issue14401 Bug: Made it so that 0 was included in uint8 Oct 13, 2016
@sinhrks
Copy link
Member

sinhrks commented Oct 13, 2016

Thx for the PR. pls rebase existing PR from next.

@@ -45,3 +45,4 @@ Bug Fixes

- Bug in ``pd.concat`` where names of the ``keys`` were not propagated to the resulting ``MultiIndex`` (:issue:`14252`)
- Bug in ``MultiIndex.set_levels`` where illegal level values were still set after raising an error (:issue:`13754`)
- Bug in ``pd.to_numeric`` where it would not downcast a 0 to a uint8 (:issue:`14404`)
Copy link
Member

@sinhrks sinhrks Oct 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls include #14401. Also, the issue is common for unsigned, not only for uint8.

@sinhrks sinhrks added Bug Dtype Conversions Unexpected or buggy dtype conversions labels Oct 13, 2016
@codecov-io
Copy link

codecov-io commented Oct 13, 2016

Current coverage is 85.26% (diff: 100%)

Merging #14412 into master will increase coverage by <.01%

@@             master     #14412   diff @@
==========================================
  Files           140        140          
  Lines         50631      50634     +3   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43167      43173     +6   
+ Misses         7464       7461     -3   
  Partials          0          0          

Powered by Codecov. Last update 7cad3f1...2b2622c

@jorisvandenbossche jorisvandenbossche added this to the 0.19.1 milestone Oct 13, 2016
@@ -391,6 +391,14 @@ def test_downcast(self):
res = pd.to_numeric(data, downcast=downcast)
tm.assert_numpy_array_equal(res, expected)

#check that 0 works as a unsigned downcast

data = [0, 1, 2, 3]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check this for all of the downcast parameters (where it should be included)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want me to just edit everything between line 327 and line 368 in that test to include 0's in the arrays? Trying it before and after, it fails before and passes after. That being said if you don't want me to touch it for some reason it's relatively easy to make a new one. Also is there a reason that 340 and 341 is redundant?

@verhalenn
Copy link
Author

Added a series of tests which applied a pandas series that included the minimum and maximum values of each dtype to a to_numeric downcast to see if it would downcast to the appropriate dtype. It did not add to uint64 but I'm not sure if there were plans to include uint64 so I'll open up a new issue. Also edited the whatsnew.

@@ -45,3 +45,4 @@ Bug Fixes

- Bug in ``pd.concat`` where names of the ``keys`` were not propagated to the resulting ``MultiIndex`` (:issue:`14252`)
- Bug in ``MultiIndex.set_levels`` where illegal level values were still set after raising an error (:issue:`13754`)
- Bug in ``pd.to_numeric`` where it would not downcast a 0 properly. (:issue:`14401`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say downcast='unsigned' as a kwargs is passed

@@ -401,6 +401,32 @@ def test_downcast(self):
res = pd.to_numeric(data, downcast=downcast)
tm.assert_numpy_array_equal(res, expected)

# check that the smallest and largest values in each integer type pass to each type.
int8 = [-128, 127]
Copy link
Contributor

@jreback jreback Oct 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't do this explicty, use np.iinfo(dtype).min/max as these vary by platform

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually on 2nd though, might be ok if appveyor passes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you do this in a loop instead (eg. a loop of tuples of the start,stop, dtype)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually thinking dictionary if that's okay. And I believe that the numpy types are standard across platforms but I'm not positive which is why I didn't bother with the np.info(dtype).min/max but for safeties sake I'll include it.

@@ -48,3 +48,6 @@ Bug Fixes
- Bug in ``MultiIndex.set_levels`` where illegal level values were still set after raising an error (:issue:`13754`)
- Bug in ``DataFrame.to_json`` where ``lines=True`` and a value contained a ``}`` character (:issue:`14391`)
- Bug in ``df.groupby`` causing an ``AttributeError`` when grouping a single index frame by a column and the index level (:issue`14327`)
- Bug in ``pd.to_numeric`` where it would not downcast a 0 to a uint8 (:issue:`14404`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just make a single entry

@@ -401,6 +401,62 @@ def test_downcast(self):
res = pd.to_numeric(data, downcast=downcast)
tm.assert_numpy_array_equal(res, expected)

# check that the smallest and largest values in each integer type pass to each type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still overly verbose. do something like.

checks = [('int8', 'integer', [np.iinfo(np.int8).min, np.iinfo(np.int8).max]), 
                  .....
               ]

for dtype, downcast, min_max in checks:
     ....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: pd.to_numeric(..., downcast='unsigned') should accept 0
5 participants