BUG: pd.cut with bins=1 and input all 0s #15428

Closed
luca-s opened this issue Feb 16, 2017 · 1 comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Bug
Milestone

Comments

@luca-s
Contributor

luca-s commented Feb 16, 2017

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> pd.__version__
u'0.19.2'

>>> s = pd.Series([0,0,0])
>>> pd.cut(s, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/luca/.local/lib/python2.7/site-packages/pandas/tools/tile.py", line 119, in cut
    include_lowest=include_lowest)
  File "/home/luca/.local/lib/python2.7/site-packages/pandas/tools/tile.py", line 194, in _bins_to_cuts
    raise ValueError('Bin edges must be unique: %s' % repr(bins))
ValueError: Bin edges must be unique: array([ 0.,  0.])

>>> s = pd.Series([-1,-1,-1])
>>> pd.cut(s, 1)
0    (-1.001, -0.999]
1    (-1.001, -0.999]
2    (-1.001, -0.999]

>>> s = pd.Series([1,1,1])
>>> pd.cut(s, 1)
0    (0.999, 1.001]
1    (0.999, 1.001]
2    (0.999, 1.001]

Problem description

With bins=1, an input containing only 0s raises a ValueError, while other constant inputs (all -1s or all 1s, as shown above) are binned without error. I had a look at the code and the fix looks straightforward. I can provide a PR if requested.
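
To illustrate why 0 might be special here, the sketch below assumes the single bin is built by padding the constant value by a relative 0.1%, which matches the (0.999, 1.001] and (-1.001, -0.999] outputs above. The helper name `padded_edges` and the absolute fallback for 0 are assumptions made for illustration; this is not the code in `pandas/tools/tile.py`.

```python
def padded_edges(values, rel=0.001):
    """Hypothetical helper: (low, high) edges of one bin around the data."""
    mn, mx = float(min(values)), float(max(values))
    if mn == mx:
        # Relative padding scales with |value|, so it is exactly 0 when the
        # value is 0 and both edges collapse to the same number.
        pad = rel * abs(mn)
        if pad == 0:
            # Assumed fallback: use an absolute pad when the value is 0.
            pad = rel
        mn, mx = mn - pad, mx + pad
    return mn, mx


zero_edges = padded_edges([0, 0, 0])
one_edges = padded_edges([1, 1, 1])
neg_edges = padded_edges([-1, -1, -1])
```

Without the fallback branch, `padded_edges([0, 0, 0])` would return two identical edges, which is the situation the "Bin edges must be unique" check rejects.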

Expected Output

>>> s = pd.Series([0,0,0])
>>> pd.cut(s, 1)
0    (-0.001, 0.001]
1    (-0.001, 0.001]
2    (-0.001, 0.001]
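
Until a fix is available, one possible workaround is to pass explicit bin edges instead of a bin count, so that `pd.cut` does not have to derive the edges from the data. This is a minimal sketch; the edge values are simply taken from the expected output above.

```python
import pandas as pd

s = pd.Series([0, 0, 0])

# Explicit edges: a single right-closed bin (-0.001, 0.001] that contains 0.
binned = pd.cut(s, bins=[-0.001, 0.001])
```

Every element should then fall into the same single interval rather than raising a ValueError.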

Output of pd.show_versions()

>>> pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 8.1.1
setuptools: 32.3.1
Cython: 0.23.4
numpy: 1.12.0
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: None
pandas_datareader: 0.2.1

TomAugspurger added the Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff) and Bug labels Feb 16, 2017
TomAugspurger added this to the 0.20.0 milestone Feb 16, 2017
@TomAugspurger
Contributor

Sure, a fix would be great.

@jreback jreback closed this as completed in d32acaa Mar 8, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
The special case of running pd.cut() with bins=1 and an input containing
all 0s raises a ValueError

closes pandas-dev#15428
closes pandas-dev#15431

Author: Luca Scarabello <[email protected]>
Author: Luca <[email protected]>

Closes pandas-dev#15437 from luca-s/issue_15428 and squashes the following commits:

1248987 [Luca] rebased on master
def84ba [Luca] Yet another implementation attempt
692503a [Luca Scarabello] Improved solution: using same approach as pd.cut
b7d92dc [Luca] Added 'allow' duplicates option to _bins_to_cuts
f56a27f [Luca Scarabello] Issue pandas-dev#15431
55806cf [Luca Scarabello] BUG: pd.cut with bins=1 and input all 0s

3 participants