BUG: pd.qcut with q=1 and input with identical values #15431

Closed
luca-s opened this issue Feb 16, 2017 · 2 comments
Labels
Bug, Duplicate Report, Reshaping
Comments

@luca-s
Contributor

luca-s commented Feb 16, 2017

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> pd.__version__
u'0.19.2'

>>> s = pd.Series([1,1,1])
>>> pd.qcut(s, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/luca/.local/lib/python2.7/site-packages/pandas/tools/tile.py", line 175, in qcut
    precision=precision, include_lowest=True)
  File "/home/luca/.local/lib/python2.7/site-packages/pandas/tools/tile.py", line 194, in _bins_to_cuts
    raise ValueError('Bin edges must be unique: %s' % repr(bins))
ValueError: Bin edges must be unique: array([1, 1])

>>> s = pd.Series([1,1,2])
>>> pd.qcut(s, 1)
0    [1, 2]
1    [1, 2]
2    [1, 2]

>>> s = pd.Series([0,1,2])
>>> pd.qcut(s, 1)
0    [0, 2]
1    [0, 2]
2    [0, 2]

Problem description

If the input contains only identical values, pd.qcut fails even with q=1, because both bin edges collapse to the same value. I had a look at the code and I can try to relax the uniqueness check so that this specific case passes without raising, but then I have to make sure the rest of the code can handle a single edge.
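
To make the failure concrete, here is a minimal pure-Python sketch of how evenly spaced quantile edges collapse when every value is the same. The helpers `quantile_edges` and `has_unique_edges` are hypothetical names written for this illustration; they are not pandas functions, and the linear interpolation is only assumed to approximate what qcut does internally.

```python
def quantile_edges(values, q):
    """Return q + 1 edges at evenly spaced quantiles (linear interpolation).

    Hypothetical helper for illustration only, not part of pandas.
    """
    data = sorted(values)
    n = len(data)
    edges = []
    for i in range(q + 1):
        pos = (n - 1) * i / q          # fractional index of the i-th quantile
        lo = int(pos)                  # index just below the position
        hi = min(lo + 1, n - 1)        # index just above, clamped to the end
        frac = pos - lo
        edges.append(data[lo] + (data[hi] - data[lo]) * frac)
    return edges


def has_unique_edges(edges):
    """True when no two edges coincide (the condition the traceback checks)."""
    return len(set(edges)) == len(edges)


identical = quantile_edges([1, 1, 1], 1)   # both edges land on the same value
mixed = quantile_edges([1, 1, 2], 1)       # edges are the min and the max

print(identical, has_unique_edges(identical))
print(mixed, has_unique_edges(mixed))
```

With q=1 the two edges are simply the minimum and the maximum, so any input whose values are all equal yields a duplicated edge, which matches the `array([1, 1])` shown in the traceback.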

Expected Output

>>> s = pd.Series([1,1,1])
>>> pd.qcut(s, 1)
0    [0, 2]
1    [0, 2]
2    [0, 2]

Output of pd.show_versions()

>>> pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 8.1.1
setuptools: 32.3.1
Cython: 0.23.4
numpy: 1.12.0
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: None
pandas_datareader: 0.2.1

@luca-s luca-s changed the title from "BUG: pd.qcut with q=1 and input with unique values" to "BUG: pd.qcut with q=1 and input with identical values" Feb 16, 2017
@jreback
Contributor

jreback commented Feb 17, 2017

This is the same as #15428, so ideally let's solve it in the same PR; just list it as a separate case there.

@jreback jreback closed this as completed Feb 17, 2017
@jreback jreback added the Bug, Duplicate Report, and Reshaping labels Feb 17, 2017
@jreback jreback added this to the 0.20.0 milestone Feb 17, 2017
@luca-s
Contributor Author

luca-s commented Feb 17, 2017

I am not sure I follow your reasoning. From my point of view it's a different issue (a totally different code bug), and they are also two different functions (qcut vs. cut).
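
For readers comparing the two reports, the following simplified sketch shows how equal-width binning (the idea behind cut) can hit a similar symptom when the data range is zero, even though the edges are computed differently from quantile binning. `equal_width_edges` is a hypothetical helper for this illustration and is not how pandas implements cut.

```python
def equal_width_edges(values, bins):
    """Return bins + 1 evenly spaced edges between min and max.

    Hypothetical helper for illustration only, not part of pandas.
    """
    mn, mx = min(values), max(values)
    width = mx - mn                     # zero when all values are equal
    return [mn + width * i / bins for i in range(bins + 1)]


all_zeros = equal_width_edges([0, 0, 0], 1)   # range is zero, edges coincide
spread = equal_width_edges([0, 1, 2], 1)      # edges span the data range

print(all_zeros)
print(spread)
```

In both reports the visible symptom is a pair of coinciding bin edges; whether the underlying causes are the same in pandas itself is exactly what the two commenters disagree about.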

luca-s added a commit to luca-s/pandas that referenced this issue Feb 17, 2017
BUG: pd.qcut with q=1 and input with identical values
luca-s added a commit to luca-s/pandas that referenced this issue Mar 2, 2017
BUG: pd.qcut with q=1 and input with identical values
luca-s added a commit to luca-s/pandas that referenced this issue Mar 7, 2017
BUG: pd.qcut with q=1 and input with identical values
jreback pushed a commit that referenced this issue Mar 8, 2017
The special case of running pd.cut() with bins=1 on an input containing
all 0s raises a ValueError

closes #15428
closes #15431

Author: Luca Scarabello <[email protected]>
Author: Luca <[email protected]>

Closes #15437 from luca-s/issue_15428 and squashes the following commits:

1248987 [Luca] rebased on master
def84ba [Luca] Yet another implementation attempt
692503a [Luca Scarabello] Improved solution: using same approach as pd.cut
b7d92dc [Luca] Added 'allow' duplicates option to _bins_to_cuts
f56a27f [Luca Scarabello] Issue #15431
55806cf [Luca Scarabello] BUG: pd.cut with bins=1 and input all 0s
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
The special case of running pd.cut() with bins=1 on an input containing
all 0s raises a ValueError

closes pandas-dev#15428
closes pandas-dev#15431

Author: Luca Scarabello <[email protected]>
Author: Luca <[email protected]>

Closes pandas-dev#15437 from luca-s/issue_15428 and squashes the following commits:

1248987 [Luca] rebased on master
def84ba [Luca] Yet another implementation attempt
692503a [Luca Scarabello] Improved solution: using same approach as pd.cut
b7d92dc [Luca] Added 'allow' duplicates option to _bins_to_cuts
f56a27f [Luca Scarabello] Issue pandas-dev#15431
55806cf [Luca Scarabello] BUG: pd.cut with bins=1 and input all 0s