nunique + TimeGrouper error #12352

Closed
wavexx opened this issue Feb 16, 2016 · 5 comments
Labels
Bug Resample resample method

wavexx commented Feb 16, 2016

This used to work in the past:

import pandas as pd

tmp = pd.DataFrame({
        'ID': {pd.Timestamp('2015-06-05 00:00:00'): '0010100903', pd.Timestamp('2015-06-08 00:00:00'): '0010150847'},
        'DATE': {pd.Timestamp('2015-06-05 00:00:00'): '2015-06-05', pd.Timestamp('2015-06-08 00:00:00'): '2015-06-08'}})
tmp.groupby(pd.TimeGrouper('D')).ID.nunique()

but now I get the obscure:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    tmp.groupby(pd.TimeGrouper('D')).ID.nunique()
  File "/usr/lib/python3/dist-packages/pandas/core/groupby.py", line 2697, in nunique
    name=self.name)
  File "/usr/lib/python3/dist-packages/pandas/core/series.py", line 227, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 3736, in __init__
    ndim=1, fastpath=True)
  File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 2454, in make_block
    placement=placement)
  File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 87, in __init__
    len(self.values), len(self.mgr_locs)))
ValueError: Wrong number of items passed 2, placement implies 4
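
One plausible reading of the numbers in the error, not confirmed anywhere in this thread: the two rows span four calendar days, so daily grouping implies four bins, while only two of them contain data. The sketch below uses only the standard library to count both; the variable names are made up for illustration and it does not reproduce pandas internals.

```python
from collections import defaultdict
from datetime import date, timedelta

# The two rows from the report: (index date, ID value).
rows = [
    (date(2015, 6, 5), "0010100903"),
    (date(2015, 6, 8), "0010150847"),
]

# Daily bins cover every calendar day from the first to the last timestamp.
first = min(day for day, _ in rows)
last = max(day for day, _ in rows)
daily_bins = [first + timedelta(days=n) for n in range((last - first).days + 1)]

# Distinct IDs seen in each bin; bins without rows stay empty.
ids_per_bin = defaultdict(set)
for day, id_value in rows:
    ids_per_bin[day].add(id_value)

non_empty_bins = [day for day in daily_bins if ids_per_bin[day]]
unique_counts = {day: len(ids_per_bin[day]) for day in daily_bins}

print(len(non_empty_bins), "non-empty bins out of", len(daily_bins))
```

Under that reading, a correct per-day result has one entry per bin, with a count of zero for the days that have no rows.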

wavexx commented Feb 16, 2016

The logically equivalent:

tmp.groupby(pd.TimeGrouper('D')).ID.apply(lambda x: x.nunique())

works as intended.
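
For anyone trying the workaround on a newer pandas, here is a minimal self-contained sketch. It assumes `pd.Grouper(freq='D')` as a stand-in for `pd.TimeGrouper('D')`, since later pandas releases dropped `TimeGrouper`; the frame is rebuilt with an explicit index rather than nested dicts, which should be equivalent for this purpose.

```python
import pandas as pd

# Same two rows as in the report, indexed by timestamp.
index = pd.to_datetime(["2015-06-05", "2015-06-08"])
tmp = pd.DataFrame(
    {"ID": ["0010100903", "0010150847"], "DATE": ["2015-06-05", "2015-06-08"]},
    index=index,
)

# Assumption: pd.Grouper(freq=...) is used here in place of pd.TimeGrouper.
grouped = tmp.groupby(pd.Grouper(freq="D")).ID

# The workaround from the comment above: call nunique() per group via apply.
via_apply = grouped.apply(lambda x: x.nunique())

first_day = pd.Timestamp("2015-06-05")
last_day = pd.Timestamp("2015-06-08")
print(via_apply.loc[[first_day, last_day]])
```

Only the two days that actually have rows are inspected here, so the sketch does not depend on how a given pandas version reports the empty days in between.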

jreback added Bug Resample resample method labels Feb 16, 2016
jreback added this to the 0.18.0 milestone Feb 16, 2016

jreback commented Feb 16, 2016

hmm, does look buggy.


jreback commented Feb 16, 2016

This last worked in 0.16.2 and fails in 0.17.0 (and still fails in 0.18.0).


wavexx commented Feb 16, 2016

Thanks for tracking down the exact breaking point. I'm revisiting some code that I wrote for Python 2.7 with pandas 0.16.*, and I'm now porting it to Python 3 and pandas 0.17.1 (the version currently in Debian unstable).


jreback commented Feb 16, 2016

yeah there were some fixes related to this, but this one didn't take.

jreback added a commit to jreback/pandas that referenced this issue Feb 17, 2016
jreback modified the milestones: 0.18.0, 0.18.1 Feb 17, 2016