Bug when combining .groupby() apply with .expanding() apply #12829
Comments
this is an extremely weird thing to do (and completely non-performant), keeping tuples in columns. do something like this:
Sorry, there shouldn't have been any tuples in the columns. I've changed it all to strings. The problem is with the attempt to use a window method to count the occurrences of these strings. The code snippet I posted should be returning counts, not strings.
Thanks, but I'm trying to count only the rows that include a string in the list.
then just pre-filter first.
It seems so easy once you say it.
just
Ah! All is clear. thanks!
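For readers following the thread, here is a minimal sketch of what the pre-filter suggestion could look like. It is not the snippet posted in the comments (those did not survive in this copy). The `group` column name and all row values are assumptions made for illustration; only `category`, `count`, and the list of strings come from the issue.

```python
import pandas as pd

# Hypothetical data: the 'group' column name and the row values are assumptions.
# 'category', 'count' and the target strings are taken from the issue text.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "category": ["tito", "foo", "bar", "feep", "baz", "tito"],
})
targets = ["tito", "bar", "feep"]

# Flag rows whose category is one of the target strings.
is_target = df["category"].isin(targets)

# Variant 1: keep every row and take a running total of matches per group.
df["count"] = is_target.astype(int).groupby(df["group"]).cumsum()

# Variant 2: pre-filter to matching rows, then number them within each group.
filtered = df[is_target].copy()
filtered["count"] = filtered.groupby("group").cumcount() + 1

print(df)
print(filtered)
```

Both variants avoid calling `apply` inside an expanding window: the first keeps non-matching rows (their count simply does not increase), while the second drops them before counting.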
In this example, the aim is to use an expanding window to create an expanding count, by group, of the occurrences of a predetermined set of strings. It seemed like there might be some sort of bug in the behavior of expanding when combined with groupby and apply.
In this case the strings are ['tito', 'bar', 'feep'].
So this would become:
However, when I run the following code, it's just the category column that gets returned as count. The same thing happens when I use window in place of expanding.
INSTALLED VERSIONS
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-76-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.0
nose: 1.3.1
pip: 8.0.2
setuptools: 19.1.1
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.16.1
statsmodels: 0.6.1
xarray: None
IPython: 4.0.2
sphinx: 1.2.2
patsy: 0.4.1
dateutil: 2.5.2
pytz: 2016.3
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None