groupby.agg changes the default parameter of np.std #13344
Comments
this is as expected (and has been this way for as long as I can remember). numpy functions are translated to pandas ones. If you really want to use numpy then.
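The snippet that originally followed this comment is missing here. As a minimal sketch of one way to bypass the translation, assuming that wrapping the numpy call in a lambda is acceptable (the data, the column names `key` and `val`, and the variable names are made up for illustration):

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1.0, 3.0, 2.0, 6.0]})
grouped = df.groupby("key")["val"]

# A lambda is an opaque callable to pandas, so it is not swapped for the
# pandas method; numpy's own default (ddof=0) applies to each group.
numpy_std = grouped.agg(lambda s: np.std(s.to_numpy()))

# The pandas groupby method with its default (ddof=1), for comparison.
pandas_std = grouped.std()

print(numpy_std)
print(pandas_std)
```

The lambda version is typically slower than the built-in groupby method, since it runs Python code once per group.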
you can also explicitly pass in
I thought about that possibility but seeing that it is not translated in
While some approaches use FWIW Came here by way of StackOverflow post: https://stackoverflow.com/a/47699378/7829525
I'm posting here in case someone else runs into the same issue. I want both mean and standard deviation, but I can't pass in ddof to the
That throws So instead I tried to do
But this actually does not work correctly because it's trying to transform each cell instead of doing an actual aggregation. So this then turns into
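The snippets from this comment are missing here. For the goal it describes (mean and standard deviation in one aggregation, with a chosen ddof), one possible sketch uses named aggregation with a small helper. This assumes pandas 0.25 or later; the data, the column names, and the helper `population_std` are hypothetical and not taken from the original comment:

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1.0, 3.0, 2.0, 6.0]})


def population_std(s):
    """Hypothetical helper: standard deviation of one group with ddof=0."""
    return s.std(ddof=0)


# Named aggregation: each keyword becomes an output column, built from
# a (source column, aggregation) pair.
summary = df.groupby("key").agg(
    val_mean=("val", "mean"),
    val_std=("val", population_std),
)

print(summary)
```

Because the helper receives each group as a Series and returns a single number, the result has one row per group rather than one value per cell.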
Code Sample, a copy-pastable example if possible
As opposed to pandas, the default value for ddof in standard deviation calculations is 0 in numpy. However, when I use groupby.agg(np.std) I get the result for ddof=1.
Expected Output
There is a note in the docs that states the default axis is changed, but for me the ddof change is quite unexpected. So my expected output is this:
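The original code sample and expected output did not survive here. Purely as an illustration of the difference in defaults described above, and not a reconstruction of the author's example, a small sketch with made-up values:

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only.
values = [1.0, 3.0, 2.0, 6.0]
series = pd.Series(values)

np_default = np.std(values)        # numpy default: ddof=0 (divide by N)
pd_default = series.std()          # pandas default: ddof=1 (divide by N - 1)
pd_population = series.std(ddof=0) # pandas, asked explicitly for ddof=0

# Same idea per group: passing ddof=0 to the groupby method gives the
# population standard deviation of each group.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": values})
group_population = df.groupby("key")["val"].std(ddof=0)

print(np_default, pd_default, pd_population)
print(group_population)
```

The two defaults differ only in the divisor, so the gap between them shrinks as the group size grows.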
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_GB
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None