Skip to content

BUG: Series.clip should not take axis argument #47100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks done
mvashishtha opened this issue May 23, 2022 · 7 comments · Fixed by #47109
Closed
3 tasks done

BUG: Series.clip should not take axis argument #47100

mvashishtha opened this issue May 23, 2022 · 7 comments · Fixed by #47109

Comments

@mvashishtha
Copy link

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
pd.Series([1]).clip(3, axis=1)

Issue Description

Series.clip takes an axis argument documented here, but there's only one axis for a Series and passing axis=1 always gives ValueError: No axis named 1 for object type Series.

Expected Behavior

axis should be an invalid argument to Series.clip.

Installed Versions

INSTALLED VERSIONS

commit : b4e578d
python : 3.9.12.final.0
python-bits : 64
OS : Darwin
OS-release : 21.3.0
Version : Darwin Kernel Version 21.3.0: Wed Jan 5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.5.0.dev0+122.gb4e578db85.dirty
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.4
setuptools : 59.8.0
Cython : 0.29.27
pytest : 7.0.0
hypothesis : None
sphinx : 4.4.0
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.8.0
html5lib : None
pymysql : None
psycopg2 : 2.9.3
jinja2 : 3.0.3
IPython : 8.0.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : 2022.3.0
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : 2.8.1
odfpy : None
openpyxl : 3.0.9
pandas_gbq : 0.16.0
pyarrow : 6.0.1
pyreadstat : None
pyxlsb : None
s3fs : 2022.3.0
scipy : 1.7.3
sqlalchemy : 1.4.31
tables : 3.7.0
tabulate : None
xarray : 0.20.2
xlrd : 2.0.1
xlwt : None
zstandard : None

@mvashishtha mvashishtha added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 23, 2022
@simonjayhawkins
Copy link
Member

Thanks @mvashishtha for the report.

the axis argument is for compatibility with DataFrame.clip and the docs for Series.clip probably need to clarify this.

axis : int or str axis name, optional
Align object with lower and upper along the given axis.

other Series methods that take axis arguments for compatibility such as add, sub, mul, div, mod etc do not document the axis argument in the parameter list (which is probably also not the right thing to do)

whereas Series.aggregate documents the axis parameter as

axis : {0 or ‘index’}
Parameter needed for compatibility with DataFrame.

which I think is clearer.

raising ValueError: No axis named 1 for object type Series is the correct behavior here and

pd.Series([1]).clip(3, axis=0)

gives the correct answer.

PR welcome to fix Series.clip documentation. or PRs welcome to make the documentation more consistent for all Series methods that accept an axis parameter.

@simonjayhawkins simonjayhawkins added Docs good first issue and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 24, 2022
@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone May 24, 2022
@mvashishtha
Copy link
Author

Thank you @simonjayhawkins for the response.

Parameter needed for compatibility with DataFrame.

This seems like an undesirable feature of the Series API. Isn't allowing the axis parameters in all these methods just an internal convenience for pandas code? Why do all these Series methods need to be compatible with the corresponding DataFrame ones?

@simonjayhawkins
Copy link
Member

Why do all these Series methods need to be compatible with the corresponding DataFrame ones?

good question. I don't know the history here.

Isn't allowing the axis parameters in all these methods just an internal convenience for pandas code?

It maybe that since Series and DataFrame are subclasses of an internal NDFrame class that some of these methods either shared the same code in the past or still currently share the same code.

@jreback
Copy link
Contributor

jreback commented May 24, 2022

one of tenants of implementation is to have the exact same implementation for a series or a kernel applied to multiple arrays for a frame

there are at times performance shortcuts taken that make this not possible but generally it's true

we have the same api on these structures to provide an obvious user experience - sure we could remove the axis parameter from series but it actually makes the signature the same as for frames

so this is a deliberate UX choice as well an implementation saver

all good properties in an open source project

@jackgoldsmith4
Copy link
Contributor

I can work on a PR to make the docs consistent. @simonjayhawkins I am going to add a mention of the axis parameter in the docstrings for all Series methods

@jackgoldsmith4
Copy link
Contributor

take

@jackgoldsmith4
Copy link
Contributor

Going through the docs for different Series methods revealed that the defaults for the axis parameter seem to arbitrarily switch between 0 and None. Does this distinction matter, or should they all be consistent?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants