DOC: .loc behavior undocumented for Index argument #36850

Closed
hickmanw opened this issue Oct 4, 2020 · 2 comments · Fixed by #38109
Labels: Docs, Index (related to the Index class or subclasses)

hickmanw commented Oct 4, 2020

Location of the documentation

pandas.Series.loc

pandas.DataFrame.loc

Documentation problem

The documentation does not mention pandas.Index as a valid argument for .loc. I expected the result to be the same as passing the values of the index, but the result is instead the same as reindex (as long as the result has at least one element, otherwise it raises KeyError). I discovered the difference when relying on the index name to remain unchanged after .loc. Below is a minimal demonstration of the current behavior for pandas.Series. Analogous behavior occurs for both axes of pandas.DataFrame.

In  [1]: import pandas as pd

In  [2]: foo = pd.Index(range(2), name='foo')
         bar = pd.Index(range(1), name='bar')
         baz = pd.Index([2], name='baz')

         s = pd.Series(list('ab'), index=foo)
         s
Out [2]: 
         foo
         0    a
         1    b
         dtype: object

In  [3]: s.loc[bar]
Out [3]: 
         bar
         0    a
         dtype: object

In  [4]: s.loc[bar].index is bar
Out [4]: True

In  [5]: s.reindex(bar).index is bar
Out [5]: True

In  [6]: s.loc[baz]
Out [6]: KeyError: "None of [Int64Index([2], dtype='int64', name='baz')] are in the [index]"
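
For contrast, here is a minimal sketch of the difference described above. It reuses the `foo`, `bar`, and `baz` indexes from the session; the other names (`by_index`, `by_values`, `cols`, and so on) are made up for illustration. It assumes a pandas version where `.loc` raises `KeyError` for missing labels (1.0 or later). Passing the values of the index rather than the `Index` itself appears to be one way to keep the original index name.

```python
import pandas as pd

foo = pd.Index(range(2), name="foo")
bar = pd.Index(range(1), name="bar")
baz = pd.Index([2], name="baz")
s = pd.Series(list("ab"), index=foo)

# Index key: the result's index takes the name of the key.
by_index = s.loc[bar]
print(by_index.index.name)   # bar

# Plain values (no name attached): the original index name is kept.
by_values = s.loc[bar.to_numpy()]
print(by_values.index.name)  # foo

# Unlike reindex, .loc does not fill missing labels with NaN; it raises.
reindexed = s.reindex(baz)
try:
    s.loc[baz]
    loc_raised = False
except KeyError:
    loc_raised = True

# The same renaming shows up on both axes of a DataFrame.
df = pd.DataFrame({0: list("ab"), 1: list("cd")}, index=foo)
cols = pd.Index([1], name="cols")  # hypothetical column selector
rows_name = df.loc[bar].index.name
cols_name = df.loc[:, cols].columns.name
```

If the original name matters downstream, calling `rename_axis` on the result is another option.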

Output of pd.show_versions

INSTALLED VERSIONS


commit : 2a7d332
python : 3.6.7.final.0
python-bits : 64
OS : Linux
OS-release : 4.14.193-149.317.amzn2.x86_64
Version : #1 SMP Thu Sep 3 19:04:44 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.2
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 50.3.0
Cython : None
pytest : 5.4.3
hypothesis : None
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.1
numexpr : None
odfpy : None
openpyxl : 3.0.5
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : 1.2.12
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
numba : 0.51.0

Suggested fix for documentation

Add an entry to explain the current/intended behavior when passing an Index to .loc, especially mentioning that the index will be replaced. Doing so would complement the clarity recently provided by #35506 for alignable boolean series.

@hickmanw added the Docs and Needs Triage labels on Oct 4, 2020
@rhshadrach added the Index label and removed the Needs Triage label on Oct 19, 2020
rhshadrach (Member) commented

Thanks for raising this. PRs to improve the docs are most welcome.

hongshaoyang (Contributor) commented

take

@jreback added this to the 1.2 milestone on Nov 29, 2020