
BUG: pandas.DataFrame.index.map() works differently if debugpy debugger is attached #43940


Closed
2 of 3 tasks
ember91 opened this issue Oct 9, 2021 · 4 comments
Labels
Bug, Index (related to the Index class or subclasses), Needs Info (clarification about behavior needed to assess issue)

Comments

@ember91
Contributor

ember91 commented Oct 9, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd


def map_func(dt):
    if dt.month <= 2:
        return dt.year + dt.month - 1
    return dt.year + dt.month


data = {
    "Name": ["Tom", "Joseph"],
    "Date": ["2021-10-01", "2021-11-01"],
}
df = pd.DataFrame(data)
df = df.set_index("Date")
df.index = pd.to_datetime(df.index)
print(f"Result: {df.index.map(map_func)}")

Issue Description

Running the attached program under the debugpy debugger produces a different result than running it plainly.

Running it plainly with /usr/bin/env /usr/bin/python3.9 test.py produces the expected output:
Result: Int64Index([2031, 2032], dtype='int64', name='Date')

Running it in debugpy with:
cd $USER/test_error ; /usr/bin/env /usr/bin/python3.9 $USER/.vscode/extensions/ms-python.python-2021.10.1317843341/pythonFiles/lib/python/debugpy/launcher 37319 -- $USER/test_error/test.py
crashes on line 5 (if dt.month <= 2:). For some reason dt is the full index:
DatetimeIndex(['2021-10-01', '2021-11-01'], dtype='datetime64[ns]', name='Date', freq=None)
and not a single date.

Note that the debugpy command above is the one VS Code generates automatically when using its debugger (Run -> Start Debugging, or F5). It also assumes that the current directory is '$USER/test_error'.

The VS Code version is 1.61.0 and the Python extension version is v2021.10.1317843341.

I'm posting it here, but I'm not sure whether it's a pandas, debugger, or Python issue.

Expected Behavior

No crash when running in debugpy, or at least the same result as when running it plainly.

Installed Versions

INSTALLED VERSIONS

commit : 73c6825
python : 3.9.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-37-generic
Version : #41~20.04.2-Ubuntu SMP Fri Sep 24 09:06:38 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.3
numpy : 1.21.2
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.2.0
Cython : 0.29.24
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.1
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.2
IPython : None
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.3
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
None

ember91 added the Bug and Needs Triage labels on Oct 9, 2021
ember91 changed the title from "BUG: pandas.DataFrame.index.map() works differently if debugger is attached" to "BUG: pandas.DataFrame.index.map() works differently if debugpy debugger is attached" on Oct 9, 2021
@mzeitlin11
Member

Thanks for reporting this @ember91! I can't reproduce this with the PyCharm debugger or pdb. Can others reproduce it? @ember91, the best way to tell a pandas issue from a debugpy issue would be to step through your code and see exactly where the behaviors start to diverge.
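One debugger-independent way to look for the point of divergence is to log every call to the mapped function. The sketch below is only an illustration: log_calls and calls are hypothetical names, not part of pandas or debugpy, and the example feeds the wrapper plain Python values so that it runs without pandas.

```python
import functools
from datetime import date


def log_calls(func, log):
    """Wrap func so each call appends (argument type name, outcome) to log."""

    @functools.wraps(func)
    def wrapper(arg):
        try:
            result = func(arg)
        except Exception as exc:
            log.append((type(arg).__name__, f"raised {type(exc).__name__}"))
            raise  # re-raise so the caller sees exactly what it saw before
        log.append((type(arg).__name__, "returned"))
        return result

    return wrapper


def map_func(dt):
    if dt.month <= 2:
        return dt.year + dt.month - 1
    return dt.year + dt.month


calls = []
logged = log_calls(map_func, calls)

logged(date(2021, 10, 1))      # a scalar date works
try:
    logged("not a date")       # anything without .month raises
except AttributeError:
    pass

print(calls)
```

In the reproducer, the same wrapper could be passed as df.index.map(log_calls(map_func, calls)), and the contents of calls compared between a plain run and a debugpy run.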

mzeitlin11 added the Index label on Oct 9, 2021
@ember91
Contributor Author

ember91 commented Oct 10, 2021

So I ran it with pdb as well. Breaking at line 5 (break test.py:5 in pdb), both debuggers show that dt contains DatetimeIndex(['2021-10-01', '2021-11-01'], dtype='datetime64[ns]', name='Date', freq=None). On the next step, debugpy crashes because dt isn't a single date, whereas pdb jumps out of the function and goes to pandas/core/indexes/extension.py(80)fget().

Looking at the source code in extension.py, I see this comment:
Try to run function on index first, and then on elements of index
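That comment suggests a "try the whole index first, then fall back to the elements" pattern. Here is a minimal sketch of that idea, assuming the fallback is triggered by catching an exception. TinyIndex is a hypothetical stand-in written for this illustration; it is not the pandas implementation, and its error messages are made up.

```python
from datetime import date


class TinyIndex:
    """Hypothetical stand-in for an index (illustration only, not pandas)."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def month(self):
        # Vectorized accessor: returns a container, not a single int.
        return TinyIndex(v.month for v in self.values)

    def __le__(self, other):
        # Elementwise comparison yields a container of booleans.
        return TinyIndex(v <= other for v in self.values)

    def __bool__(self):
        # A container of booleans has no single truth value.
        raise ValueError("truth value of a TinyIndex is ambiguous")

    def map(self, mapper):
        try:
            # First attempt: call the function on the whole index.
            result = mapper(self)
            if not isinstance(result, TinyIndex):
                raise TypeError("mapper must return a TinyIndex")
            return result
        except Exception:
            # Fallback: call the function on each element.
            return TinyIndex(mapper(v) for v in self.values)


def map_func(dt):
    if dt.month <= 2:
        return dt.year + dt.month - 1
    return dt.year + dt.month


idx = TinyIndex([date(2021, 10, 1), date(2021, 11, 1)])
result = idx.map(map_func)
print(result.values)
```

In this sketch the first call to map_func receives the whole index and raises inside the if statement; the exception is caught and the function is retried per element. If pandas behaves similarly, a debugger set to stop on raised exceptions would halt at that first, internally handled error, which might explain the difference described above. That is a guess, not something confirmed in this thread.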

mroeschke added the Needs Info label and removed the Needs Triage label on Oct 16, 2021
@mroeschke
Member

Closing, since this issue is not immediately reproducible with more standard debuggers and it is not obvious that this is a pandas issue. If the debugpy folks can pin this issue to pandas, we can reopen.

@ember91
Contributor Author

ember91 commented Nov 1, 2021

See debugpy bug report at microsoft/debugpy#775.
