BUG: Pandas 1.4.0 - pd.NaT can not be replaced. #45836

Closed
2 of 3 tasks
xmatthias opened this issue Feb 5, 2022 · 6 comments · Fixed by #46404
Labels
Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate) · Regression (functionality that used to work in a prior pandas version) · replace (replace method)
Comments

@xmatthias

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd

pd.DataFrame([pd.NaT, pd.NaT]).replace({pd.NaT: None, np.nan: None})

# Using either pd.NaT or np.nan as the key works in 1.3.5

Issue Description

In pandas 1.3.5, the above example returns:

  pd.DataFrame([pd.NaT, pd.NaT]).replace({pd.NaT: None, pd.np.NaN: None})
Out[1]: 
      0
0  None
1  None

In pandas 1.4.0, this no longer works:

  pd.DataFrame([pd.NaT, pd.NaT]).replace({pd.NaT: None, pd.np.NaN: None})
Out[5]: 
    0
0 NaT
1 NaT
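
To check which of the two behaviors a given installation shows, a small probe along the following lines may help. This is only a sketch: `probe_nat_replace` is a hypothetical helper name, not part of pandas, and it reports what it observes rather than assuming a particular result.

```python
import pandas as pd


def probe_nat_replace():
    """Report how DataFrame.replace treats pd.NaT on the installed pandas.

    Hypothetical diagnostic helper, not part of the pandas API.
    """
    df = pd.DataFrame([pd.NaT, pd.NaT])
    out = df.replace({pd.NaT: None})
    return {
        "pandas": pd.__version__,
        "dtype_before": str(df[0].dtype),
        "dtype_after": str(out[0].dtype),
        # True only if every cell is literally the None object afterwards.
        "replaced": all(v is None for v in out[0].tolist()),
    }


print(probe_nat_replace())
```

If `replaced` is `False` and `dtype_after` is still a datetime dtype, the installation shows the 1.4.0 behavior described above.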

Expected Behavior

.replace() should correctly replace pd.NaT with whatever value is specified.

Even if we assume pd.NaT is no longer treated as equivalent to np.NaN, this should still work, because pd.NaT is given explicitly as a key.
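
One possible way to get the expected result without going through .replace() is sketched below. It assumes that casting to object dtype first is acceptable for the data at hand; `nat_to_none` is a hypothetical helper name, and the approach (`astype(object)` followed by `where`) is a swapped-in technique, not the fix that was applied in pandas.

```python
import pandas as pd


def nat_to_none(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df in which every missing value (NaT, NaN) is None.

    Hypothetical helper. The cast to object dtype is needed because a
    datetime64 column cannot hold the Python None object.
    """
    as_object = df.astype(object)
    # Keep cells that are not missing; put None everywhere else.
    return as_object.where(df.notna(), None)


result = nat_to_none(pd.DataFrame([pd.NaT, pd.Timestamp("2022-02-05")]))
print(result)
```

The resulting frame has object dtype, so datetime-specific operations on those columns are no longer available without converting back.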

Installed Versions

Version 1.4.0:

INSTALLED VERSIONS

commit : bb1f651
python : 3.9.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.16.5-arch1-1
Version : #1 SMP PREEMPT Tue, 01 Feb 2022 21:42:50 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.4.0
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.3
setuptools : 57.4.0
Cython : None
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : 1.10.6
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : 1.0.2
psycopg2 : 2.9.3
jinja2 : 3.0.3
IPython : 7.29.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : 2.7.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : 1.4.31
tables : 3.7.0
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
zstandard : None

Version 1.3.5 (where it's still working):

INSTALLED VERSIONS

commit : 66e3805
python : 3.9.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.16.5-arch1-1
Version : #1 SMP PREEMPT Tue, 01 Feb 2022 21:42:50 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.5
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.3
setuptools : 57.4.0
Cython : None
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : 1.10.6
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : 1.0.2
psycopg2 : 2.9.3 (dt dec pq3 ext lo64)
jinja2 : 3.0.3
IPython : 7.29.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : 2.7.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : 1.4.31
tables : 3.7.0
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
numba : None

@xmatthias added the Bug and Needs Triage labels on Feb 5, 2022
@xmatthias changed the title from "BUG: Pandas 1.4.0 pd.NaT can not be replaced." to "BUG: Pandas 1.4.0 - pd.NaT can not be replaced." on Feb 5, 2022
@phofl
Member

phofl commented Feb 6, 2022

Looks like a duplicate/closely aligned with #45601

@jorisvandenbossche added the Regression, replace, and Missing-data labels and removed the Bug and Needs Triage labels on Feb 7, 2022
@jorisvandenbossche added this to the 1.4.1 milestone on Feb 7, 2022
@jorisvandenbossche
Member

It's indeed the same issue as #45601 (and I suppose also caused by the same code change, but didn't verify that).

But this is certainly a regression, so let's maybe keep this open to track for 1.4.1 (the other issue is about pd.NA, which is strictly speaking still experimental, and we don't necessarily have to fix it for 1.4, although it might be easier/better to fix both)

@jorisvandenbossche
Member

jorisvandenbossche commented Feb 7, 2022

cc @jbrockmendel (probably caused by #44940, based on the comments in #45601)

@jbrockmendel
Member

Yah, one option is to say "manually convert to object dtype". Another is that we could check whether both to_replace and value are NA and, in that case, infer that the user doesn't want them treated as equivalent. I think we do something similar in replace_list.
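
The second option could look roughly like the check below. This is a sketch of the idea only, for scalar arguments: `is_distinct_na_pair` is a hypothetical name, and nothing here is taken from the pandas internals or from replace_list.

```python
import numpy as np
import pandas as pd


def is_distinct_na_pair(to_replace, value) -> bool:
    """True when both scalars are NA-like but are different sentinels.

    Hypothetical helper. In that situation the caller presumably wants
    one NA marker swapped for another, so the two should not be treated
    as equivalent.
    """
    both_na = bool(pd.isna(to_replace)) and bool(pd.isna(value))
    return both_na and to_replace is not value


print(is_distinct_na_pair(pd.NaT, None))     # NaT -> None: distinct NA markers
print(is_distinct_na_pair(np.nan, np.nan))   # same sentinel on both sides
print(is_distinct_na_pair(pd.NaT, 0))        # value is not NA at all
```

The identity comparison (`is not`) is a deliberate choice here, since NA-like values do not compare equal to themselves in general.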

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Feb 9, 2022
@simonjayhawkins
Member

> cc @jbrockmendel (probably caused by #44940, based on the comments in #45601)

can confirm first bad commit: [9cd1c6f] BUG: nullable dtypes not preserved in Series.replace (#44940)

@simonjayhawkins
Member

> It's indeed the same issue as #45601 (and I suppose also caused by the same code change, but didn't verify that).

ran git bisect on #45601, can confirm same PR #44940

5 participants