Code Sample, a copy-pastable example if possible
>>> df = pd.DataFrame.from_dict({"date": [1485264372711, 1485265925110, 1540215845888, 1540282121025]})
>>> df["date_dt"] = pd.to_datetime(df["date"], unit='ms', cache=True)
>>> df
date date_dt
0 1485264372711 2017-01-24 13:26:12.711
1 1485265925110 2017-01-24 13:52:05.110
2 1540215845888 2018-10-22 13:44:05.888
3 1540282121025 2018-10-23 08:08:41.025
>>> df.loc[:, "date_dt_cp"] = df.loc[:, "date_dt"]
>>> df
date date_dt date_dt_cp
0 1485264372711 2017-01-24 13:26:12.711 2017-01-24 13:26:12.711
1 1485265925110 2017-01-24 13:52:05.110 2017-01-24 13:52:05.110
2 1540215845888 2018-10-22 13:44:05.888 2018-10-22 13:44:05.888
3 1540282121025 2018-10-23 08:08:41.025 2018-10-23 08:08:41.025
>>> df.loc[[2,3], "date_dt_cp"] = df.loc[[2,3], "date_dt"]
>>> df
date date_dt date_dt_cp
0 1485264372711 2017-01-24 13:26:12.711 2017-01-24 13:26:12.711000064
1 1485265925110 2017-01-24 13:52:05.110 2017-01-24 13:52:05.110000128
2 1540215845888 2018-10-22 13:44:05.888 2018-10-22 13:44:05.888000000
3 1540282121025 2018-10-23 08:08:41.025 2018-10-23 08:08:41.024999936
Problem description
Assigning values to a datetime column with .loc[] can modify the stored dates.
When .loc[] selects all rows (df.loc[:, "date_dt_cp"] = df.loc[:, "date_dt"]), the dates are unchanged:
>>> df
date date_dt date_dt_cp
0 1485264372711 2017-01-24 13:26:12.711 2017-01-24 13:26:12.711
1 1485265925110 2017-01-24 13:52:05.110 2017-01-24 13:52:05.110
2 1540215845888 2018-10-22 13:44:05.888 2018-10-22 13:44:05.888
3 1540282121025 2018-10-23 08:08:41.025 2018-10-23 08:08:41.025
However, when only a subset of rows is selected (df.loc[[2,3], "date_dt_cp"] = df.loc[[2,3], "date_dt"]), the date values change by a few nanoseconds, including in rows 0 and 1, which are not part of the assignment:
>>> df
date date_dt date_dt_cp
0 1485264372711 2017-01-24 13:26:12.711 2017-01-24 13:26:12.711000064
1 1485265925110 2017-01-24 13:52:05.110 2017-01-24 13:52:05.110000128
2 1540215845888 2018-10-22 13:44:05.888 2018-10-22 13:44:05.888000000
3 1540282121025 2018-10-23 08:08:41.025 2018-10-23 08:08:41.024999936
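The size of the changes (+64 ns, +128 ns, 0 ns, -64 ns) looks like float64 rounding: nanosecond timestamps around 1.5e18 lie between 2^60 and 2^61, where adjacent float64 values are 256 apart. The sketch below is only a check of that hypothesis, not a confirmed diagnosis of what pandas does internally. It uses NumPy alone, and the variable names are illustrative. It converts the four timestamps from the example to nanoseconds, round-trips them through float64, and reports the drift.

```python
import numpy as np

# Millisecond timestamps from the example above.
ms = np.array(
    [1485264372711, 1485265925110, 1540215845888, 1540282121025],
    dtype="int64",
)

# datetime64[ns] stores nanoseconds since the epoch as int64.
ns = ms * 1_000_000

# Hypothesis (an assumption, not confirmed by this report): the values
# take a detour through float64, which cannot represent every int64 exactly.
roundtrip = ns.astype("float64").astype("int64")

# Difference in nanoseconds between the round-tripped and original values.
drift = roundtrip - ns
print(drift.tolist())  # [64, 128, 0, -64]
```

The drift matches the nanosecond suffixes shown in date_dt_cp above (…711000064, …110000128, …888000000, …024999936), which is consistent with a float64 conversion somewhere in the partial assignment.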
Expected Output
The last assignment in the example above copies date_dt into date_dt_cp for rows 2 and 3, so it should leave every value unchanged:
date date_dt date_dt_cp
0 1485264372711 2017-01-24 13:26:12.711 2017-01-24 13:26:12.711
1 1485265925110 2017-01-24 13:52:05.110 2017-01-24 13:52:05.110
2 1540215845888 2018-10-22 13:44:05.888 2018-10-22 13:44:05.888
3 1540282121025 2018-10-23 08:08:41.025 2018-10-23 08:08:41.025
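The expectation can also be written as a small check. This is a sketch built from the example above; the function name is hypothetical. On an affected version (0.25.1 according to this report) it would be expected to return False, and True once the two columns stay identical.

```python
import pandas as pd


def subset_assignment_preserves_values() -> bool:
    """Return True if a partial .loc assignment leaves the datetimes intact."""
    df = pd.DataFrame(
        {"date": [1485264372711, 1485265925110, 1540215845888, 1540282121025]}
    )
    df["date_dt"] = pd.to_datetime(df["date"], unit="ms", cache=True)

    # Full-column copy: reported to work correctly.
    df.loc[:, "date_dt_cp"] = df.loc[:, "date_dt"]

    # Partial assignment: the step reported to alter the values.
    df.loc[[2, 3], "date_dt_cp"] = df.loc[[2, 3], "date_dt"]

    # equals() compares dtype and values element by element.
    return df["date_dt_cp"].equals(df["date_dt"])


print(subset_assignment_preserves_values())
```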
Output of pd.show_versions()
pandas : 0.25.1
numpy : 1.17.2
pytz : 2019.2
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.10
pytest : 3.7.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.3 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.3.0
pandas_datareader: None
bs4 : None
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.0.0
numexpr : 2.7.0
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.0
sqlalchemy : None
tables : None
xarray : None
xlrd : None
xlwt : 1.3.0
xlsxwriter : None