
Incorrect timedelta type coercion when doing in-place .loc expansion #13829

Closed
bmcfee opened this issue Jul 28, 2016 · 3 comments · Fixed by #53812
Labels
Bug · Dtype Conversions (unexpected or buggy dtype conversions) · Needs Tests (unit test(s) needed to prevent regressions) · setitem-with-expansion · Timedelta (Timedelta data type)

Comments

@bmcfee
Contributor

bmcfee commented Jul 28, 2016

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: d = pd.DataFrame(columns=list('abc'))

In [3]: d.loc[0] = dict(a=pd.to_timedelta(5, unit='s'),
                        b=pd.to_timedelta(72, unit='s'),
                        c='23')

In [4]: d
Out[4]: 
         a        b               c
0 00:00:05 00:01:12 00:00:00.000000

In [5]: d.c
Out[5]: 
0   00:00:00.000000
Name: c, dtype: timedelta64[ns]

Expected Output

         a        b  c
0 00:00:05 00:01:12 23

The c column should have type object and the value at loc[0] should be of type string.

Note that this does not happen when only one of a or b is a timedelta:

In [9]: d2 = pd.DataFrame(columns=list('abc'))

In [10]: d2.loc[0] = dict(a=5, b=pd.to_timedelta(72, unit='s'), c='23')

In [11]: d2
Out[11]: 
     a        b   c
0  5.0 00:01:12  23

In [12]: d3 = pd.DataFrame(columns=list('abc'))

In [13]: d3.loc[0] = dict(a=pd.to_timedelta(5, unit='s'), b=72, c='23')

In [14]: d3
Out[14]: 
         a     b   c
0 00:00:05  72.0  23

In these two cases, the c column retains the correct type.

output of pd.show_versions()

In [6]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-31-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 20.1.1
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.0
statsmodels: None
xarray: None
IPython: 5.0.0
sphinx: 1.4.1
patsy: None
dateutil: 2.5.2
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.0
matplotlib: 1.5.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None
@sinhrks
Member

sinhrks commented Jul 28, 2016

Thanks. This looks to be caused by Series type inference being a bit too aggressive. The conversion is performed here.

pd.Series(dict(a=pd.to_timedelta(5, unit='s'),
               b=pd.to_timedelta(72, unit='s'),
               c='23'))
# a          00:00:05
# b          00:01:12
# c   00:00:00.000000
# dtype: timedelta64[ns]

PR is appreciated!
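To make the inference step above easier to poke at, here is a minimal sketch that builds the same Series twice: once with pandas inferring the dtype, and once with `dtype=object` requested explicitly. The variable names (`row`, `inferred`, `explicit`) are illustrative only. Passing `dtype=object` is offered as a possible way to sidestep the inference, not as a confirmed workaround for the `.loc` expansion path on the affected version.

```python
import pandas as pd

row = {
    "a": pd.to_timedelta(5, unit="s"),
    "b": pd.to_timedelta(72, unit="s"),
    "c": "23",
}

# Let pandas infer the dtype; this is the step that misbehaved in the report.
inferred = pd.Series(row)

# Request object dtype explicitly so that no value is coerced.
explicit = pd.Series(row, dtype=object)

print(inferred.dtype)       # depends on the pandas version in use
print(explicit.dtype)       # object
print(repr(explicit["c"]))  # '23'
```

Comparing the two dtypes on a given installation shows whether that version still coerces the string during inference.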

@jreback added the Dtype Conversions and Difficulty Intermediate labels on Jul 28, 2016
@jreback added this to the Next Major Release milestone on Jul 28, 2016
@bmcfee
Contributor Author

bmcfee commented Jul 28, 2016

Thanks for confirming. I'd be happy to submit a PR, but after looking at the code, I really have no idea where to begin. I think this would be better handled by someone already familiar with the internals of pandas.

@jreback changed the title from "Incorrect timedelta type coercion when doing in-place loc[i] = append" to "Incorrect timedelta type coercion when doing in-place .loc expansion" on Jul 29, 2016
@mroeschke removed this from the Contributions Welcome milestone on Oct 13, 2022
@jbrockmendel
Member

This looks right on main. Could use a test (or check if #27303 added a relevant test)
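A regression test along these lines could look roughly like the sketch below. It simply replays the reproducer from the original report and checks the expected output described there. The test name is hypothetical, and this is not necessarily the test that was eventually added to pandas.

```python
import pandas as pd


def test_loc_expansion_keeps_string_next_to_timedeltas():
    # Hypothetical test name; the body mirrors the reproducer in the report.
    df = pd.DataFrame(columns=list("abc"))
    df.loc[0] = {
        "a": pd.to_timedelta(5, unit="s"),
        "b": pd.to_timedelta(72, unit="s"),
        "c": "23",
    }

    # The string must survive the expansion untouched.
    assert df.loc[0, "c"] == "23"
    # Column c must not have been coerced to a timedelta dtype
    # (NumPy reports kind "m" for timedelta64).
    assert df["c"].dtype.kind != "m"
    # The timedelta values themselves should be unchanged.
    assert df.loc[0, "a"] == pd.Timedelta(seconds=5)
    assert df.loc[0, "b"] == pd.Timedelta(seconds=72)
```

The dtype check uses `dtype.kind` rather than an exact comparison with `object`, so the sketch does not depend on how a particular pandas version represents string columns.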

@jbrockmendel added the Needs Tests label on May 19, 2023
5 participants