.loc assignment of pd.Timestamp to Series of dtype object results in cast to Long #15526

joseortiz3 · 2017-02-27T22:13:13Z

Simple repro: the Series is cast to object dtype, but the datetime-like value is stored as its underlying integer representation

In [1]: s = pd.Series([1])

In [2]: s
Out[2]: 
0    1
dtype: int64

In [3]: s[1] = pd.Timestamp('20130101')

In [4]: s
Out[4]: 
0                      1
1    1356998400000000000
dtype: object

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> s = pd.Series()
>>> s.loc['date'] = pd.Timestamp.now()
>>> s
date   2017-02-27 15:04:32.357
dtype: datetime64[ns]
>>> s.loc['date2'] = pd.Timestamp.now()
>>> s
date    2017-02-27 15:04:32.357
date2   2017-02-27 15:04:41.724
dtype: datetime64[ns]
>>> s.loc['date3'] = 3
>>> s
date     2017-02-27 15:04:32.357000
date2    2017-02-27 15:04:41.724000
date3                             3
dtype: object
>>> s.loc['date4'] = pd.Timestamp.now()
>>> s
date     2017-02-27 15:04:32.357000
date2    2017-02-27 15:04:41.724000
date3                             3
date4           1488207907032000000
dtype: object

Problem description

When a Series has dtype object, assigning a pd.Timestamp through .loc casts the value to a long integer instead of keeping it as a datetime. Since the dtype is object, I would expect the assigned object to be stored without any cast.

Expected Output

The expected behavior is for the value not to be cast, i.e. to remain a datetime.

>>> s = s.append(pd.Series(data = [pd.Timestamp.now()], index = ['date_x']))
>>> s
date      2017-02-27 15:04:32.357000
date2     2017-02-27 15:04:41.724000
date3                              3
date4            1488207907032000000
date_x    2017-02-27 15:11:03.995000
dtype: object
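
Note for readers on newer pandas: `Series.append` was deprecated in pandas 1.4 and removed in 2.0, so the snippet above no longer runs there. A minimal sketch of the same idea with `pd.concat`, assuming a small object-dtype Series built by hand (the labels and timestamps are illustrative, not the reporter's data):

```python
import pandas as pd

# Illustrative object-dtype Series mixing a Timestamp and an int.
s = pd.Series(
    [pd.Timestamp("2017-02-27 15:04:32.357"), 3],
    index=["date", "date3"],
    dtype=object,
)

# Wrap the new value in its own object-dtype Series, then concatenate.
extra = pd.Series(
    [pd.Timestamp("2017-02-27 15:11:03.995")],
    index=["date_x"],
    dtype=object,
)
result = pd.concat([s, extra])

# Inspect what was actually stored under the new label.
print(type(result["date_x"]))
```

Because both inputs are already object dtype, the concatenation has no reason to convert the stored values.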

Output of `pd.show_versions()`

27-Feb-17 15:08:27 DEBUG lzma module is not available
27-Feb-17 15:08:27 DEBUG Registered VCS backend: git
27-Feb-17 15:08:28 DEBUG Registered VCS backend: hg
27-Feb-17 15:08:28 DEBUG Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'WITH_PYMALLOC' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'Py_UNICODE_SIZE' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'WITH_PYMALLOC' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'Py_UNICODE_SIZE' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Registered VCS backend: svn
27-Feb-17 15:08:28 DEBUG Registered VCS backend: bzr

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 33.1.0.post20170122
Cython: 0.25.2
numpy: 1.10.4
scipy: 0.17.1
statsmodels: 0.8.0
xarray: 0.9.1
IPython: 5.2.2
sphinx: 1.5.2
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.45.0
pandas_datareader: 0.2.1


jreback · 2017-02-27T22:19:42Z

I suppose. This is really really odd to do though. I think we are simply not going to allow things like this in pandas2 anyhow.

jreback · 2017-02-27T22:22:07Z

There are several related issues that I marked as well. PRs to look at these would be appreciated.

joseortiz3 · 2017-02-27T22:27:24Z

I'm sorry, why is this odd? In my use case, it makes perfect sense, and probably in a lot of other scheduling applications.

>>> tm.get_trade_info_by_id(24)
data name
close_date                                    2017-02-28 14:30:00
decision                                                        1
exit                                                          NaN
exit_comm                                                     NaN
exit_exec                                                     NaN
exit_filled                                                   NaN
exit_id                                                       NaN
exit_price                                                    NaN
exit_status                                                   NaN
order           <ib.ext.Order.Order object at 0x000000000AA38C50>
order_comm                                                    NaN
order_exec      <ib.ext.Execution.Execution object at 0x000000...
order_filled                                                  260
order_id                                                       24
order_price                                                829.34
order_status                                               Filled
order_type                                                    NaN
quantity                                                      260
stock                                                        GOOG
date                                          1487881800000000000 # how I discovered this bug
Name: order_24, dtype: object

jreback · 2017-02-27T22:45:57Z

pandas is a columnar store. Columns should be single-dtyped for performance. IOW, a DataFrame is a heterogeneous collection of rows that have possibly different dtypes column-wise. You are looking at effectively a single row, which mixes dtypes (hence it's object). You might as well just use straight Python.

Virtually none of the pandas magic will happen for object dtypes. Note that strings themselves are represented as object dtype, so that is fine.

Mixing is a real no-no. Sure, you can do it, and it IS supported. But it's not very useful.
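
To make the columnar point concrete, here is a small sketch, assuming a one-row frame whose column names are borrowed from the trade record above (the frame itself is made up for illustration). Each column keeps its own dtype, while a single row pulled out of it has to fall back to object:

```python
import pandas as pd

# Hypothetical one-row frame; every column has a single dtype.
df = pd.DataFrame(
    {
        "close_date": [pd.Timestamp("2017-02-28 14:30:00")],
        "order_price": [829.34],
        "stock": ["GOOG"],
    }
)

# One dtype per column.
print(df.dtypes)

# A single row must hold a datetime, a float and a string at once,
# so it comes back as an object-dtype Series.
row = df.iloc[0]
print(row.dtype)
```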

joseortiz3 · 2017-02-27T23:21:39Z

Ohhh. I see. In my case, I have a function that returns a row of a DataFrame whose columns have different dtypes (good pandas). That row is then a Series with dtype object (heterogeneous data). But in my case, I just wanted to add a couple of things to that Series that I would rather not put in the DataFrame (although now I am considering it).

Also, even if the performance magic of pandas doesn't happen, organizational/syntax magic does.
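
One possible workaround for this use case, sketched under the assumption that the row is small enough to rebuild: go through a plain dict and construct a fresh object-dtype Series, which avoids `.loc` enlargement entirely. The field names and values below are illustrative, not the real trade record.

```python
import pandas as pd

# Stand-in for a row returned from a DataFrame (illustrative values).
row = pd.Series(
    {"stock": "GOOG", "quantity": 260, "order_price": 829.34},
    dtype=object,
)

# Add the extra fields on a plain dict, then rebuild the Series.
info = row.to_dict()
info["date"] = pd.Timestamp("2017-02-23 20:30:00")
extended = pd.Series(info, dtype=object, name=row.name)

# Inspect what was stored under the new label.
print(type(extended["date"]))
```

Passing `dtype=object` explicitly keeps the constructor from inferring a narrower dtype for the values.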

jreback · 2017-10-30T11:00:06Z

This looks good in master. It just needs a validation test.
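
A rough sketch of what such a validation test could look like. The test name is hypothetical and not taken from the pandas test suite, and the values mirror the repro at the top of the issue. On the reporter's pandas 0.19.2 the `isinstance` check would be expected to fail, since the value came back as an integer there.

```python
import pandas as pd


def test_loc_enlargement_keeps_timestamp():
    # Hypothetical test name; values mirror the original report.
    s = pd.Series(
        [pd.Timestamp("2017-02-27 15:04:32.357"), 3],
        index=["date", "date3"],
        dtype=object,
    )
    ts = pd.Timestamp("2017-02-27 15:05:07.032")

    # Setting a missing label through .loc enlarges the Series.
    s.loc["date4"] = ts

    # The Series stays object dtype and the value is not cast to an int.
    assert s.dtype == object
    assert isinstance(s["date4"], pd.Timestamp)
    assert s["date4"] == ts


test_loc_enlargement_keeps_timestamp()
```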

jreback · 2018-03-09T11:04:58Z

this is a duplicate of #6942

jreback added the Dtype Conversions, Indexing, Bug, and Difficulty Intermediate labels Feb 27, 2017

jreback added this to the Next Major Release milestone Feb 27, 2017

jreback modified the milestones: Next Major Release, 0.21.1 Oct 30, 2017

jorisvandenbossche modified the milestones: 0.21.1, 0.22.0 Nov 30, 2017

jreback mentioned this issue Dec 9, 2017

apply method, should return certain columns as datetime but returns them as int[64] instead #18700

Closed

jreback closed this as completed Mar 9, 2018

jreback added the Duplicate Report Duplicate issue or pull request label Mar 9, 2018
