.loc assignment of pd.Timestamp to Series of dtype object results in cast to Long #15526

Closed
joseortiz3 opened this issue Feb 27, 2017 · 7 comments
Labels
Bug; Dtype Conversions (Unexpected or buggy dtype conversions); Duplicate Report (Duplicate issue or pull request); Indexing (Related to indexing on series/frames, not to indexes themselves)
Milestone

Comments

@joseortiz3
Contributor

joseortiz3 commented Feb 27, 2017

xref #6942
xref #12499
xref #14179

Simple repro: the Series is cast to object, but the datetime-like value is stored as its underlying integer representation.

In [1]: s = pd.Series([1])

In [2]: s
Out[2]: 
0    1
dtype: int64

In [3]: s[1] = pd.Timestamp('20130101')

In [4]: s
Out[4]: 
0                      1
1    1356998400000000000
dtype: object

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> s = pd.Series()
>>> s.loc['date'] = pd.Timestamp.now()
>>> s
date   2017-02-27 15:04:32.357
dtype: datetime64[ns]
>>> s.loc['date2'] = pd.Timestamp.now()
>>> s
date    2017-02-27 15:04:32.357
date2   2017-02-27 15:04:41.724
dtype: datetime64[ns]
>>> s.loc['date3'] = 3
>>> s
date     2017-02-27 15:04:32.357000
date2    2017-02-27 15:04:41.724000
date3                             3
dtype: object
>>> s.loc['date4'] = pd.Timestamp.now()
>>> s
date     2017-02-27 15:04:32.357000
date2    2017-02-27 15:04:41.724000
date3                             3
date4           1488207907032000000
dtype: object

Problem description

When a Series has dtype object, assigning a pd.Timestamp value with .loc casts it to a long (its integer nanosecond representation) rather than keeping it as a datetime. Since the dtype is object, I would expect assigned objects to be stored without any cast.

Expected Output

The expected behavior is that the value is not cast, i.e. it remains a datetime, as it does when added with append:

>>> s = s.append(pd.Series(data = [pd.Timestamp.now()], index = ['date_x']))
>>> s
date      2017-02-27 15:04:32.357000
date2     2017-02-27 15:04:41.724000
date3                              3
date4            1488207907032000000
date_x    2017-02-27 15:11:03.995000
dtype: object
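For readers on current pandas: `Series.append` was removed in pandas 2.0, so the append call above no longer runs there. Below is a minimal sketch of the same check using `pd.concat`. The values and labels are illustrative and are not taken from the original session.

```python
import pandas as pd

# Object-dtype Series mixing a datetime and an int, like the one in the report.
s = pd.Series(
    [pd.Timestamp("2017-02-27 15:04:32.357"), 3],
    index=["date", "date3"],
    dtype=object,
)

# Series.append was removed in pandas 2.0; pd.concat is the replacement.
extra = pd.Series([pd.Timestamp("2017-02-27 15:11:03.995")], index=["date_x"])
combined = pd.concat([s, extra])

print(combined.dtype)                # object
print(type(combined.loc["date_x"]))  # should be a Timestamp, not an integer
```

Because concatenation combines whole arrays instead of setting a single element, it does not go through the `.loc` assignment path that this report is about.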

Output of pd.show_versions()

27-Feb-17 15:08:27 DEBUG lzma module is not available
27-Feb-17 15:08:27 DEBUG Registered VCS backend: git
27-Feb-17 15:08:28 DEBUG Registered VCS backend: hg
27-Feb-17 15:08:28 DEBUG Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'WITH_PYMALLOC' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'Py_UNICODE_SIZE' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'WITH_PYMALLOC' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Config variable 'Py_UNICODE_SIZE' is unset, Python ABI tag may be incorrect
27-Feb-17 15:08:28 DEBUG Registered VCS backend: svn
27-Feb-17 15:08:28 DEBUG Registered VCS backend: bzr

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 33.1.0.post20170122
Cython: 0.25.2
numpy: 1.10.4
scipy: 0.17.1
statsmodels: 0.8.0
xarray: 0.9.1
IPython: 5.2.2
sphinx: 1.5.2
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.45.0
pandas_datareader: 0.2.1

@jreback
Contributor

jreback commented Feb 27, 2017

I suppose. This is really really odd to do though. I think we are simply not going to allow things like this in pandas2 anyhow.

@jreback added the Dtype Conversions, Indexing, Bug, and Difficulty Intermediate labels Feb 27, 2017
@jreback added this to the Next Major Release milestone Feb 27, 2017
@jreback
Contributor

jreback commented Feb 27, 2017

There are several related issues that I marked as well. PRs to look at these would be appreciated.

@joseortiz3
Contributor Author

joseortiz3 commented Feb 27, 2017

I'm sorry, why is this odd? In my use case, it makes perfect sense, and probably in a lot of other scheduling applications.

>>> tm.get_trade_info_by_id(24)
data name
close_date                                    2017-02-28 14:30:00
decision                                                        1
exit                                                          NaN
exit_comm                                                     NaN
exit_exec                                                     NaN
exit_filled                                                   NaN
exit_id                                                       NaN
exit_price                                                    NaN
exit_status                                                   NaN
order           <ib.ext.Order.Order object at 0x000000000AA38C50>
order_comm                                                    NaN
order_exec      <ib.ext.Execution.Execution object at 0x000000...
order_filled                                                  260
order_id                                                       24
order_price                                                829.34
order_status                                               Filled
order_type                                                    NaN
quantity                                                      260
stock                                                        GOOG
date                                          1487881800000000000 # how I discovered this bug
Name: order_24, dtype: object

@jreback
Contributor

jreback commented Feb 27, 2017

pandas is a columnar store. Each column should have a single dtype for performance. IOW, a DataFrame is a heterogeneous collection of rows that may have different dtypes column-wise. You are effectively looking at a single row, which mixes dtypes (hence it's object). You might as well just use straight Python.

Virtually none of the pandas magic will happen for object dtypes. Note that strings themselves are represented as object dtype, so that is fine.

Mixing is a real no-no. Sure, you can do it, and it IS supported. But it's not very useful.
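To illustrate the column-versus-row point above, here is a small sketch. The table and its column names are hypothetical, loosely modeled on the trade record shown earlier in the thread.

```python
import pandas as pd

# Hypothetical trade table: each column holds a single dtype.
df = pd.DataFrame(
    {
        "close_date": pd.to_datetime(["2017-02-28 14:30:00", "2017-03-01 14:30:00"]),
        "quantity": [260, 100],
        "order_price": [829.34, 830.10],
        "stock": ["GOOG", "GOOG"],
    }
)

print(df.dtypes)  # one dtype per column

# Selecting one row has to fit all of those values into a single Series,
# so pandas falls back to object dtype.
row = df.iloc[0]
print(row.dtype)  # object
```

The row keeps each value as a Python-level object, which is why column-oriented operations are generally slower or unavailable on it.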

@joseortiz3
Contributor Author

joseortiz3 commented Feb 27, 2017

Ohhh. I see. In my case, I have a function that returns a row of a DataFrame whose columns have different dtypes (good pandas). The row is then a Series with dtype object (heterogeneous data). I just wanted to add a couple of things to that Series that I would rather not put in the DataFrame (although now I am considering it).

Also, even if the performance magic of pandas doesn't happen, organizational/syntax magic does.

@jreback modified the milestones: Next Major Release, 0.21.1 Oct 30, 2017
@jreback
Contributor

jreback commented Oct 30, 2017

This looks good in master. It just needs a validation test.
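A validation test along these lines might look like the sketch below. It assumes a pandas version in which the fix has landed. The function name is hypothetical, and this is not the test that was actually added to pandas. It uses `.loc` with string labels to keep the enlargement unambiguous.

```python
import pandas as pd


def test_loc_enlargement_keeps_timestamp():
    # Hypothetical test name; a sketch, not the test from the pandas suite.
    ts = pd.Timestamp("2013-01-01")

    # Case 1: an int64 Series is upcast to object by the new value.
    s = pd.Series([1], index=["a"])
    s.loc["b"] = ts
    assert s.dtype == object
    assert isinstance(s.loc["b"], pd.Timestamp)
    assert s.loc["b"] == ts

    # Case 2: the Series is already object dtype before the assignment.
    s = pd.Series([3], index=["x"], dtype=object)
    s.loc["y"] = ts
    assert s.dtype == object
    assert isinstance(s.loc["y"], pd.Timestamp)


test_loc_enlargement_keeps_timestamp()
```

On pandas 0.19.2, the version in the report, the `isinstance` checks would be expected to fail because the value is stored as an integer.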

@jreback
Contributor

jreback commented Mar 9, 2018

This is a duplicate of #6942.

@jreback closed this as completed Mar 9, 2018
@jreback added the Duplicate Report label Mar 9, 2018