[to_datetime] TypeError: Argument 'date_string' has incorrect type (expected str, got numpy.str_) #32264

Closed
arnau126 opened this issue Feb 26, 2020 · 4 comments · Fixed by #45280
Labels
Bug Datetime Datetime data dtype

Comments

@arnau126
arnau126 commented Feb 26, 2020

Code Sample

import pandas as pd
import numpy as np

value = np.str_('2019-02-04 10:18:46.297000+0000')
arr = [value]
s = pd.Series(arr)

pd.to_datetime(value)  # good
pd.to_datetime(s.iloc[0])  # good
pd.to_datetime(arr[0])  # good
pd.to_datetime(arr)  # error
pd.to_datetime(s)  # error

Traceback:

TypeError                                 Traceback (most recent call last)
pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

TypeError: Expected unicode, got numpy.str_

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-72-bab25c808085> in <module>
----> 1 pd.to_datetime(s)  # error

~/.virtualenvs/lr/lib/python3.7/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
    726             result = arg.map(cache_array)
    727         else:
--> 728             values = convert_listlike(arg._values, format)
    729             result = arg._constructor(values, index=arg.index, name=arg.name)
    730     elif isinstance(arg, (ABCDataFrame, abc.MutableMapping)):

~/.virtualenvs/lr/lib/python3.7/site-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    445             errors=errors,
    446             require_iso8601=require_iso8601,
--> 447             allow_object=True,
    448         )
    449 

~/.virtualenvs/lr/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   1850             dayfirst=dayfirst,
   1851             yearfirst=yearfirst,
-> 1852             require_iso8601=require_iso8601,
   1853         )
   1854     except ValueError as e:

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime_object()

TypeError: Argument 'date_string' has incorrect type (expected str, got numpy.str_)

Problem description

to_datetime raises a TypeError: Argument 'date_string' has incorrect type (expected str, got numpy.str_) when the arg is a list-like of numpy.str_ objects, but not when the arg is a single numpy.str_ object.

Expected Output

I expect it to work with numpy.str_ the same way it does with str:
assert (pd.to_datetime(s) == pd.to_datetime(s.astype(str))).all()

Or, at the very least, pd.to_datetime(s) and pd.to_datetime(s.iloc[0]) should behave consistently.
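The consistency expectation can be checked with a small script. This is only a sketch: it runs the scalar, list, and Series inputs from the code sample through pd.to_datetime and records which ones raise TypeError. The `inputs` and `outcomes` names are illustrative and not part of pandas. On the pandas 1.0.1 reported here, the list and Series cases would be expected to fail. On a release that includes the linked fix, all three should presumably succeed.

```python
import numpy as np
import pandas as pd

value = np.str_("2019-02-04 10:18:46.297000+0000")

# The same value passed as a scalar and as two kinds of list-like.
inputs = {
    "scalar": value,
    "list": [value],
    "Series": pd.Series([value]),
}

# Record, per input kind, whether to_datetime succeeded or raised TypeError.
outcomes = {}
for label, arg in inputs.items():
    try:
        pd.to_datetime(arg)
        outcomes[label] = "ok"
    except TypeError as exc:
        outcomes[label] = f"TypeError: {exc}"

print(pd.__version__, outcomes)
```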

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-40-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.2
dateutil : 2.8.0
pip : 20.0.2
setuptools : 45.0.0
Cython : 0.29.13
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.2
html5lib : None
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.2.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.0.2
numexpr : None
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : 0.13.0
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.2.17
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : None
numba : None

@arnau126 arnau126 changed the title from "[to_datetime] TypeError: Expected unicode, got numpy.str_" to "[to_datetime] TypeError: Argument 'date_string' has incorrect type (expected str, got numpy.str_)" Feb 26, 2020
@jbrockmendel
Member

jbrockmendel commented Mar 17, 2020

Looks like there are a couple of places in _libs.tslibs.parsing where things are typed as str, so they choke on np.str_. This has been a PITA elsewhere because isinstance(value, str) evaluates to True in python-space.
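The mismatch described above can be seen from plain Python. A minimal sketch follows. It assumes the Cython-typed argument requires the exact built-in str type rather than a subclass, which is consistent with the error message in the traceback but is not confirmed in this thread.

```python
import numpy as np

value = np.str_("2019-02-04 10:18:46.297000+0000")

# np.str_ subclasses the built-in str, so a Python-level isinstance check passes.
is_str_instance = isinstance(value, str)

# An exact-type check does not pass: the object's type is numpy.str_, not str.
is_exact_str = type(value) is str

# Calling str() on it returns a plain built-in str with the same contents.
plain = str(value)
is_plain_exact = type(plain) is str

print(is_str_instance, is_exact_str, is_plain_exact)  # True False True
```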

3 options come to mind:

  • make a fused type for str + np.str_ and update the types in parsing.pyx
  • add a check in parse_datetime_string for np.str_ and cast if necessary
  • try to get numpy/cython to handle this upstream

If you'd like to try either of the first two, a PR would be welcome.
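The second option (check for np.str_ and cast) might look roughly like the following in pure Python. This is an illustration only: `ensure_builtin_str` is a hypothetical helper name, the real parsing code is Cython, and this is not claimed to be the change that eventually closed the issue.

```python
import numpy as np


def ensure_builtin_str(value):
    """Hypothetical helper: downcast str subclasses (e.g. np.str_) to exact str.

    Values that are already exact str, or that are not strings at all,
    are returned unchanged.
    """
    if isinstance(value, str) and type(value) is not str:
        return str(value)
    return value


numpy_value = np.str_("2019-02-04 10:18:46.297000+0000")
cast_value = ensure_builtin_str(numpy_value)
print(type(numpy_value).__name__, "->", type(cast_value).__name__)
```

The check is a no-op for ordinary strings, so the extra cost would only be paid when a str subclass actually shows up.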

@simonjayhawkins simonjayhawkins added the Bug and Datetime (Datetime data dtype) labels Apr 25, 2020
@ddibiasi

Are there any updates on this issue?

@jreback
Contributor

jreback commented Sep 17, 2021

@ddibiasi you are welcome to have a look - it's possible this works in master

@awinnett

awinnett commented Sep 27, 2021

I had the same issue (with the column 'CollectionTime' in the DataFrame collection_ct_data_V2), and solved it with:

conv_to_string = [str(x) for x in collection_ct_data_V2['CollectionTime']]
collection_ct_data_V2['CollectionTime'] = conv_to_string
collection_ct_data_V2['CollectionTime'] = pd.to_datetime(collection_ct_data_V2['CollectionTime'])

This converts the data from numpy.str_ to str and then to datetime, as desired.
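The same workaround can be reproduced on the Series from the original code sample. A minimal sketch, using Series.map(str) as an equivalent of the list comprehension above. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Series of np.str_ values, as in the original report.
s = pd.Series([np.str_("2019-02-04 10:18:46.297000+0000")])

# Call str() on every element so the parser receives plain built-in strings.
as_plain_str = s.map(str)

# Convert the plain strings to datetimes.
converted = pd.to_datetime(as_plain_str)
print(converted.iloc[0])
```

On pandas versions where the bug is already fixed, the map(str) step should simply be redundant.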
