Timedelta not parsing ISO-8601 strings properly #29773

Closed
datajanko opened this issue Nov 21, 2019 · 7 comments · Fixed by #37159
Labels: Bug, Timedelta (Timedelta data type)


@datajanko
Contributor

datajanko commented Nov 21, 2019

Code Sample, a copy-pastable example if possible

pd.Timedelta("P0DT1H")
Timedelta('0 days 00:00:00')

and

pd.Timedelta("PT1H")
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.parse_timedelta_unit()

KeyError: ''

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-5-35a9c7000b3e> in <module>()
----> 1 pd.Timedelta("PT1H")

pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.Timedelta.__new__()

pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.parse_iso_format_string()

pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.timedelta_from_spec()

pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.parse_timedelta_unit()

ValueError: invalid unit abbreviation: 

Problem description

The ISO strings should be parsed correctly and provide valid output
as described here and shown here

Expected Output

First example and second example should have output

Timedelta('0 days 01:00:00')
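
As a rough illustration of the expected behaviour, here is a minimal sketch of an ISO 8601 duration parser built only on the standard library. It is not how pandas parses these strings; `parse_iso_duration` is a hypothetical helper, and it assumes only the D, H, M and S designators are needed (no years, months, weeks or signs).

```python
import re
from datetime import timedelta

# Hypothetical helper, not part of pandas. Only the day and time designators
# (D, H, M, S) are handled; years, months, weeks and signs are left out.
_ISO_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?"
    r"(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_iso_duration(text):
    """Convert a string such as 'PT1H' or 'P0DT1H' to a datetime.timedelta."""
    match = _ISO_DURATION.match(text)
    # A trailing 'T' with no time components (e.g. 'P1DT') is rejected here.
    if match is None or text.endswith("T"):
        raise ValueError(f"not a supported ISO 8601 duration: {text!r}")
    # Keep only the components that were actually present in the string.
    parts = {
        name: float(value)
        for name, value in match.groupdict().items()
        if value is not None
    }
    if not parts:
        # A bare 'P' carries no components at all.
        raise ValueError(f"duration has no components: {text!r}")
    return timedelta(**parts)


print(parse_iso_duration("PT1H"))
print(parse_iso_duration("P0DT1H"))
```

Both calls should yield the same one-hour duration, which is the result the two `pd.Timedelta` examples above are expected to agree on.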

Output of pd.show_versions()

I got the same output locally and on Google Colab.

Versions
INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.8.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.14.137+
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.3
numpy            : 1.17.4
pytz             : 2018.9
dateutil         : 2.6.1
pip              : 19.3.1
setuptools       : 41.6.0
Cython           : 0.29.14
pytest           : 3.6.4
hypothesis       : None
sphinx           : 1.8.5
blosc            : None
feather          : 0.4.0
xlsxwriter       : None
lxml.etree       : 4.2.6
html5lib         : 1.0.1
pymysql          : None
psycopg2         : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2           : 2.10.3
IPython          : 5.5.0
pandas_datareader: 0.7.4
bs4              : 4.6.3
bottleneck       : 1.3.0
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.2.6
matplotlib       : 3.1.1
numexpr          : 2.7.0
odfpy            : None
openpyxl         : 2.5.9
pandas_gbq       : 0.11.0
pyarrow          : 0.14.1
pytables         : None
s3fs             : 0.4.0
scipy            : 1.3.2
sqlalchemy       : 1.3.11
tables           : 3.4.4
xarray           : 0.11.3
xlrd             : 1.1.0
xlwt             : 1.3.0
xlsxwriter       : None
@mroeschke
Member

Thanks for the suggestion. Duplicate of #22815

@datajanko
Contributor Author

datajanko commented Nov 21, 2019

Is this really a duplicate? Things are implemented but do not work as expected.

@datajanko
Contributor Author

This is the duplicate: #25422

@datajanko
Contributor Author

Additionally, the first example shows that a syntactically valid duration is not handled correctly: "P0DT1H" is silently parsed as 0 hours instead of 1 hour.

@mroeschke
Member

Ah, my mistake. I had thought we didn't support ISO 8601 durations yet.

Right, this definitely looks like a parsing error then. Thanks for the report!

@mroeschke mroeschke reopened this Nov 21, 2019
@mroeschke mroeschke added Bug Timedelta Timedelta data type labels Nov 21, 2019
@danni

danni commented Feb 28, 2020

Negative durations are also broken:

>>> pandas.Timedelta('-P21D')
Traceback (most recent call last):
  File "pandas/_libs/tslibs/timedeltas.pyx", line 526, in pandas._libs.tslibs.timedeltas.parse_timedelta_unit
KeyError: 'p'

As are the examples from the Moment docs.

@AnnaDaglis
Contributor

take


5 participants