
BUG: resample().nearest() throws an exception if tz is provided for the DatetimeIndex #33895


Closed
ghost opened this issue Apr 30, 2020 · 3 comments · Fixed by #33939
Labels: good first issue, Needs Tests, Resample, Timezones
Milestone: 1.1

ghost commented Apr 30, 2020

Here is a simple example that creates a pandas DataFrame:

import pandas as pd
df = pd.DataFrame(range(5), index=pd.date_range('1/1/2000', periods=5, freq='H', tz='UTC'))
print(df) gives:
2000-01-01 00:00:00+00:00  0
2000-01-01 01:00:00+00:00  1
2000-01-01 02:00:00+00:00  2
2000-01-01 03:00:00+00:00  3
2000-01-01 04:00:00+00:00  4

When trying to resample like this

df.resample('15Min').nearest()

this throws an exception:

File "C:\Users\user\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-41-9b9242799057>", line 1, in <module>
    df.resample('15Min').nearest()
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\resample.py", line 517, in nearest
    return self._upsample("nearest", limit=limit)
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\resample.py", line 1095, in _upsample
    res_index, method=method, limit=limit, fill_value=fill_value
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\util\_decorators.py", line 227, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\frame.py", line 3856, in reindex
    return self._ensure_type(super().reindex(**kwargs))
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\generic.py", line 4544, in reindex
    axes, level, limit, tolerance, method, fill_value, copy
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\frame.py", line 3744, in _reindex_axes
    index, method, copy, level, fill_value, limit, tolerance
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\frame.py", line 3760, in _reindex_index
    new_index, method=method, level=level, limit=limit, tolerance=tolerance
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3145, in reindex
    target, method=method, limit=limit, tolerance=tolerance
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2740, in get_indexer
    indexer = self._get_nearest_indexer(target, limit, tolerance)
  File "C:\Users\user\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2821, in _get_nearest_indexer
    left_distances = abs(self.values[left_indexer] - target)
numpy.core._exceptions.UFuncTypeError: ufunc 'subtract' cannot use operands with types dtype('<M8[ns]') and dtype('O')

However, if we remove the tz argument, it works:

import pandas as pd
df = pd.DataFrame(range(5), index=pd.date_range('1/1/2000', periods=5, freq='H'))
df.resample('15Min').nearest()

Output:
                     0
2000-01-01 00:00:00  0
2000-01-01 00:15:00  0
2000-01-01 00:30:00  1
2000-01-01 00:45:00  1
2000-01-01 01:00:00  1
2000-01-01 01:15:00  1
2000-01-01 01:30:00  2
2000-01-01 01:45:00  2
2000-01-01 02:00:00  2
2000-01-01 02:15:00  2
2000-01-01 02:30:00  3
2000-01-01 02:45:00  3
2000-01-01 03:00:00  3
2000-01-01 03:15:00  3
2000-01-01 03:30:00  4
2000-01-01 03:45:00  4
2000-01-01 04:00:00  4
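
Since the tz-naive path works, one possible workaround on an affected version is to drop the timezone, resample, and then re-attach it. This is only a sketch, not a fix from the maintainers. It assumes the index is in UTC (so removing and restoring the timezone does not shift any timestamps), and it uses the lowercase aliases "h" and "15min" in place of 'H' and '15Min', because newer pandas releases deprecate the uppercase 'H' alias.

```python
import pandas as pd

# Same frame as in the report: hourly, tz-aware (UTC) index.
idx = pd.date_range("2000-01-01", periods=5, freq="h", tz="UTC")
df = pd.DataFrame(range(5), index=idx)

# Drop the timezone so resampling runs on a tz-naive index.
naive = df.tz_localize(None)

# Upsample to 15-minute bins, filling each bin from the nearest original row.
upsampled = naive.resample("15min").nearest()

# Re-attach UTC to the upsampled index.
result = upsampled.tz_localize("UTC")

print(result)
```

For a non-UTC timezone, re-localizing can be ambiguous around DST transitions, so this approach would need more care there.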

The pandas version used is:

python           : 3.7.6.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
machine          : AMD64
processor        : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel

pandas           : 1.0.1
numpy            : 1.18.1
pytz             : 2019.3
dateutil         : 2.8.1
ghost added the Bug and Needs Triage labels on Apr 30, 2020

@mroeschke (Member)

This appears to work on master:

In [1]: import pandas as pd
   ...: df = pd.DataFrame(range(5), index=pd.date_range('1/1/2000', periods=5, freq='H', tz='UTC'))

In [2]: df.resample('15Min').nearest()
Out[2]:
                           0
2000-01-01 00:00:00+00:00  0
2000-01-01 00:15:00+00:00  0
2000-01-01 00:30:00+00:00  1
2000-01-01 00:45:00+00:00  1
2000-01-01 01:00:00+00:00  1
2000-01-01 01:15:00+00:00  1
2000-01-01 01:30:00+00:00  2
2000-01-01 01:45:00+00:00  2
2000-01-01 02:00:00+00:00  2
2000-01-01 02:15:00+00:00  2
2000-01-01 02:30:00+00:00  3
2000-01-01 02:45:00+00:00  3
2000-01-01 03:00:00+00:00  3
2000-01-01 03:15:00+00:00  3
2000-01-01 03:30:00+00:00  4
2000-01-01 03:45:00+00:00  4
2000-01-01 04:00:00+00:00  4

I'm guessing this method doesn't have a lot of test coverage, so it'd be good to add a test so it doesn't break again.

mroeschke added the good first issue and Needs Tests labels and removed the Bug and Needs Triage labels on Apr 30, 2020

weikhor (Contributor) commented May 1, 2020

Hi @mroeschke,

I am a beginner with open source projects, and I am interested in adding tests for this issue. Which path under pandas\tests in the repository should I add the test to? Thanks.

@mroeschke (Member)

pandas/tests/resample/test_datetime_index.py
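
A regression test for that file might look roughly like the sketch below. The function name `test_resample_nearest_with_tz` is hypothetical, the expected values are copied from the output shown earlier in this thread, and the public `pd.testing.assert_frame_equal` is used here as an assumption rather than whatever helper the pandas test suite itself prefers. The lowercase aliases "h" and "15min" stand in for 'H' and '15Min', since newer pandas releases deprecate the uppercase 'H' alias.

```python
import pandas as pd


def test_resample_nearest_with_tz():
    # Hourly, tz-aware (UTC) index, as in the original report.
    idx = pd.date_range("2000-01-01", periods=5, freq="h", tz="UTC")
    df = pd.DataFrame(range(5), index=idx)

    # On affected versions this call raised UFuncTypeError.
    result = df.resample("15min").nearest()

    # Expected output, taken from the output posted in this thread.
    expected_index = pd.date_range(
        "2000-01-01", periods=17, freq="15min", tz="UTC"
    )
    expected_values = [0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4]
    expected = pd.DataFrame(expected_values, index=expected_index)

    pd.testing.assert_frame_equal(result, expected)
```

With pytest, a function like this would be collected automatically; it can also be called directly to check a local install.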

jreback added this to the 1.1 milestone on May 2, 2020
jreback added the Resample and Timezones labels on May 2, 2020