Skip to content

BUG: Convert float freqstrs to ints at finer resolution #14378

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jisantuc
Copy link
Contributor

@jisantuc jisantuc commented Oct 8, 2016

Passing '0.5min' as a frequency string should generate 30 second
intervals, rather than five minute intervals. By recursively increasing
resolution until one is found for which the frequency is an integer,
this commit ensures that that's the case for resolutions from days to
microsecond

Reso = frequencies.Resolution

self.assertEqual(Reso.bump_to_int(1.5, 'T'), (90, 'S'))
self.assertEqual(Reso.bump_to_int(62.4, 'T'), (3744, 'S'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u exercise some more fail cases
and systematically cover cases that should succeed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added the failure case for being unable to convert to an int in a less toy example than starting all the way at the bottom and two success cases that together cover the entire implemented resolution range, from days --> milliseconds and from hours --> microseconds

@codecov-io
Copy link

codecov-io commented Oct 8, 2016

Current coverage is 85.31% (diff: 100%)

Merging #14378 into master will increase coverage by <.01%

@@             master     #14378   diff @@
==========================================
  Files           144        144          
  Lines         50957      50972    +15   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43469      43485    +16   
+ Misses         7488       7487     -1   
  Partials          0          0          

Powered by Codecov. Last update 86233e1...65a5163

else:
start_reso = cls.get_reso_from_freq(freq)
if start_reso == 0:
raise ValueError(_CANNOT_CONVERT_TO_INT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls move error message here, because the message is used only once.

'T': 'S',
'S': 'L',
'L': 'U',
'U': None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls add Nanosecond.

@@ -108,10 +113,6 @@ def test_to_offset_invalid(self):
with tm.assertRaisesRegexp(ValueError, 'Invalid frequency: -2D:3H'):
frequencies.to_offset('-2D:3H')

# ToDo: Must be fixed in #8419
with tm.assertRaisesRegexp(ValueError, 'Invalid frequency: .5S'):
frequencies.to_offset('.5S')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change it to compare with 500 milisec.

Copy link
Member

@sinhrks sinhrks Oct 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls add some tests whether this works with combined offset, like .5H.5S.

And also some failure cases with invalid str, such as 1.5.0S, 1.S.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.S is valid now, parsed as 1 second. Should it not be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. should be 1 sec like float("1.") is valid.

@@ -30,3 +30,5 @@ Performance Improvements

Bug Fixes
~~~~~~~~~

- Bug in date range treating e.g. 0.5s as 5s (:issue:`8419`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add occurrence condition, like

  • Bug in multiple offset aliases with decimal points is regarded as int (e.g. 0.5s as 5s).

@@ -160,6 +181,34 @@ def get_reso_from_freq(cls, freq):
"""
return cls.get_reso(cls.get_str_from_freq(freq))

@classmethod
def bump_to_int(cls, value, freq):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the method name be clearler, likeget_stride_from_decimal

@classmethod
def bump_to_int(cls, value, freq):
"""
Increase resolution an integer value is available to create an
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert freq with decimal stride to higher freq with integer stride.

@sinhrks sinhrks added Frequency DateOffsets Bug labels Oct 8, 2016
@jreback jreback changed the title Convert float freqstrs to ints at finer resolution BUG: Convert float freqstrs to ints at finer resolution Oct 9, 2016
T_RESO = 3
H_RESO = 4
D_RESO = 5
NS_RESO = 0
Copy link
Contributor

@jreback jreback Oct 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so would be nice to clean this up. This is essentially an Enum, but that only exists in 3.5 (maybe we could add the backport of https://pypi.python.org/pypi/enum34 as an in-inline package), not sure this is worth it though.

In any event just put these in a list and enumerate if needed (and you can put these directly in the class I think)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just learned they need to be available at the module level for some of the c imports, but inside the class they can live in a list, so I did that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're used in some of the @classmethods it turns out so need to stay for now

@jisantuc jisantuc force-pushed the frequencies-float-to-int branch from 4ff2457 to 33664a6 Compare October 11, 2016 01:47
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rebase and update?

@@ -30,3 +30,5 @@ Performance Improvements

Bug Fixes
~~~~~~~~~

- Bug in multiple offset aliases with decimal points regarded as ints (e.g. 0.5s as 5s) (:issue:`8419`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to v0.20.0 file and add it to the enhancements section? It was a bug that is was parsed incorrectly and did not error, but supporting fractional resolutions is an enhancement that can be listed in the enhancement section

RESO_MIN = T_RESO
RESO_HR = H_RESO
RESO_DAY = D_RESO
RESOS = [RESO_NS, RESO_US, RESO_MS, RESO_SEC, RESO_MIN, RESO_HR, RESO_DAY]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this used somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, will destruct.

Passing `'0.5min'` as a frequency string should generate 30 second
intervals, rather than five minute intervals. By recursively increasing
resolution until one is found for which the frequency is an integer,
this commit ensures that that's the case for resolutions from days to
microseconds.

Fixes pandas-dev#8419
@jisantuc jisantuc force-pushed the frequencies-float-to-int branch from 33664a6 to 47723c3 Compare October 27, 2016 14:29
@jisantuc jisantuc force-pushed the frequencies-float-to-int branch from fb422d3 to 01afe8a Compare October 27, 2016 15:05
@jreback
Copy link
Contributor

jreback commented Dec 6, 2016

can you rebase

@@ -379,6 +402,25 @@ def test_freq_to_reso(self):
result = Reso.get_freq(Reso.get_str(Reso.get_reso_from_freq(freq)))
self.assertEqual(freq, result)

def test_resolution_bumping(self):
Reso = frequencies.Resolution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the issue number here as as comment

@classmethod
def get_stride_from_decimal(cls, value, freq):
"""
Convert freq with decimal stride into a higher freq with integer stride
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add Parameters/Returns section to detail the args

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and Raises

@jreback
Copy link
Contributor

jreback commented Dec 14, 2016

just a little docs, otherwise lgtm.

@jorisvandenbossche jorisvandenbossche added this to the 0.20.0 milestone Dec 14, 2016
@jorisvandenbossche jorisvandenbossche merged commit 8b89ece into pandas-dev:master Dec 14, 2016
@jorisvandenbossche
Copy link
Member

Made the latest changes quickly myself. @jisantuc Thanks a lot for this nice contribution!

ischurov pushed a commit to ischurov/pandas that referenced this pull request Dec 19, 2016
…4378)

Passing `'0.5min'` as a frequency string should generate 30 second
intervals, rather than five minute intervals. By recursively increasing
resolution until one is found for which the frequency is an integer,
this commit ensures that that's the case for resolutions from days to
microseconds.

Fixes pandas-dev#8419
@jisantuc jisantuc deleted the frequencies-float-to-int branch January 18, 2019 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Frequency DateOffsets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

date_range treats '.5s' as '5s'
5 participants