Skip to content

CLN: de-duplicate generate_range #23218

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 23, 2018
Merged

Conversation

jbrockmendel
Copy link
Member

Split off from #23140.

@codecov
Copy link

codecov bot commented Oct 18, 2018

Codecov Report

Merging #23218 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #23218      +/-   ##
==========================================
+ Coverage   92.19%    92.2%   +<.01%     
==========================================
  Files         169      169              
  Lines       50972    50968       -4     
==========================================
- Hits        46996    46993       -3     
+ Misses       3976     3975       -1
Flag Coverage Δ
#multiple 90.62% <100%> (ø) ⬆️
#single 42.26% <81.25%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/arrays/datetimelike.py 94.88% <ø> (-0.02%) ⬇️
pandas/core/indexes/period.py 93.22% <ø> (-0.02%) ⬇️
pandas/core/arrays/period.py 95.58% <100%> (+0.01%) ⬆️
pandas/core/indexes/datetimelike.py 98.27% <100%> (+0.02%) ⬆️
pandas/core/arrays/timedeltas.py 94.47% <100%> (+0.5%) ⬆️
pandas/core/indexes/datetimes.py 96.47% <100%> (-0.03%) ⬇️
pandas/core/indexes/timedeltas.py 90.62% <100%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 145c227...a9c021a. Read the comment docs.

@@ -332,6 +332,11 @@ def _validate_frequency(cls, index, freq, **kwargs):
# Frequency validation is not meaningful for Period Array/Index
return None

# DatetimeArray may pass `ambiguous`, nothing else will be accepted
# by cls._generate_range below
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like this, can we just directly pass ambiguous as required?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but it is much more verbose (BTW the letter "v" on my keyboard is super-sticky, so we really need to stop talking about erbosity).

This would end up looking something like:

if is_timedelta64_dtype(index):
    assert ambiguous is None
    on_freq = cls._generate_range(start=index[0], end=None,
                                                        periods=len(index), freq=freq)
else:
    on_freq = cls._generate_range(start=index[0], end=None,
                                                        periods=len(index), freq=freq, ambiguous=ambiguous)

Or do you have something else in mind? I like this less than the way the PR currently does it, but by a small enough margin that I'm fine changing it to get this done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no what I would do is add this to DatetimeIndex. It doesn't totally de-duplicate but is more correct and less hacky here

@classmethod
    def _generate_range(cls, start, end, periods, name=None, freq=None,
                        tz=None, normalize=False, ambiguous='raise',
                        closed=None):
        return super(DatetimeIndex, cls)._generate_range(
            start, end, periods, freq,
            tz=tz, normalize=normalize, ambiguous=ambiguous, closed=closed)

@jreback jreback added Datetime Datetime data dtype Clean labels Oct 18, 2018
@jreback jreback added this to the 0.24.0 milestone Oct 18, 2018
@@ -332,6 +332,11 @@ def _validate_frequency(cls, index, freq, **kwargs):
# Frequency validation is not meaningful for Period Array/Index
return None

# DatetimeArray may pass `ambiguous`, nothing else will be accepted
# by cls._generate_range below
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no what I would do is add this to DatetimeIndex. It doesn't totally de-duplicate but is more correct and less hacky here

@classmethod
    def _generate_range(cls, start, end, periods, name=None, freq=None,
                        tz=None, normalize=False, ambiguous='raise',
                        closed=None):
        return super(DatetimeIndex, cls)._generate_range(
            start, end, periods, freq,
            tz=tz, normalize=normalize, ambiguous=ambiguous, closed=closed)

@jbrockmendel
Copy link
Member Author

no what I would do is add this to DatetimeIndex. [...]

Um, I don't see how that is really relevant to the _validate_frequency usage. DatetimeArray._generate_range already accepts ambiguous; it is a matter of passing that arg in _validate_frequency.

@jbrockmendel
Copy link
Member Author

I mean, I could just take out the new assertion in _validate_frequency and the method wouldn't be changed at all by this PR.

@jorisvandenbossche jorisvandenbossche changed the title de-duplicate generate_range CLN: de-duplicate generate_range Oct 18, 2018
@jreback
Copy link
Contributor

jreback commented Oct 19, 2018

Um, I don't see how that is really relevant to the _validate_frequency usage. DatetimeArray._generate_range already accepts ambiguous; it is a matter of passing that arg in _validate_frequency.

then let's just make this the default. This is a very kludgy way of doing that.

@pep8speaks
Copy link

Hello @jbrockmendel! Thanks for updating the PR.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me

@@ -703,6 +703,12 @@ def astype(self, dtype, copy=True):
raise TypeError(msg.format(name=type(self).__name__, dtype=dtype))
return super(DatetimeIndexOpsMixin, self).astype(dtype, copy=copy)

@Appender(DatetimeLikeArrayMixin._time_shift.__doc__)
def _time_shift(self, periods, freq=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this related to the rest of the PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR changes DatetimeArrayMixin._time_shift such that it no longer pins self.name for Index subclasses.

@jreback
Copy link
Contributor

jreback commented Oct 23, 2018

looks fine. not super happy about having to set name all over the place, but this is an improvement over before. merging after CI is fixed.

@jreback jreback merged commit 225b9b0 into pandas-dev:master Oct 23, 2018
@jreback
Copy link
Contributor

jreback commented Oct 23, 2018

thanks!

@jbrockmendel jbrockmendel deleted the genrange branch October 23, 2018 03:09
jorisvandenbossche added a commit that referenced this pull request Oct 23, 2018
JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Oct 23, 2018
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Datetime Datetime data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants