Skip to content

Period(weekend_date, 'B') returns Monday's Period #10575

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
max-sixty opened this issue Jul 15, 2015 · 8 comments
Closed

Period(weekend_date, 'B') returns Monday's Period #10575

max-sixty opened this issue Jul 15, 2015 · 8 comments
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves Period Period data type

Comments

@max-sixty
Copy link
Contributor

The asof docs state that the method should return the most recent label up to and including the passed label. For period indexes, this doesn't seem to be happening - if I supply a period of a weekend date to a weekday series, it returns the Monday, rather than the Friday.

If I do the same on a DatetimeIndex, it behaves as expected

In [50]:
dates

Out[50]:
<class 'pandas.tseries.period.PeriodIndex'>
[1990-05-07, ..., 2015-07-13]
Length: 6568, Freq: B

In [51]:
dates_ts=dates.to_timestamp()
dates_ts

Out[51]:
<class 'pandas.tseries.index.DatetimeIndex'>
[1990-05-07, ..., 2015-07-13]
Length: 6568, Freq: None, Timezone: None

In [52]:
dates.asof('2015-07-11')

Out[52]:
Period('2015-07-13', 'B')

In [53]:
dates_ts.asof('2015-07-11')

Out[53]:
Timestamp('2015-07-10 00:00:00')
@sinhrks sinhrks added Bug Indexing Related to indexing on series/frames, not to indexes themselves Period Period data type labels Jul 17, 2015
@sinhrks
Copy link
Member

sinhrks commented Jul 17, 2015

Thanks for the report!

Internally it uses get_loc, and these also return different results. Because Period creates next valid date when passed date and freq is invalid, needs special handling in get_loc. PR is appreciated:)

# DatetimeIndex
tidx = pd.date_range('2015-07-01', freq='B', periods=30)
tidx.get_loc('2015-07-19', method='pad')
# 12

# PeriodIndex
pidx = pd.period_range('2015-07-01', freq='B', periods=30)
pidx.get_loc('2015-07-19', method='pad')
# 13

pd.Period('2015-07-19', freq='B')
# Period('2015-07-20', 'B')

@max-sixty
Copy link
Contributor Author

Is there a reason why it creates the next valid date rather than raising? While we could add a check in get_loc, the behavior we're correcting for seems unexpected.

@sinhrks
Copy link
Member

sinhrks commented Jul 18, 2015

I'm not sure whether there is a background of current Period behavior. It looks get_period_ordinal in period_helper.c doesn't care freqs with gaps.

@jreback @bjonen @cancan101 What do you think, pd.Period('2015-07-18', 'B') (2015-07-18 is Saturday) should raise ValueError?

@max-sixty
Copy link
Contributor Author

@jreback @bjonen @cancan101, gentle follow up to @sinhrks's comment: do you have any thoughts here?

@max-sixty max-sixty changed the title asof seeks forward rather than backwards on PeriodIndex Period(weekend_date, 'B') returns Monday's Period Aug 12, 2015
@max-sixty
Copy link
Contributor Author

This hit us again today - @jreback do you have a view here?
At worst "this isn't important" or "don't change anything" would be helpful!

@jreback
Copy link
Contributor

jreback commented Sep 4, 2015

Well, you have to have a policy for what to happend when you pass a non-supported freq to your frequency based Period. I don't think rolling forward is that surprising. An alternative is what I gave in the other issue #10978 , e.g. move to a business day, then apply the op. to_period() just does this.

So I think .asof should as well, e.g. coerce the input to the appropriate frequency first.

@max-sixty
Copy link
Contributor Author

OK. I agree with you for converting frequencies.

The concept of business days is a bit different - they have gaps. (I can't think of any other examples beyond business hours and some fanciful ones such as 'summer months'.)

So I don't think there's a concept of a business day representation of Saturday, while there is a month representation of '2010-01-01', or day representation of '2010-01'.

Although it may be that I'm alone in this view! If anyone else has thoughts + there's any consensus, happy to do a PR.

@toobaz toobaz added Index Related to the Index class or subclasses Datetime Datetime data dtype and removed Indexing Related to indexing on series/frames, not to indexes themselves labels Jun 28, 2019
@mroeschke mroeschke added Indexing Related to indexing on series/frames, not to indexes themselves and removed Index Related to the Index class or subclasses Datetime Datetime data dtype labels Apr 18, 2021
@mroeschke
Copy link
Member

Period with "B" frequency is deprecated now so not sure if there are any action items here so closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves Period Period data type
Projects
None yet
Development

No branches or pull requests

5 participants