Week Start/End Resamping: WS/WE frequencies #7049

quasiben · 2014-05-05T22:54:07Z

There are a number of offset aliases listed here:
http://pandas.pydata.org/pandas-docs/dev/timeseries.html#offset-aliases

I'm currently trying to resample a timeseries where the buckets are return with the beginning and/or end of the week. There are a number of Start Month/End Month frequency aliases. Am I missing something or should we include Week End/Week Start resampling aliases as well?

jreback · 2014-05-05T22:59:56Z

look at the next section
u can specify w-mon for example

quasiben · 2014-05-05T23:03:28Z

Yes, the anchored offsets are nice. But these will change the values:

In [1]: import pandas as pd

In [2]: import datetime as dt

In [3]: daily_daterange = pd.date_range(start=dt.datetime(2013, 1, 31),
   ...:                                 end=dt.datetime(2013,2,28),
   ...:                                 freq='B')

In [4]: daily_dataset = pd.DataFrame({'value' : range(len(daily_daterange))}, index=daily_daterange)

In [5]: daily_dataset.resample('W-MON')
Out[5]: 
            value
2013-02-04      1
2013-02-11      5
2013-02-18     10
2013-02-25     15
2013-03-04     19

In [6]: daily_dataset.resample('W-SUN')
Out[6]: 
            value
2013-02-03    0.5
2013-02-10    4.0
2013-02-17    9.0
2013-02-24   14.0
2013-03-03   18.5

I want the values to stay the same and the buckets returned to be the beginning of the week.

jreback · 2014-05-05T23:06:54Z

ok

so give an example then of what u want

quasiben · 2014-05-06T15:48:55Z

Thu  2013-01-31  val_1
Fri  2013-02-01  val_2
Sat  2013-02-02  val_3
Sun  2013-02-03  val_4

In [4]: daily_dataset.resample('W-THU')
Out[4]: 
            value
2013-01-31      0
2013-02-07      3
2013-02-14      8
2013-02-21     13
2013-02-28     18

In [5]: daily_dataset.resample('W-MON')
Out[5]: 
            value
2013-02-04      1
2013-02-11      5
2013-02-18     10
2013-02-25     15
2013-03-04     19

In [6]: daily_dataset.resample('W-SUN')
Out[6]: 
            value
2013-02-03    0.5
2013-02-10    4.0
2013-02-17    9.0
2013-02-24   14.0
2013-03-03   18.5

I want to resample weekly but the bucket returned should be the weeks beginning:

daily_dataset.resample('WBEGIN') 
            value
2013-02-03    0
2013-02-10    3
2013-02-17    8
2013-02-24   13
2013-03-03   18

Essentially, adding a timedelta to the index.

jreback · 2014-05-06T15:50:13Z

how is this diffferent from W-SUN ? (I mean your values are), but conceptually

quasiben · 2014-05-06T16:02:12Z

Conceptually, nothing is different but the resampling of the values is significant. I believe month end and month start frequency don't change the values but it does return the a bucket at the beginning of the month or the end of month

You could do something like:

daily_dataset.resample('7D')

Then fix the index with an appropriate offset. Maybe I should submit a PR and not waste too much more time :)

jreback · 2014-05-16T17:41:50Z

why wouldn't you reindex to a daily freq, then simply divide by 7 starting on the day you want?

aiguofer · 2017-01-27T19:01:42Z

+1 on this. I keep running into issues, and none of the solutions are nice.... a simple WS would make my life easier.

Take this for example. Feb in 2016 started on a MON, and the last week of march 31 was on THURS.

I want to group by week (starting on MON and ending on SUN), count the number of days in the week, but use the first day of the week as the label. That means that for the month of Feb in 2016, my expected output is:

2016-02-01    7
2016-02-08    7
2016-02-15    7
2016-02-22    7
2016-02-29    1

I can do this a few different ways, but they all seem cumbersome and not very intuitive.

import pandas as pd
df = pd.DataFrame(pd.date_range('2016-02-01', '2016-02-29').rename('dt'))
df['val'] = 1

# option 1
a = df.groupby(pd.Grouper(key='dt',freq='W-MON', closed='left', label='left'))['val'].sum()

# option 2
a = df.groupby(pd.Grouper(key='dt',freq='W'))['val'].sum()
a.index = a.index + pd.DateOffset(weekday=0, days=-7)

# option 3
a = df.groupby(pd.Grouper(key='dt',freq='W-MON', closed='left'))['val'].sum()
a.index = a.index + pd.DateOffset(days=-7)

# option 4
a = df.groupby(pd.Grouper(key='dt',freq='W', label='left'))['val'].sum()
a.index = a.index + pd.DateOffset(days=1)

# ideal solution
a = df.groupby(pd.Grouper(key='dt',freq='WS'))['val'].sum()

jreback · 2017-01-31T15:07:11Z

FYI you can do this with .resample directly (looking at your option 1). I will comment more following

In [30]: df.groupby(pd.Grouper(key='dt',freq='W-MON', closed='left', label='left'))['val'].sum()
    ...: 
Out[30]: 
dt
2016-02-01    7
2016-02-08    7
2016-02-15    7
2016-02-22    7
2016-02-29    1
Freq: W-MON, Name: val, dtype: int64

In [32]: df.resample('W-MON', on='dt', closed='left', label='left').val.sum()
Out[32]: 
dt
2016-02-01    7
2016-02-08    7
2016-02-15    7
2016-02-22    7
2016-02-29    1
Freq: W-MON, Name: val, dtype: int64

jreback · 2017-01-31T15:14:53Z

why is this not the solution you want?

In [65]: df.resample('7D', on='dt').val.sum()
Out[65]: 
dt
2016-02-01    7
2016-02-08    7
2016-02-15    7
2016-02-22    7
2016-02-29    1
Freq: 7D, Name: val, dtype: int64

aiguofer · 2017-01-31T15:23:49Z

Thanks! I actually used groupby because in my real example I'm also adding other groupby cols, I just simplified for posting here.

The issue with the 7D approach is that I would have to make sure the first date available is on a Monday.

For example:

df = pd.DataFrame(pd.date_range('2016-01-29', '2016-02-29').rename('dt'))
df['val'] = 1
df.resample('7D', on='dt').val.sum()

dt
2016-01-29    7
2016-02-05    7
2016-02-12    7
2016-02-19    7
2016-02-26    4
Freq: 7D, Name: val, dtype: int64

I want a solution that will always group by weeks starting on Monday and using that Monday as the label, regardless of what dates are available (in my real data, the dates come from elsewhere).

jreback · 2017-01-31T15:30:07Z

@aiguofer

why is this not reasonable? (or the .resample version)

# option 1
a = df.groupby(pd.Grouper(key='dt',freq='W-MON', closed='left', label='left'))['val'].sum()

you are esentially saying that the defaults for anchored are not what you want.

aiguofer · 2017-01-31T15:36:55Z

In my mind, that's essentialy saying, "end the week on Monday, but close the inclusion on the left (so that the monday is included on the 'following' week), and use the previous week's label"

Additionally, since there is a MS and QS, it seems odd to have to do it differently with weeks.

In my app, I have a dropdown choice for how to group and I have to add a check for W to then add the other parameters, it'd be much simpler to do WS.

jreback · 2017-01-31T15:40:37Z

ahh, I c, you are angling for a WS and WE frequencies. IOW, like W-SUN, but have a tiny bit of special handling on inclusions in resampling.

jreback · 2017-01-31T15:41:56Z

this would not be very hard to implement actually. Its much more work to properly test this :>

ok if you would like to put up a PR for WS/WE (I actually think these should just work in resampling directly). that would be great!

aiguofer · 2017-01-31T15:42:07Z

Exactly. And I think it could even work with a specific day anchoring.

So for example, if someone wanted to bucket on weeks starting on SUN using the first day as label, they could do WS-SUN. If they wanted the end then they'd do W-SAT.

vipasha-cashify · 2019-07-05T06:36:09Z

Thanks...it helped a lot :)

jgrajos · 2021-01-22T10:37:12Z

Hi, Thanks for these comments. @aiguofer @jreback
Where i can find this changes? I would like add this feature to my code.

aiguofer · 2021-01-22T21:47:57Z

Hey, unfortunately I never had time to finish this, and I believe a lot has changed since then. You can see the original PR here: #15449

jgrajos · 2021-02-07T10:11:09Z

Thanks for your response.
I have used this lines for the purpose:

data = data.groupby(pd.Grouper(freq='W-SUN')).last()
new_index = data.index - pd.DateOffset(days=6)
data.index = new_index

It is works for me!

jreback added Timeseries and removed Reshaping labels May 16, 2014

jreback added this to the Someday milestone Mar 8, 2015

jreback modified the milestones: Next Major Release, Someday Jan 27, 2017

jreback added API Design Frequency DateOffsets labels Jan 31, 2017

jreback added Difficulty Intermediate labels Jan 31, 2017

jreback changed the title ~~Week Start/End Resamping~~ Week Start/End Resamping: WS/WE frequencies Jan 31, 2017

aiguofer pushed a commit to aiguofer/pandas that referenced this issue Feb 17, 2017

Initial work for pandas-dev#7049

6b3aca1

aiguofer mentioned this issue Feb 17, 2017

Week Start/End Resamping: WS/WE frequencies #15449

Closed

4 tasks

ghost mentioned this issue Jul 8, 2019

WIP: Add DayEnd, DayBegin Offsets (Help Wanted) #27087

Closed

3 tasks

jbrockmendel removed Effort Medium labels Oct 21, 2019

mroeschke added the Enhancement label Mar 31, 2020

mroeschke removed API Design Datetime Datetime data dtype labels Apr 11, 2021

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Week Start/End Resamping: WS/WE frequencies #7049

Week Start/End Resamping: WS/WE frequencies #7049

quasiben commented May 5, 2014

jreback commented May 5, 2014

quasiben commented May 5, 2014

jreback commented May 5, 2014

quasiben commented May 6, 2014

jreback commented May 6, 2014

quasiben commented May 6, 2014

jreback commented May 16, 2014

aiguofer commented Jan 27, 2017 •

edited

Loading

jreback commented Jan 31, 2017 •

edited

Loading

jreback commented Jan 31, 2017

aiguofer commented Jan 31, 2017

jreback commented Jan 31, 2017

aiguofer commented Jan 31, 2017 •

edited

Loading

jreback commented Jan 31, 2017

jreback commented Jan 31, 2017

aiguofer commented Jan 31, 2017 •

edited

Loading

vipasha-cashify commented Jul 5, 2019

jgrajos commented Jan 22, 2021

aiguofer commented Jan 22, 2021

jgrajos commented Feb 7, 2021

Week Start/End Resamping: WS/WE frequencies #7049

Week Start/End Resamping: WS/WE frequencies #7049

Comments

quasiben commented May 5, 2014

jreback commented May 5, 2014

quasiben commented May 5, 2014

jreback commented May 5, 2014

quasiben commented May 6, 2014

jreback commented May 6, 2014

quasiben commented May 6, 2014

jreback commented May 16, 2014

aiguofer commented Jan 27, 2017 • edited Loading

jreback commented Jan 31, 2017 • edited Loading

jreback commented Jan 31, 2017

aiguofer commented Jan 31, 2017

jreback commented Jan 31, 2017

aiguofer commented Jan 31, 2017 • edited Loading

jreback commented Jan 31, 2017

jreback commented Jan 31, 2017

aiguofer commented Jan 31, 2017 • edited Loading

vipasha-cashify commented Jul 5, 2019

jgrajos commented Jan 22, 2021

aiguofer commented Jan 22, 2021

jgrajos commented Feb 7, 2021

aiguofer commented Jan 27, 2017 •

edited

Loading

jreback commented Jan 31, 2017 •

edited

Loading

aiguofer commented Jan 31, 2017 •

edited

Loading

aiguofer commented Jan 31, 2017 •

edited

Loading