Skip to content

ENH/BUG: Offset.apply dont preserve time #7375

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 11, 2014
Merged

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Jun 6, 2014

Closes #7156.

@jreback jreback added this to the 0.14.1 milestone Jun 6, 2014
@@ -1208,7 +1395,7 @@ def test_offset(self):
def test_normalize(self):
dt = datetime(2007, 1, 1, 3)

result = dt + BMonthEnd()
result = dt + BMonthEnd(normalize=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an API change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on definition. Whether to regard BMonthEnd and MonthEnd inconsistencies are spec or bug. From users point of view, I think it looks a bug as most of other offsets behave differently.

@sinhrks sinhrks changed the title ENH/BUG: Offset.aplly dont preserve time ENH/BUG: Offset.apply dont preserve time Jun 7, 2014
@@ -115,8 +115,19 @@ Enhancements
- Implemented ``sem`` (standard error of the mean) operation for ``Series``,
``DataFrame``, ``Panel``, and ``Groupby`` (:issue:`6897`)

- All ``offsets`` suppports ``normalize`` keyword to specify whether ``offsets.apply``, ``rollforward`` and ``rollback`` resets time (hour, minute, etc) or not (default ``False``, preserves time) (:issue:`7156`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you move these to API section (they technically are a missing 'feature', but just to make it obvious)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine, pls move to API section, then good 2 go

@sinhrks
Copy link
Member Author

sinhrks commented Jun 11, 2014

@jreback Moved the description to API, and rebased.

jreback added a commit that referenced this pull request Jun 11, 2014
ENH/BUG: Offset.apply dont preserve time
@jreback jreback merged commit 4233c0f into pandas-dev:master Jun 11, 2014
@sinhrks sinhrks deleted the offsets branch June 14, 2014 02:05
@jorisvandenbossche
Copy link
Member

@jreback While trying to clean up the whatsnew page for 0.14.1, I wanted to rewrite the api change on the offsets a bit (as it is now not clear what did change), but I stumbled on this:

In [26]: d
Out[26]: Timestamp('2014-01-01 09:00:00')

In [27]: d + offsets.MonthEnd()
Out[27]: Timestamp('2014-01-31 09:00:00')

In [28]: d + offsets.MonthEnd(normalize=True)
Out[28]: Timestamp('2014-01-31 00:00:00')

In [29]: d + offsets.Day()
Out[29]: Timestamp('2014-01-02 09:00:00')

In [30]: d + offsets.Day(normalize=True)
Out[30]: Timestamp('2014-01-02 09:00:00')

In [31]: d + offsets.MonthOffset()
Out[31]: Timestamp('2014-01-02 09:00:00')

So for MonthEnd it seems to work as expected, but with Day normalize has no effect, and MonthOffset gives the same as Day. Are these bugs, or just due to my misunderstanding (I have to say that I almost never use offsets directly)

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

cc @sinhrks ?

@jorisvandenbossche
Copy link
Member

Actually, looking at the dev docs now, and the example of Day is even in the docs (and so incorrect there): just above http://pandas-docs.github.io/pandas-docs-travis/timeseries.html#parametric-offsets

Correction: in the docs it is correct (with apply it seems to work correctly), but still the behaviour above seems strange.

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

yes it looks like a bug with Day which is handled in Tick

@sinhrks
Copy link
Member Author

sinhrks commented Jul 9, 2014

Yes, it is a bug. Currently normalize is effective only if addition or subtraction internally calls apply. Fixed in #7697.

@jorisvandenbossche
Copy link
Member

And the MonthOffset issue?

And yet something else:

In [55]: offsets.YearOffset()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-55-b53e59b753d1> in <module>()
----> 1 offsets.YearOffset()

c:\users\vdbosscj\scipy\pandas-joris\pandas\tseries\offsets.pyc in __init__(self
, n, normalize, **kwds)
   1331
   1332     def __init__(self, n=1, normalize=False, **kwds):
-> 1333         self.month = kwds.get('month', self._default_month)
   1334
   1335         if self.month < 1 or self.month > 12:

AttributeError: 'YearOffset' object has no attribute '_default_month'

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

can u separate out the fix for this into a new pr?

@sinhrks
Copy link
Member Author

sinhrks commented Jul 9, 2014

@jreback Sure, will do shortly.

@jorisvandenbossche YearOffset looks an wrapper. It doesn't define effective apply and other methods. Maybe instanciation should be prohibited?

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

@sinhrks In theory YearOffset should work like MonthOffset. open a new bug report for that pls.

@sinhrks
Copy link
Member Author

sinhrks commented Jul 9, 2014

OK. Is it correct DateOffset, MonthOffset and YearOffset can be used in itselves (even though not included in pd.offsets.__all__). The result should be a timestamp increased one unit of the day, month and year?

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

@sinhrks that's correct (I personally find that pretty confusing), as ts + 1 does this.

so its a semi-private method (and one should really use MonthEnd et all)

@jorisvandenbossche
Copy link
Member

I created seperate issues for the YearOffset and MonthOffset issues I reported here. But do I understand correctly that they are not really 'meant to be used' like this?

As I expect MonthOffset to do the same as DateOffset(months=1) (and this is something different as MonthEnd):

In [76]: d + offsets.DateOffset(months=1)
Out[76]: Timestamp('2014-02-01 09:00:00')

In [77]: d + offsets.MonthOffset()
Out[77]: Timestamp('2014-01-02 09:00:00')

In [78]: d + offsets.MonthEnd()
Out[78]: Timestamp('2014-01-31 00:00:00')

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

@jorisvandenbossche its iffy. I don't see any explicty usage of them in docs/otherwise

I think the idea was that once could have it programatically select the class, e.g.

DateOffset(months=2) -> MonthEnd(2)

which does seem useful. Maybe need to figure out use/doc it (or clearly mark it as semi-private).

@jorisvandenbossche
Copy link
Member

I was just updating my comment above: DateOffset(months=2) is not the same as MonthEnd(2)! And I would expect that MonthOffset would be.

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

sorry, I mean Month(2), e.g. just advance me 2 months.

@jorisvandenbossche
Copy link
Member

but Month does not exist? You only have MonthBegin, MonthEnd, MonthOffset in pd.tseries.offsets

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

Month is de-facto MonthOffset(months=n)

The problem it is that

Timestamp('20130105') + pd.offsets.MonthOffset(months=1) works! but pd.offsets.MonthOffset(1) is de-factor a 1-day increase as the default arg is day

@jorisvandenbossche
Copy link
Member

aha, so the __init__ shouldbe overwritten in MonthOffset?
(and where is Month? I get AttributeError: 'module' object has no attribute 'Month')

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

I think the contructor is ok, it needs an apply (kind of like MonthEnd has , but w/o the month end stuff).

The question remains are DateOffset/MonthOffset/YearOffset public, or abc methods?

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

I think this is related as well: #4804

@jorisvandenbossche
Copy link
Member

If we decide they are not public, they should get a _ name.
If it is public, I certainly think the constructor is not OK, the default arg should not be day for MonthOffset.

For the rest I can't really say if they should be public or not, I find all this offset stuff quite confusing, and the docs are rather scarce.

@jreback
Copy link
Contributor

jreback commented Jul 9, 2014

ok, you have highlited the issues, I don't think its hard to 'fix' them to be correct. Let's do that, then maybe can expand the docs a bit / improve doc-strings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Frequency DateOffsets Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Some date-like Offset.apply resets time
4 participants