BUG: preserve frequency across Timestamp addition/subtraction (#4547) #6560

Merged
merged 1 commit into from
Mar 7, 2014

Conversation

rosnfeld
Contributor

@rosnfeld rosnfeld commented Mar 6, 2014

closes #4547

My second PR, on basically the same code.

Assuming people agree with my comments on #4547 (that users adding timedeltas that don't match frequencies better know what they are doing), then this would be my submission.
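
The behaviour described here (the frequency survives addition and subtraction, even when the added timedelta does not line up with that frequency) can be sketched in plain Python. This is only an illustration under stated assumptions: `FreqStamp` is a hypothetical stand-in class, not pandas' `Timestamp` and not the code in this PR.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class FreqStamp:
    """Hypothetical stand-in for a timestamp that carries a frequency label."""

    when: datetime
    freq: Optional[str] = None

    def __add__(self, delta: timedelta) -> "FreqStamp":
        # Carry the frequency over to the result instead of dropping it.
        return FreqStamp(self.when + delta, self.freq)

    def __sub__(self, delta: timedelta) -> "FreqStamp":
        return FreqStamp(self.when - delta, self.freq)


stamp = FreqStamp(datetime(2014, 1, 1), freq="D")
# Six hours is not a whole number of days; the frequency is kept anyway,
# on the assumption that the caller knows what they are doing.
shifted = stamp + timedelta(hours=6)
print(shifted)
```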

I'm also assuming that something is up with Travis and it is not anything I have done - nosetests passes fine for me though Travis still dies more or less at startup, while it didn't on older code. That problem is true of builds not including any code I have added, e.g. https://travis-ci.org/pydata/pandas/jobs/20167915 . Maybe something in PR #6068, or does this kind of thing happen from time to time?

@jreback
Contributor

jreback commented Mar 6, 2014

you might want to take a crack at #4553

something up with Travis ATM

@cancan101
Contributor

What is the freq/offset of a Timestamp used for? I think it would be cool if the offset were taken into account when printing the Timestamp.

@rosnfeld
Contributor Author

rosnfeld commented Mar 6, 2014

Not sure what you mean by "taken into account" - I can add it somewhat simply as in #4553 (just something like "freq='D'") but maybe you mean something more?

In terms of uses, in addition to the "next_calendar_quarter = current_calendar_quarter + 1" usage in #4547 (which is perhaps not advisable when one can increment by the more explicit tseries.offset objects) I see that freq is used in Timestamp.to_period(), which gets a bit of a demo in http://pandas.pydata.org/pandas-docs/stable/10min.html .
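
As a rough sketch of the "more explicit" alternative mentioned above, the snippet below steps to the next calendar quarter with an offset object and converts to a period by naming the frequency directly. The choice of `startingMonth=1` (quarters starting in January, April, July, October) is an assumption made for the example, not something stated in this thread.

```python
import pandas as pd

ts = pd.Timestamp("2014-01-01")

# Explicit offset object: move to the start of the next calendar quarter.
# startingMonth=1 is an assumption chosen so quarters begin in Jan/Apr/Jul/Oct.
next_quarter = ts + pd.offsets.QuarterBegin(startingMonth=1)

# Passing the frequency to to_period() avoids depending on any frequency
# stored on the Timestamp itself.
quarter = ts.to_period("Q")

print(next_quarter, quarter)
```

Spelling out the offset keeps the intent visible at the call site, whereas `+ 1` depends on whatever frequency the Timestamp happens to carry.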

@cancan101
Contributor

@rosnfeld What I meant by taken into account was that when calling str on a Timestamp of offset='M', it would be cool if the resulting string showed just the year and month rather than the date. This would be consistent with Period formatting:

In [29]: ts = pd.Timestamp("2014-1-1", offset="M")

In [30]: str(ts)
Out[30]: '2014-01-01 00:00:00'

In [31]: str(ts.to_period())
Out[31]: '2014-01'

@jreback
Contributor

jreback commented Mar 6, 2014

@cancan101 don't confuse the issue
this is repr; a period repr is currently wrong as well and needs to be fixed

a repr is something that can be copy-pasted to recreate the object

this is not pretty-printing

@cancan101
Contributor

@jreback I understand repr and str are not the same. I was just suggesting that it would be cool for the str formatting of Timestamp to take into account the offset when deciding how to format the Timestamp.

@jreback
Contributor

jreback commented Mar 6, 2014

no problem with that. can you propose some possibilities (e.g. Timestamps / Periods with and w/o freqs for various freqs)

@jreback
Contributor

jreback commented Mar 6, 2014

@rosnfeld can you rebase on current master and force push? There was an issue with caching of some deps; it should be resolved now.

@jreback jreback added this to the 0.14.0 milestone Mar 6, 2014
@rosnfeld
Contributor Author

rosnfeld commented Mar 6, 2014

Yup, done and Travis seems happy now.

jreback added a commit that referenced this pull request Mar 7, 2014
BUG: preserve frequency across Timestamp addition/subtraction (#4547)
@jreback jreback merged commit fb1b4a9 into pandas-dev:master Mar 7, 2014
@jreback
Contributor

jreback commented Mar 7, 2014

@rosnfeld thanks again!

don't be shy about looking for more PR's!

@rosnfeld rosnfeld deleted the issue_4547 branch March 7, 2014 00:17
@rosnfeld
Contributor Author

rosnfeld commented Mar 7, 2014

@cancan101 - I played with your suggestion a little bit as it sounded cool, but after working through some examples I'm not sure I am in favor of it. In my understanding, Timestamps represent an instant in time, whereas Periods represent a span of time, and I think we should present these differently to users.

One may have monthly samples of some quantity, but if it's actually a point-in-time sample, taken at say the end of the month vs the start of the month, I think I'd like to see that daily level of detail rather than just seeing the month. If one instead is working with some quantity averaged over the month (e.g. average temperature over a month) then it makes more sense to just show the month, but then I think one should be using a Period rather than a Timestamp.
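
The instant-versus-span distinction can be made concrete with a small sketch. The dates are arbitrary examples chosen for illustration; only the general behaviour of `Timestamp` and `Period` is assumed.

```python
import pandas as pd

instant = pd.Timestamp("2014-01-31")   # a single point in time
span = pd.Period("2014-01", freq="M")  # the whole of January 2014

print(str(instant))  # shows the full date and time
print(str(span))     # shows only the month

# A Period has a beginning and an end; a Timestamp either falls inside it or not.
print(span.start_time, span.end_time)
in_span = span.start_time <= instant <= span.end_time
print(in_span)
```

An end-of-month sample and a start-of-month sample both map to the same monthly Period, which is why the day-level detail is only visible on the Timestamp.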

Admittedly I haven't played with Periods much, I've just read about them in the Python for Data Analysis book.

@cancan101
Contributor

@rosnfeld What you say makes sense. That being said, why does the Timestamp even have an offset to begin with?

@rosnfeld
Contributor Author

rosnfeld commented Mar 7, 2014

@cancan101 Right, good question - I think it's because a Timestamp really can have a "frequency", it's an instant-in-time sample taken every so often. I don't think pandas does too much with this information at present.

@jreback
Contributor

jreback commented Mar 8, 2014

The freq on a Timestamp facilitates operations w.r.t. that frequency, e.g.

Timestamp('20130101', offset=pd.offsets.Second()) + 1
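
For comparison, the same one-second step can be written without relying on a frequency stored on the Timestamp, by adding the offset object directly. This is a minimal sketch rather than a statement about how the integer addition above is implemented.

```python
import pandas as pd

ts = pd.Timestamp("2013-01-01")

# Add the offset itself instead of an integer multiple of a stored frequency.
one_second_later = ts + pd.offsets.Second()

# Multiples of the offset work the same way.
three_seconds_later = ts + 3 * pd.offsets.Second()

print(one_second_later, three_seconds_later)
```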

Labels
Bug Frequency DateOffsets
Successfully merging this pull request may close these issues.

BUG: TimeStamp looses frequency info on arithmetic ops
3 participants