DOC: MonthBegin suggestion to get start of current month not quite right? #52106

Closed
1 task done
MarcoGorelli opened this issue Mar 21, 2023 · 11 comments · Fixed by #52161

@MarcoGorelli
Member

MarcoGorelli commented Mar 21, 2023

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/dev/reference/api/pandas.tseries.offsets.MonthBegin.html?highlight=monthbegin#pandas.tseries.offsets.MonthBegin

Documentation problem

The docs for MonthEnd say

To get the end of the current month pass the parameter n equals 0.

This is correct:

In [51]: pd.Timestamp('2023-03-21') + pd.offsets.MonthEnd(n=0)
Out[51]: Timestamp('2023-03-31 00:00:00')

But for MonthBegin, it says

To get the start of the current month pass the parameter n equals 0.

I think this is not quite right:

In [58]: pd.Timestamp('2023-03-21') + pd.offsets.MonthBegin(n=0)
Out[58]: Timestamp('2023-04-01 00:00:00')

Suggested fix for documentation

I would have expected - pd.offsets.MonthBegin(n=0) to go to the beginning of the current month, but it doesn't:

In [64]: pd.Timestamp('2023-03-21') - pd.offsets.MonthBegin(n=0)
Out[64]: Timestamp('2023-04-01 00:00:00')

I'm not sure if this is a bug, I've not looked closely enough.
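One way to see why adding and subtracting give the same result here: negating an offset flips the sign of n, and negating zero still gives zero. A minimal sketch, assuming a recent pandas 1.x/2.x release (the variable names are only for illustration):

```python
import pandas as pd

ts = pd.Timestamp("2023-03-21")
zero = pd.offsets.MonthBegin(n=0)

# Negating the offset multiplies n by -1, so a zero offset stays a zero offset.
negated_n = (-zero).n

# Subtraction is handled as "add the negated offset", so both expressions
# end up applying MonthBegin(n=0), which rolls forward to the next month start.
added = ts + zero
subtracted = ts - zero
print(negated_n, added, subtracted)
```

So the sign in front of a zero offset makes no difference to the direction of the roll.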

I think I'd suggest:

  • removing the "To get the start of the current month pass the parameter n equals 0." suggestion
  • adding an example of using + pd.offsets.MonthBegin() - pd.offsets.MonthBegin() to get the start of the current month
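The second suggestion could look roughly like the sketch below. The helper name `month_start` is hypothetical and exists only for this illustration; it is not part of pandas.

```python
import pandas as pd


def month_start(ts: pd.Timestamp) -> pd.Timestamp:
    """Hypothetical helper: step to the next month start, then one month start back."""
    return ts + pd.offsets.MonthBegin() - pd.offsets.MonthBegin()


mid_month = month_start(pd.Timestamp("2023-03-21"))
# A timestamp already on the first of the month moves forward and then back again.
on_boundary = month_start(pd.Timestamp("2023-03-01"))
print(mid_month, on_boundary)
```

Both calls land on 2023-03-01, including the case where the input is already the first of the month.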

cc @natmokval in case you wanted to make a PR (no blame of course! I should have noticed this when reviewing)

@MarcoGorelli added the Docs, Needs Triage, and Frequency labels and removed the Needs Triage label on Mar 21, 2023
@MarcoGorelli
Member Author

Same story in #52105

I'll take a look this week at whether - MonthBegin(n=0) not going to the beginning of the current month is a bug or not; that'll inform which change needs making to the docs

@MarcoGorelli
Member Author

Right, the docs for DateOffset say

Zero presents a problem. Should it roll forward or back? We arbitrarily have it rollforward:
date + BDay(0) == BDay.rollforward(date)
Since 0 is a bit weird, we suggest avoiding its use.

So, we probably shouldn't recommend it in the docs.
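The quoted rule can be checked directly for MonthBegin. This is a small sketch using only `rollforward` and `rollback`, with illustrative variable names:

```python
import pandas as pd

ts = pd.Timestamp("2023-03-21")
offset = pd.offsets.MonthBegin()

# n=0 behaves like rollforward: it moves to the next month start.
zero_result = ts + pd.offsets.MonthBegin(n=0)
forward = offset.rollforward(ts)

# rollback is the explicit way to move to the current month's start.
back = offset.rollback(ts)

# A timestamp that is already on a month start is left unchanged by both.
on_boundary = pd.Timestamp("2023-03-01")
unchanged_forward = offset.rollforward(on_boundary)
unchanged_back = offset.rollback(on_boundary)
print(zero_result, forward, back)
```

Spelling out `rollforward` or `rollback` makes the direction explicit, which `n=0` does not.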

How about we change the docs to say

To get the start of the current month use .rollback.

and have an example like

In [26]: pd.offsets.MonthBegin().rollback(pd.Timestamp('2023-03-21'))
Out[26]: Timestamp('2023-03-01 00:00:00')

as well as an example like

In [37]: pd.to_datetime(["2017-09-18", "2018-01-01"]).to_period('M').to_timestamp()
Out[37]: DatetimeIndex(['2017-09-01', '2018-01-01'], dtype='datetime64[ns]', freq=None)

to show how to get back to the beginning of the month if one has an array?
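For a Series rather than a DatetimeIndex, the same idea goes through the `.dt` accessor. A sketch, assuming timezone-naive data (the Series `s` is made up for the example):

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2017-09-18", "2018-01-01"]))

# Vectorised: convert to monthly periods, then back to timestamps at the period start.
month_starts = s.dt.to_period("M").dt.to_timestamp()

# Element-wise alternative using rollback; likely slower on large data, same values here.
rolled_back = s.map(pd.offsets.MonthBegin().rollback)
print(month_starts.tolist())
```

Both approaches give 2017-09-01 and 2018-01-01 for this input.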

@pranav-ravuri
Contributor

Hi, can I take this up as my first issue and update the docs? Thank you

@MarcoGorelli
Member Author

go ahead, thanks!

@pranav-ravuri
Contributor

take

@pranav-ravuri
Contributor

Hi @MarcoGorelli, quick question: I have been reading the pandas contributor docs but I am unable to find a guideline on how to name my local (forked) branch for this change. Could you please help me with that? Thanks

@MarcoGorelli
Member Author

hey - it doesn't really matter what you name your branch, up to you

@pranav-ravuri
Contributor

pranav-ravuri commented Mar 23, 2023

Hi @MarcoGorelli, I updated the MonthBegin docstring to

    """
    DateOffset of one month at beginning.

    MonthBegin goes to the next date which is a start of the month.

    See Also
    --------
    :class:`~pandas.tseries.offsets.DateOffset` : Standard kind of date increment.

    Examples
    --------
    >>> ts = pd.Timestamp(2022, 11, 30)
    >>> ts + pd.offsets.MonthBegin()
    Timestamp('2022-12-01 00:00:00')

    >>> ts = pd.Timestamp(2022, 12, 1)
    >>> ts + pd.offsets.MonthBegin()
    Timestamp('2023-01-01 00:00:00')

    If you want to get the start of the current month:

    >>> ts = pd.Timestamp(2022, 12, 1)
    >>> pd.offsets.MonthBegin().rollback(ts)
    Timestamp('2022-12-01 00:00:00')
    """
    _prefix = "MS"
    _day_opt = "start"

and MonthEnd docstring to

    """
    DateOffset of one month end.

    MonthEnd goes to the next date which is an end of the month.

    See Also
    --------
    :class:`~pandas.tseries.offsets.DateOffset` : Standard kind of date increment.

    Examples
    --------
    >>> ts = pd.Timestamp(2022, 1, 30)
    >>> ts + pd.offsets.MonthEnd()
    Timestamp('2022-01-31 00:00:00')

    >>> ts = pd.Timestamp(2022, 1, 31)
    >>> ts + pd.offsets.MonthEnd()
    Timestamp('2022-02-28 00:00:00')

    If you want to get the end of the current month pass the parameter n equals 0:

    >>> ts = pd.Timestamp(2022, 1, 31)
    >>> pd.offsets.MonthEnd().rollforward(ts)
    Timestamp('2022-01-31 00:00:00')
    """
    _period_dtype_code = PeriodDtypeCode.M
    _prefix = "M"
    _day_opt = "end"

So, is this okay, and are there any more docstrings I need to update? On a side note, out of curiosity, what is

    _period_dtype_code = PeriodDtypeCode.M
    _prefix = "M"
    _day_opt = "end"

at the end of the docstring?

@MarcoGorelli
Member Author

If you want to get the end of the current month pass the parameter n equals 0:

This needs updating, but the rest looks good!

Those attributes are used internally in some other places (but there is an open PR to change 'M' to 'ME', as 'M' meaning "month end" is very confusing)
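As a rough illustration of where those strings surface publicly: my understanding (an assumption, not confirmed in this thread) is that `_prefix` corresponds to the frequency alias reported by `freqstr` and accepted by `to_offset` and `date_range`.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# The alias for month start is "MS".
month_begin_alias = pd.offsets.MonthBegin().freqstr

# The month-end alias depends on the pandas version, so it is only printed here.
print(pd.offsets.MonthEnd().freqstr)

# The alias can be turned back into an offset object, or used as a frequency.
parsed = to_offset("MS")
starts = pd.date_range("2023-01-01", periods=3, freq="MS")
print(month_begin_alias, parsed, list(starts))
```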

@pranav-ravuri
Contributor

If you want to get the end of the current month pass the parameter n equals 0:

This needs updating, but the rest looks good!

Those attributes are used internally in some other places (but there is an open PR to change 'M' to 'ME', as 'M' meaning "month end" is very confusing)

So, are they aliases of some sort? Could you please direct me to the concept being used here internally? On another note, I have updated the MonthBegin and MonthEnd docstrings, and as a sanity check I looked at whether BMonthBegin and BMonthEnd had the same issue. They do:

>>> pd.Timestamp('2023-03-21') + pd.offsets.BMonthEnd(n=0)
Timestamp('2023-03-31 00:00:00')  # okay
>>> pd.Timestamp('2023-03-21') + pd.offsets.BMonthBegin(n=0)
Timestamp('2023-04-03 00:00:00')  # not okay

Should I update these docs also?
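For reference, the same rollback/rollforward approach appears to carry over to the business-month offsets. A sketch with illustrative variable names:

```python
import pandas as pd

ts = pd.Timestamp("2023-03-21")

# First business day of the current month (1 March 2023 is a Wednesday).
bmonth_start = pd.offsets.BMonthBegin().rollback(ts)

# Last business day of the current month (31 March 2023 is a Friday).
bmonth_end = pd.offsets.BMonthEnd().rollforward(ts)

# When the 1st falls on a weekend, rollback lands on the first weekday instead:
# 1 April 2023 is a Saturday, so the business month starts on Monday 3 April.
april_start = pd.offsets.BMonthBegin().rollback(pd.Timestamp("2023-04-15"))
print(bmonth_start, bmonth_end, april_start)
```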

@MarcoGorelli
Member Author

MarcoGorelli commented Mar 24, 2023

Should I update these docs also?

yes please! as well as any others which you find

it would be good to get this in in time for the 2.0 release, so please do open a pull request even if it's not 100% finished, I can finish it up if necessary
