Skip to content

DOC: Fix copy-pasted name in .day_name docstring #22544

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 8, 2018

Conversation

Peque
Copy link
Contributor

@Peque Peque commented Aug 30, 2018

Just a small change... Feel free to close and include this change in any other typo-fixing commit if you like. 😊

@WillAyd
Copy link
Member

WillAyd commented Aug 30, 2018

You can simply remove day_names and month_names from the docstrings since they don't reference any variable that actually exists. It's OK for a returns to simply describe the type of object being returned.

See https://python-sprints.github.io/pandas/guide/pandas_docstring.html#section-4-returns-or-yields

@Peque
Copy link
Contributor Author

Peque commented Aug 30, 2018

@WillAyd Actually, I prefer it that way, did not know it was allowed in Pandas documentation. 😇

Changed it and force-pushed the commit.

@WillAyd
Copy link
Member

WillAyd commented Aug 30, 2018

Do these pass the validate_docstring script? I think there is at least a missing period at the end of the descriptions that we should probably just clean up while touching. Maybe a couple more

@codecov
Copy link

codecov bot commented Aug 30, 2018

Codecov Report

Merging #22544 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22544      +/-   ##
==========================================
- Coverage   92.04%   92.04%   -0.01%     
==========================================
  Files         169      169              
  Lines       50787    50787              
==========================================
- Hits        46746    46745       -1     
- Misses       4041     4042       +1
Flag Coverage Δ
#multiple 90.45% <ø> (-0.01%) ⬇️
#single 42.29% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/arrays/datetimes.py 95.52% <ø> (ø) ⬆️
pandas/util/_depr_module.py 65.11% <0%> (-2.33%) ⬇️
pandas/core/strings.py 98.63% <0%> (ø) ⬆️
pandas/core/indexes/multi.py 95.41% <0%> (ø) ⬆️
pandas/core/frame.py 97.2% <0%> (ø) ⬆️
pandas/util/testing.py 85.75% <0%> (ø) ⬆️
pandas/core/indexes/base.py 96.45% <0%> (ø) ⬆️
pandas/core/resample.py 96.13% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3285bdc...1216484. Read the comment docs.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above if you could run these docstrings against the validation script and fix any issues would be ideal. Should be easy enough to do in one pass

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look great.

But as @WillAyd said, if you don't mind taking care of the whole docstring, that would be very useful.

Meaning:

  • Replace the parameter type from string to str
  • Capitalize the first letter of the parameter description, and finish it with period
  • Move that None means English locale from the type line, to the description of the parameter line.
  • Finish the returns description with a period
  • Add a simple example

Thanks!

@jreback jreback added this to the 0.24.0 milestone Aug 31, 2018
@Peque
Copy link
Contributor Author

Peque commented Aug 31, 2018

Thanks all for your feedback. Updated the docstrings with some of the @datapythonista suggestions (did not add any examples though).

I was unable to run validate_docstrings.py (got a ModuleNotFoundError: No module named 'pandas._libs.window' exception). 😕 Did not spend much time trying to fix it, I must admit.

As a side note, did you ever think about releasing that docstring checker as a separate package? I bet many other projects could benefit from it. 😊

@WillAyd
Copy link
Member

WillAyd commented Aug 31, 2018

Looks like you need to build the C extensions

python setup.py build_ext --inplace -j 4

Not aware of a conversation to have as a separate package but @gfyoung might have some thoughts on that as well

@Peque
Copy link
Contributor Author

Peque commented Aug 31, 2018

@WillAyd Yeah, that failed too with an error, that's when I stopped trying. 😂 I think I just missed the C++ compiler. May try again later...

@Peque
Copy link
Contributor Author

Peque commented Aug 31, 2018

@WillAyd Yeah, I was only missing a C++ compiler.

Validated the docstrings. No errors found. 👍

locale determining the language in which to return the month name
locale : str
Locale determining the language in which to return the month name.
Default is `None` (English locale).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of saying this here simply say locale : str, optional

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry my previous comment was misleading, let me add more detail.

In the first line, after the name of the parameter and the colon, we have the (Python) type or types, and then the default value, and nothing else (e.g. param_name : int or str, default 0). In case the default value is None and this None is just a flag to indicate something, we use instead of default None the word optional. As a example, if fillna(value=None) means that the None is the value used to impute, we will still use default None. If None meant that we won't impute anything, we would have optional instead.

What we need to have in this case is that None means that 'English locale' will be used. But anything except the format mentioned before should go in the description. Besides consistency, the reason is that at some point it'd be nice to be able to parse all the information in the line of the type. And if we have explanations there, they'll be a problem.

You have more documentation here: https://python-sprints.github.io/pandas/guide/pandas_docstring.html#section-3-parameters

locale determining the language in which to return the day name
locale : str
Locale determining the language in which to return the day name.
Default is `None` (English locale).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes @Peque, looking better.

Besides those comments, do you think you can add a simple example? I think it'd add a lot of value to let people understand easier what the method does.

locale determining the language in which to return the month name
locale : str
Locale determining the language in which to return the month name.
Default is `None` (English locale).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry my previous comment was misleading, let me add more detail.

In the first line, after the name of the parameter and the colon, we have the (Python) type or types, and then the default value, and nothing else (e.g. param_name : int or str, default 0). In case the default value is None and this None is just a flag to indicate something, we use instead of default None the word optional. As a example, if fillna(value=None) means that the None is the value used to impute, we will still use default None. If None meant that we won't impute anything, we would have optional instead.

What we need to have in this case is that None means that 'English locale' will be used. But anything except the format mentioned before should go in the description. Besides consistency, the reason is that at some point it'd be nice to be able to parse all the information in the line of the type. And if we have explanations there, they'll be a problem.

You have more documentation here: https://python-sprints.github.io/pandas/guide/pandas_docstring.html#section-3-parameters

month_names : Index
Index of day names
Index
Index of day names.

.. versionadded:: 0.23.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the versionadded would be more useful / standard after the description, not here at the end. @WillAyd what do you think?

@Peque Peque force-pushed the fixdocs branch 2 times, most recently from 1b473a7 to 937ff6b Compare September 2, 2018 10:35
@Peque
Copy link
Contributor Author

Peque commented Sep 2, 2018

I uploaded some changes to better match the preferred style and add simple examples (this time I read the style guide 😇).

PS: in the style guide that you shared with me, the first example is missing dots after descriptions. That may be misleading for the lazy (like me) that only read the first example and "infer" the rules. 😜

locale determining the language in which to return the month name
locale : str, optional
Locale determining the language in which to return the month name.
Default is English locale.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is pandas making the default English? Haven't looked in detail but assumed this would be system dependent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that to be system independent (when programming, I'd rather have predictable results).

Code is implemented in get_date_name_field():

if locale is None:
names = np.array(DAYS_FULL, dtype=np.object_)

if locale is None:
names = np.array(MONTHS_FULL, dtype=np.object_)

DAYS_FULL and MONTHS_FULL are hard-coded in Pandas:

MONTHS_FULL = ['', 'January', 'February', 'March', 'April', 'May', 'June',
'July', 'August', 'September', 'October', 'November',
'December']
MONTH_NUMBERS = {name: num for num, name in enumerate(MONTHS)}
MONTH_ALIASES = {(num + 1): name for num, name in enumerate(MONTHS)}
MONTH_TO_CAL_NUM = {name: num + 1 for num, name in enumerate(MONTHS)}
DAYS = ['MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN']
DAYS_FULL = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
'Saturday', 'Sunday']

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was for day_name to replace the depreciated weekday_name which defaulted to hardcoded english names (without localization support).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree this should say the default is based on the locale

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback As you can see in the code snippets above, it is not based on the locale. The default is always English.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I do not think the code I am running is relevant here. Unless you refer to a trivial example like the one I added in the docstring (but why would you need that?):

idx = pd.DatetimeIndex(start='2018-09-02', freq='D', periods=3)
idx.day_name()

Can you explain why you still say the default when locale=None is based on the system's locale? I checked the tests, the code and the behavior and everything seems to prove otherwise: the default when locale=None is English, and hence the proposed docstring. Did you check anything?

Copy link
Contributor Author

@Peque Peque Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback If you tell me exactly what changes to the current docstring you have on your mind (it seems other reviewers are okay with it), I may better understand what you are saying. ❤️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Peque @jreback The default locale is English as day_name is meant to replace weekday_name which defaulted to english. So the docstring here is correct.

@Peque feel free to open up a new issue suggesting that day_name and month_name should default to system locale. This would be an API change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke Yeah, but I do prefer the current behavior, so I would rather not open that issue (and note I am not a native English speaker nor live in an English country).

When programming, I prefer everything in English by default. 😜

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha @Peque. I hold a similar opinion for API transition sake.

Therefore, I think there's no action with this docstring here as far as I'm concerned.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - @datapythonista merge away

locale determining the language in which to return the month name
locale : str, optional
Locale determining the language in which to return the month name.
Default is English locale.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree this should say the default is based on the locale

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, added couple of suggestions that I think could make the examples a bit simpler.

Regarding the locale I think you're just misunderstanding each other.

--------
>>> idx = pd.DatetimeIndex(start='2018-09', freq='M', periods=3)
>>> idx.month_name()
Index(['September', 'October', 'November'], dtype='object')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't mind, can you first show idx (i.e. >>> idx after creating the index). Also, if we use '2018-01' it'd make the example more obvious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will change it and upload it later today. 👍

About the locale discussion, I too think it must be a misunderstanding. With an average of 10 words per comment, I am having a hard time trying to understand what he tries to say. 😅

If you understand both of us, could you help us? 😂

--------
>>> idx = pd.DatetimeIndex(start='2018-09-02', freq='D', periods=3)
>>> idx.day_name()
Index(['Sunday', 'Monday', 'Tuesday'], dtype='object')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before.

- To fix confusing `month_index` return name in `day_name` method
- To better match the preferred docstring style.
- To add simple examples.
@pep8speaks
Copy link

Hello @Peque! Thanks for updating the PR.

@Peque
Copy link
Contributor Author

Peque commented Sep 4, 2018

@datapythonista Updated the PR showing the idx values and using a different start.

@jreback jreback merged commit ba6a072 into pandas-dev:master Sep 8, 2018
@jreback
Copy link
Contributor

jreback commented Sep 8, 2018

thanks @Peque

aeltanawy pushed a commit to aeltanawy/pandas that referenced this pull request Sep 20, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants