Skip to content

DOC: update the dtypes/ftypes docstring (Seoul) #20100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 12, 2018

Conversation

jongwony
Copy link
Contributor

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • PR title is "DOC: update the docstring"
  • The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single <your-function-or-method>
  • It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

# paste output of "scripts/validate_docstrings.py <your-function-or-method>" here
# between the "```" (remove this comment, but keep the "```")


################################################################################
##################### Docstring (pandas.DataFrame.dtypes)  #####################
################################################################################

Return the dtypes in this object.

Notes
-----
It returns a Series with the data type of each column.
If object contains data multiple dtypes in a single column,
dtypes will be chosen to accommodate all of the data types.
``object`` is the most general.

Examples
--------
>>> df = pd.DataFrame({'f': pd.np.random.rand(3),
...                    'i': 1,
...                    'd': pd.Timestamp('20180310'),
...                    'o': 'foo'})
>>> df.dtypes
f           float64
i             int64
d    datetime64[ns]
o            object
dtype: object

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	No extended summary found
	No returns section found
	See Also section not found


################################################################################
##################### Docstring (pandas.DataFrame.ftypes)  #####################
################################################################################

Return the ftypes (indication of sparse/dense and dtype)
in this object.

Notes
-----
Sparse data should have the same dtypes as its dense representation

See Also
--------
dtypes, SparseDataFrame

Examples
--------
>>> arr = pd.np.random.randn(100, 4)
>>> arr[arr < .8] = pd.np.nan
>>> pd.DataFrame(arr).ftypes
0    float64:dense
1    float64:dense
2    float64:dense
3    float64:dense
dtype: object
>>> pd.SparseDataFrame(arr).ftypes
0    float64:sparse
1    float64:sparse
2    float64:sparse
3    float64:sparse
dtype: object

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	No summary found (a short summary in a single line should be present at the beginning of the docstring)
	No returns section found
	Missing description for See Also "dtypes" reference
	Missing description for See Also "SparseDataFrame" reference

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.

Lastly, I left errors already occurred in the previous version without changes.

"""
Return the dtypes in this object.

Notes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove the "Notes" header, and just make this the extended summary.


Notes
-----
It returns a Series with the data type of each column.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe replace "it" with "This method". And let's say what the values and index is.

This returns a Series with the data type of each column. The result's index is the
original DataFrame's columns.

Let's also replace all instances of "object" with "DataFrame". I'm not sure why this is in generic.py since I think it's specific to DataFrame.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm modifying dtype @property of the NDFrame in generic.py
Is "This property" okay instead of "This method"?

At first, object was NDFrame, but found an error.

Private classes (['NDFrame']) should not be mentioned in public docstring.

but it seems to be better that the way you specify with "DataFrame", I will do that.


Examples
--------
>>> df = pd.DataFrame({'f': pd.np.random.rand(3),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just write out three floating point values? I'd like to avoid random data.

And FYI in general you don't want to use pd.np. You'll want to import NumPy manually (it's assumed to be imported in our docstrings.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, and could you just make all these length-1 lists, just to be clearer? so {'f': [1.0], 'i': [1], ...}


Notes
-----
Sparse data should have the same dtypes as its dense representation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

End in a .

2 float64:dense
3 float64:dense
dtype: object
>>> pd.SparseDataFrame(arr).ftypes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a blank line before this to break things up a bit.


Examples
--------
>>> arr = pd.np.random.randn(100, 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arr = np.random.RandomState(0).randn(100, 4) for reproducibility

"""Return the dtypes in this object."""
"""
Return the dtypes in this object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a Returns section as specified in the guide?

If object contains data multiple dtypes in a single column,
dtypes will be chosen to accommodate all of the data types.
``object`` is the most general.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in a See Also section ftypes could be mentioned.

@@ -4285,6 +4307,31 @@ def ftypes(self):
"""
Return the ftypes (indication of sparse/dense and dtype)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring guide asks that the summary should fit in a single line. Could you rephrase it that way? If all information could not fit in a single line you can then use an Extended Summary section.


See Also
--------
dtypes, SparseDataFrame
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of simply dtypes you should use pandas.DataFrame.dtypes. You can even try linking to dtypes with the notation found in the guide: :meth:`pandas.Series.sum`

Copy link
Contributor

@joaoavf joaoavf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some points that could be improved. Good sprint for you all!


Notes
-----
It returns a Series with the data type of each column.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a 'Returns' section to explain the output. Notes should be used only to explain technical details about the implementation of the algorithm or function behavior.

@@ -4275,7 +4275,29 @@ def get_ftype_counts(self):

@property
def dtypes(self):
"""Return the dtypes in this object."""
"""
Return the dtypes in this object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain in more depth to a novice user that this is used to get the dtypes per column of the DataFrame.

It returns a Series with the data type of each column.
If object contains data multiple dtypes in a single column,
dtypes will be chosen to accommodate all of the data types.
``object`` is the most general.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is worth explaining that str will be represented as object.

"""Return the dtypes in this object."""
"""
Return the dtypes in this object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a 'See Also' section to contemplate common dtypes.

@@ -4285,6 +4307,31 @@ def ftypes(self):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would better organized it there was pull request for dtypes and another pull request for ftypes.

@@ -4285,6 +4307,31 @@ def ftypes(self):
"""
Return the ftypes (indication of sparse/dense and dtype)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short summary should have 1 line.

@@ -4285,6 +4307,31 @@ def ftypes(self):
"""
Return the ftypes (indication of sparse/dense and dtype)
in this object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a return section.

-----
Sparse data should have the same dtypes as its dense representation

See Also
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'See Also' should go before notes.


See Also
--------
dtypes, SparseDataFrame
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a short description for the itens cited in the summary.

@jreback
Copy link
Contributor

jreback commented Mar 10, 2018

is this overlapping with #20099 ?

@jreback jreback added the Docs label Mar 10, 2018
>>> df = pd.DataFrame({'f': pd.np.random.rand(3),
... 'i': 1,
... 'd': pd.Timestamp('20180310'),
... 'o': 'foo'})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add in See Also to Series.dtype

@pep8speaks
Copy link

pep8speaks commented Mar 11, 2018

Hello @lastone9182! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 12, 2018 at 21:10 Hours UTC

@codecov
Copy link

codecov bot commented Mar 11, 2018

Codecov Report

Merging #20100 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #20100   +/-   ##
=======================================
  Coverage   91.72%   91.72%           
=======================================
  Files         150      150           
  Lines       49149    49149           
=======================================
  Hits        45083    45083           
  Misses       4066     4066
Flag Coverage Δ
#multiple 90.11% <ø> (ø) ⬆️
#single 41.85% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/generic.py 95.84% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 52cffa3...8058931. Read the comment docs.

@TomAugspurger TomAugspurger merged commit 22b6749 into pandas-dev:master Mar 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants