Adding sample datasets to be used in the documentation #19933

datapythonista · 2018-02-28T00:03:46Z

closes DOC: develop a set of standard example DataFrames for use in docstring examples #19710
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

jreback · 2018-03-01T01:19:26Z

I thought we had an issue about this already, IIRC @mrocklin either made it or commented (was for a slightly different purpose though).

jreback · 2018-03-01T01:20:19Z

We already have lots of data constructors in pandas.util.testing, though these are 'nicer' ones. These would need testing (e.g. do they run), and can be de-privatized (no leading _).

jorisvandenbossche · 2018-03-01T08:47:32Z

> I thought we had an issue about this already, IIRC @mrocklin either made it or commented (was for a slightly different purpose though).

I don't know if there is an older issue, but this is related to what is recently discussed in #19710 as @datapythonista linked to.

> We already have lots of data constructors in pandas.util.testing, though these are 'nicer' ones.

Indeed, I don't think the ones in util.testing are suitable for this, as the exact purpose of the new ones is to be 'nicer', relatable small dataframes.

datapythonista · 2018-03-02T14:57:17Z

Any more thoughts on this? Knowing which data to use for the examples in the docstrings is the main blocker for the sprint. Any feedback on how you think we can improve this first draft is highly appreciated. Thanks!

jorisvandenbossche · 2018-03-02T15:14:09Z

This needs to be imported in some __init__ files, as otherwise you cannot do pd.io.samples. .... Or what would be the intended use in the docs?

jreback · 2018-03-04T20:20:17Z

pandas/io/samples.py

+import pandas
+
+
+def _countries_with_penguins():


these need to be de-privatized (no leading _)

jreback · 2018-03-04T20:20:22Z

pandas/io/samples.py

+    columns = ('Code', 'Name', 'Capital', 'Continent',
+               'Penguin species', 'Avg. temperature')
+    data = [
+        ('AO', 'Angola', 'Luanda', 'AF', 1, 21.55),


these need tests
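
A minimal sketch of the kind of smoke test requested here. It assumes a de-privatized constructor named `countries_with_penguins` (a hypothetical name; the PR defines `_countries_with_penguins`) that returns a DataFrame. Only the columns and the single data row visible in the diff excerpt above are reproduced; the rest of the PR's data is not shown in this thread and is left out.

```python
import pandas as pd

# Column names copied from the diff excerpt above.
COLUMNS = ('Code', 'Name', 'Capital', 'Continent',
           'Penguin species', 'Avg. temperature')


def countries_with_penguins():
    """Hypothetical public version of the PR's private constructor."""
    # Only the one row visible in the review excerpt is included here.
    data = [
        ('AO', 'Angola', 'Luanda', 'AF', 1, 21.55),
    ]
    return pd.DataFrame(data, columns=list(COLUMNS))


def test_countries_with_penguins():
    # Smoke test: the constructor runs and returns the expected structure.
    df = countries_with_penguins()
    assert isinstance(df, pd.DataFrame)
    assert tuple(df.columns) == COLUMNS
    assert not df.empty
    # Country codes are expected to identify rows uniquely.
    assert df['Code'].is_unique
```

A test like this only checks that the constructor runs and has a stable shape, which seems to be the "do they run" concern raised earlier in the thread.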

datapythonista · 2018-03-06T00:08:15Z

From what has been discussed in #19710, it seems to make more sense to simply have some ideas on data to be used, but to use custom datasets that are as simple and illustrative as possible for each case. So, closing this.
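
As an illustration of the approach settled on here, the sketch below builds a tiny dataset inline in the docstring example instead of importing a shared sample. The function `total_legs` and its data are hypothetical, chosen only to keep the example short; they do not come from the PR or from pandas itself.

```python
import pandas as pd


def total_legs(df):
    """
    Return the sum of the ``legs`` column as a plain int.

    Examples
    --------
    >>> df = pd.DataFrame({'animal': ['penguin', 'dog'],
    ...                    'legs': [2, 4]})
    >>> total_legs(df)
    6
    """
    # Convert to a built-in int so the doctest output does not depend
    # on how NumPy scalars are displayed.
    return int(df['legs'].sum())
```

Because the data lives next to the example, the reader sees exactly what goes in and what comes out, and the docstring can be checked with the standard `doctest` module.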

Adding sample datasets to be used in the documentation

9eb8a8d

jorisvandenbossche added the Docs label Feb 28, 2018

jreback requested changes Mar 4, 2018

datapythonista closed this Mar 6, 2018
