DOC: Fix examples in documentation #31472

Merged
merged 17 commits into from
Mar 7, 2020

ShaharNaveh
Member

@ShaharNaveh ShaharNaveh commented Jan 30, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@pep8speaks

pep8speaks commented Jan 30, 2020

Hello @MomIsBestFriend! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-02-27 17:35:24 UTC

"pandas_version": "0.20.0"},
"data": [{"index": "row 1", "col 1": "a", "col 2": "b"},
{"index": "row 2", "col 1": "c", "col 2": "d"}]}'
'{"schema":{"fields":[{"name":"index","type":"string"},{"name":"col 1","type":"string"},{"name":"col 2","type":"string"}],"primaryKey":["index"],"pandas_version":"0.20.0"},"data":[{"index":"row 1","col 1":"a","col 2":"b"},{"index":"row 2","col 1":"c","col 2":"d"}]}'
Member Author

Because of lines like these (and basically every other output line returned by df.to_json()), I think it would be a good idea to include a "pprint" example under each one, so it would look something like this:

def example():
    """
    Examples
    --------
    Encoding with table schema

    >>> df = pd.DataFrame(
    ...     [["a", "b"], ["c", "d"]],
    ...     index=["row 1", "row 2"],
    ...     columns=["col 1", "col 2"],
    ... )

    >>> df.to_json(orient='table')
    '{"schema":{"fields":[{"name":"index","type":"string"},{"name":"col 1","type":"string"},{"name":"col 2","type":"string"}],"primaryKey":["index"],"pandas_version":"0.20.0"},"data":[{"index":"row 1","col 1":"a","col 2":"b"},{"index":"row 2","col 1":"c","col 2":"d"}]}'


    Pretty print version:

    >>> import json
    >>> result = df.to_json(orient="table")
    >>> parsed = json.loads(result)
    >>> print(json.dumps(parsed, indent=4))
    {
        "schema": {
            "fields": [
                {
                    "name": "index",
                    "type": "string"
                },
                {
                    "name": "col 1",
                    "type": "string"
                },
                {
                    "name": "col 2",
                    "type": "string"
                }
            ],
            "primaryKey": [
                "index"
            ],
            "pandas_version": "0.20.0"
        },
        "data": [
            {
                "index": "row 1",
                "col 1": "a",
                "col 2": "b"
            },
            {
                "index": "row 2",
                "col 1": "c",
                "col 2": "d"
            }
        ]
    }
    """
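
The pretty-print idea can be tried without pandas. The sketch below assumes the compact string is exactly what `df.to_json(orient="table")` returned in the example above (it is copied in as a literal, not generated), and uses only the standard `json` module. Note that a bare `json.dumps(...)` at the interactive prompt echoes the string's repr with `\n` escapes, so `print` is used to get the indented layout.

```python
import json

# Compact output copied from the df.to_json(orient="table") example above
# (assumption: pasted as a literal so that pandas is not required here).
compact = (
    '{"schema":{"fields":[{"name":"index","type":"string"},'
    '{"name":"col 1","type":"string"},{"name":"col 2","type":"string"}],'
    '"primaryKey":["index"],"pandas_version":"0.20.0"},'
    '"data":[{"index":"row 1","col 1":"a","col 2":"b"},'
    '{"index":"row 2","col 1":"c","col 2":"d"}]}'
)

# Parse the compact JSON, then re-serialize it with 4-space indentation.
parsed = json.loads(compact)
pretty = json.dumps(parsed, indent=4)

# print() renders the newlines; the bare expression would show '\n' escapes.
print(pretty)
```

Round-tripping through `json.loads` keeps the key order, so the indented version shows the same content as the one-line string, just spread over multiple lines.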

@alimcmaster1
Member

Flake8 errors in CI:

##[error]./pandas/core/generic.py:2217:89:E501:line too long (94 > 88 characters)

Contributor

@jreback jreback left a comment

also pls merge master

"index":["row 1","row 2"],
"data":[["a","b"],["c","d"]]}'
'{"columns":["col 1","col 2"],\
"index":["row 1","row 2"],\
Contributor

can you use ... instead here?

"data": [{"index": "row 1", "col 1": "a", "col 2": "b"},
{"index": "row 2", "col 1": "c", "col 2": "d"}]}'
'{"schema":{"fields":[{"name":"index","type":"string"},\
{"name":"col 1","type":"string"},{"name":"col 2","type":"string"}],\
Contributor

can you use ... here? (otherwise we could prettify the JSON, e.g. json.dumps(...., indent=4))
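
One reading of the `...` suggestion is doctest's `ELLIPSIS` option, where `...` in the expected output matches any substring. The sketch below is stdlib-only and uses a hypothetical `long_json()` helper as a stand-in for a call like `df.to_json(orient="table")`; whether the pandas doctest run enables `ELLIPSIS` globally is not stated in this thread, so the flag is set inline on the example.

```python
import doctest
import json


def long_json():
    """
    Hypothetical stand-in for a call that returns a long JSON string.

    >>> long_json()  # doctest: +ELLIPSIS
    '{"schema": {"fields": [...], "primaryKey": ["index"]}, "data": [...]}'
    """
    return json.dumps(
        {
            "schema": {
                "fields": [{"name": "index", "type": "string"}],
                "primaryKey": ["index"],
            },
            "data": [{"index": "row 1", "col 1": "a"}],
        }
    )


# Build a doctest from the docstring and run it; each "..." in the expected
# output matches whatever text appears at that position in the real output.
test = doctest.DocTestParser().get_doctest(
    long_json.__doc__, {"long_json": long_json}, "long_json", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

The trade-off is that the abbreviated parts are no longer checked, so the docstring stays short at the cost of testing less of the output.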

freq 2
first 2000-01-01 00:00:00
last 2010-01-01 00:00:00
count 3
Contributor

are these on the doctest list that we check?

Member Author

Since 4d66fa8 they are.

@@ -9589,16 +9685,16 @@ def describe(

Excluding numeric columns from a ``DataFrame`` description.

- >>> df.describe(exclude=[np.number])
+ >>> df.describe(exclude=[np.number]) # doctest: +SKIP
Member

How come you are adding doctest skips as opposed to our current pytest -k approach? I think we should be consistent.

Member Author

The problem with describe is that the output can be random. If you know how to skip a specific line in the output, that would be great!
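
For context, `# doctest: +SKIP` skips the whole example rather than a single output line: the statement is not executed and its expected output is not compared. A minimal stdlib sketch follows; the `DOC` string and the `skip_demo` name are made up for illustration and are not taken from the pandas docstrings.

```python
import doctest
import random

# Two examples: the first has non-deterministic output and is skipped,
# the second is deterministic and is checked as usual.
DOC = """
>>> random.random()  # doctest: +SKIP
0.123
>>> sorted([3, 1, 2])
[1, 2, 3]
"""

test = doctest.DocTestParser().get_doctest(
    DOC, {"random": random}, "skip_demo", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

# The skipped example cannot fail, even though 0.123 is a made-up value.
print("failed:", results.failed)
```

This is why a skipped `describe` example still renders in the documentation but gives no test coverage, which is the consistency concern raised in the review.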

@ShaharNaveh
Member Author

Restarting azure

@ShaharNaveh ShaharNaveh reopened this Feb 22, 2020
@datapythonista
Member

A bit unsure about this, but I think it's reasonable. At least we test everything we can test.

You have conflicts to fix. Also, for inline comments (with #), please leave two spaces instead of one before the hash. That's PEP 8; I'm not sure why the PEP 8 validation is not complaining. I guess we don't have it active in the CI because there are failing cases.

Also, why is the describe output not deterministic? I think it should be.

@datapythonista
Member

Thanks for the fixes @MomIsBestFriend. About the describe, why do we need to skip the tests? Isn't the output deterministic?

@jreback, if you want to have another look and see if your comments were addressed...

@ShaharNaveh
Member Author

Thanks for the fixes @MomIsBestFriend. About the describe, why do we need to skip the tests? Isn't the output deterministic?

I haven't got it to work without the skip. Maybe one of the core developers knows something?

@datapythonista
Member

I haven't got it to work without the skip maybe one of the core developers knows something?

Do you know what the error was?

@datapythonista datapythonista merged commit 6852012 into pandas-dev:master Mar 7, 2020
@datapythonista
Member

Thanks for fixing those @MomIsBestFriend.

Do you mind opening an issue for the describe? I think the output should be deterministic and we shouldn't need the SKIP there. Would be good to have a look and know what's wrong.
