ENH: add Series.info #31796

Closed
MarcoGorelli wants to merge 71 commits from the series-info branch

Member

@MarcoGorelli MarcoGorelli commented Feb 7, 2020

  • initial draft to close API: add Series.info method #5167
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

[Image: screenshot of the example]


Note to self: this change from ba72b59b44f15076877e784e16cbc66aa03e0adc breaks the tests

-    # groupby dtype.name to collect e.g. Categorical columns
-    counts = data.dtypes.value_counts().groupby(lambda x: x.name).sum()
+    counts = data._data.get_dtype_counts()
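
For reference, here is a minimal sketch of what the removed line computes. The toy frame and column names are assumptions made for illustration and are not taken from the PR. It shows why the counts are grouped by `dtype.name`: categorical columns with different categories have distinct dtype objects, so they would otherwise be counted separately.

```python
import pandas as pd

# Toy frame (hypothetical, not from the PR): one integer column and two
# categorical columns whose categories differ.
data = pd.DataFrame(
    {
        "a": pd.Series([1, 2], dtype="int64"),
        "b": pd.Categorical(["x", "y"]),
        "c": pd.Categorical(["p", "q"]),
    }
)

# Count columns per dtype object. Categorical dtypes with different
# categories are different objects, so they get separate entries here.
per_dtype = data.dtypes.value_counts()

# Group by the dtype's name so that all categorical columns are collected
# under a single "category" entry, as in the removed line of the diff.
counts = per_dtype.groupby(lambda x: x.name).sum()

print(counts)
```

The replacement line in the diff relies on an internal method, so it is not reproduced here.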

Updated

Currently, the output looks like this:

[image]

@MarcoGorelli
Member Author

Some screenshots: [4 images]

@pep8speaks
pep8speaks commented Feb 9, 2020

Hello @MarcoGorelli! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-09-19 16:26:44 UTC

@MarcoGorelli MarcoGorelli changed the title from "(wip) ENH: add Series.info" to "ENH: add Series.info" Feb 9, 2020
@MarcoGorelli
Member Author

Thanks @WillAyd for your review, have updated

Contributor

@jreback jreback left a comment

Before doing this, can you do a precursor that moves the .info code into pandas/io/formats/info.py? We already have way too many things inside the core/frame.py and core/generic.py modules, and I think this would make the code easier to share. Also happy to see the added tests; you can move them as well, to pandas/io/formats/test_info.py.

@MarcoGorelli MarcoGorelli mentioned this pull request Feb 11, 2020
@MarcoGorelli MarcoGorelli changed the title from "ENH: add Series.info" to "wip ENH: add Series.info" Feb 19, 2020
Marco Gorelli added 2 commits February 19, 2020 12:27
@MarcoGorelli MarcoGorelli changed the title from "wip ENH: add Series.info" to "ENH: add Series.info" Feb 19, 2020
@MarcoGorelli
Member Author

Screenshots of docs: [3 images]

cols = Index([data.name])
dtypes = Index([data.dtypes])

col_count = len(cols)
Member

Hmm, I think it is confusing to have col_count be populated by a Series in some cases. Can you think of a way to refactor?

Member Author

@WillAyd sure, have given this a go

@MarcoGorelli MarcoGorelli marked this pull request as ready for review July 4, 2020 09:45
@MarcoGorelli MarcoGorelli requested review from WillAyd and jreback July 4, 2020 10:03
@MarcoGorelli
Member Author

I've pushed a new attempt at this. I've factored out some pieces of _verbose_repr to avoid code duplication, although their argument lists and return values have become rather long.

@WillAyd
Member

WillAyd commented Sep 10, 2020

@MarcoGorelli can you fix up the PR here? Someone will take a look again thereafter

@MarcoGorelli MarcoGorelli marked this pull request as draft September 14, 2020 07:51
@MarcoGorelli MarcoGorelli marked this pull request as ready for review September 19, 2020 16:26
@MarcoGorelli
Member Author

> @MarcoGorelli can you fix up the PR here? Someone will take a look again thereafter

@WillAyd Sure, have had another go at it, now using a namedtuple to group together some similar variables

Member

@WillAyd WillAyd left a comment

I think this is headed in the right direction

@@ -15,6 +15,32 @@
from pandas.core.series import Series # noqa: F401


class CountConfigs(NamedTuple):
Member

Can you generate this using collections.namedtuple instead? We typically don't subclass things from typing

Member Author

@MarcoGorelli MarcoGorelli Sep 22, 2020

Sure, can do, although I think this is the newer syntax; it's taken directly from https://docs.python.org/3/library/typing.html#typing.NamedTuple

They also recommend it in the mypy docs: https://mypy.readthedocs.io/en/stable/kinds_of_types.html#named-tuples
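
To make the two spellings under discussion concrete, here is a minimal side-by-side sketch. The field names are hypothetical, since the actual fields of CountConfigs are not shown in this thread. Both forms produce tuple subclasses with named fields; the class-based form additionally carries type annotations.

```python
from collections import namedtuple
from typing import NamedTuple


# Class-based syntax from the typing docs. The field names below are
# hypothetical placeholders, not the PR's real CountConfigs fields.
class CountConfigsTyped(NamedTuple):
    counts: int
    count_header: str
    space_count: int = 0


# Equivalent factory-function form from collections. Defaults are applied
# to the rightmost fields, so space_count defaults to 0 here as well.
CountConfigsPlain = namedtuple(
    "CountConfigsPlain",
    ["counts", "count_header", "space_count"],
    defaults=[0],
)

typed = CountConfigsTyped(counts=3, count_header="Non-Null Count")
plain = CountConfigsPlain(counts=3, count_header="Non-Null Count")

# Both are plain tuples underneath, so they compare equal field by field.
print(typed == plain)
print(typed._fields == plain._fields)
```

The practical difference is that only the class-based form gives a type checker such as mypy per-field types to work with.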

@ivanovmg ivanovmg mentioned this pull request Sep 30, 2020
5 tasks
@MarcoGorelli
Member Author

(closing as @ivanovmg will take it over)

@MarcoGorelli MarcoGorelli deleted the series-info branch October 10, 2020 14:14
@ivanovmg ivanovmg mentioned this pull request Oct 21, 2020
5 tasks
Labels
API Design, Series (Series data structure)
Successfully merging this pull request may close these issues.

API: add Series.info method
4 participants